Records
Author Subhajit Maity; Sanket Biswas; Siladittya Manna; Ayan Banerjee; Josep Llados; Saumik Bhattacharya; Umapada Pal
Title SelfDocSeg: A Self-Supervised vision-based Approach towards Document Segmentation Type Conference Article
Year 2023 Publication 17th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume 14187 Issue Pages 342–360
Keywords
Abstract Document layout analysis is a well-known problem in the document research community and has been extensively explored, yielding a multitude of solutions ranging from text mining and recognition to graph-based representation, visual feature extraction, etc. However, most existing works have ignored the crucial fact of the scarcity of labeled data. With growing internet connectivity in personal life, an enormous number of documents has become available in the public domain, making data annotation a tedious task. We address this challenge using self-supervision and, unlike the few existing self-supervised document segmentation approaches, which use text mining and textual labels, we use a completely vision-based approach in pre-training without any ground-truth label or its derivative. Instead, we generate pseudo-layouts from the document images to pre-train an image encoder to learn the document object representation and localization in a self-supervised framework before fine-tuning it with an object detection model. We show that our pipeline sets a new benchmark in this context and performs on par with, if not better than, existing methods and their supervised counterparts. The code is made publicly available at: this https URL
Address Document Layout Analysis; Document
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG Approved no
Call Number Admin @ si @ MBM2023 Serial 3990
 

 
Author Jorge Charco; Angel Sappa; Boris X. Vintimilla
Title Human Pose Estimation through a Novel Multi-view Scheme Type Conference Article
Year 2022 Publication 17th International Conference on Computer Vision Theory and Applications (VISAPP 2022) Abbreviated Journal
Volume 5 Issue Pages 855-862
Keywords Multi-view Scheme; Human Pose Estimation; Relative Camera Pose; Monocular Approach
Abstract This paper presents a multi-view scheme to tackle the challenging problem of self-occlusion in human pose estimation. The proposed approach first obtains the human body joints from a set of images captured from different views at the same time. Then, it enhances the obtained joints by using a multi-view scheme. Basically, the joints from a given view are used to enhance poorly estimated joints from another view, especially to tackle self-occlusion cases. A network architecture initially proposed for the monocular case is adapted to be used in the proposed multi-view scheme. Experimental results and comparisons with state-of-the-art approaches on the Human3.6m dataset are presented, showing improvements in the accuracy of body joint estimations.
Address Online; Feb 6, 2022 – Feb 8, 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 2184-4321 ISBN 978-989-758-555-5 Medium
Area Expedition Conference VISAPP
Notes MSIAU; 600.160 Approved no
Call Number Admin @ si @ CSV2022 Serial 3689
 

 
Author Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla
Title Multi-Image Super-Resolution for Thermal Images Type Conference Article
Year 2022 Publication 17th International Conference on Computer Vision Theory and Applications (VISAPP 2022) Abbreviated Journal
Volume 4 Issue Pages 635-642
Keywords Thermal Images; Multi-view; Multi-frame; Super-Resolution; Deep Learning; Attention Block
Abstract This paper proposes a novel CNN architecture for the multi-thermal image super-resolution problem. In the proposed scheme, the multi-images are synthetically generated by downsampling and slightly shifting the given image; noise is also added to each of these synthesized images. The proposed architecture uses two attention-block paths to extract high-frequency details, taking advantage of the large amount of information extracted from multiple images of the same scene. Experimental results are provided, showing that the proposed scheme outperforms state-of-the-art approaches.
Address Online; Feb 6-8, 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference VISAPP
Notes MSIAU; 601.349 Approved no
Call Number Admin @ si @ RSV2022a Serial 3690
 

 
Author Maedeh Aghaei; Petia Radeva
Title Bag-of-Tracklets for Person Tracking in Life-Logging Data Type Conference Article
Year 2014 Publication 17th International Conference of the Catalan Association for Artificial Intelligence Abbreviated Journal
Volume 269 Issue Pages 35-44
Keywords
Abstract With the increasing popularity of wearable cameras, life-logging data analysis is becoming more and more important and useful for deriving significant events from this substantial collection of images. In this study, we introduce a new tracking method applied to visual life-logging, called bag-of-tracklets, which is based on detecting, localizing and tracking people. Given the low spatial and temporal resolution of the image data, our model generates and groups tracklets in an unsupervised framework and extracts image sequences of person appearance according to a similarity score of the bag-of-tracklets. The model output is a meaningful sequence of events expressing human appearance and tracking them in life-logging data. The achieved results demonstrate the robustness of our model in terms of efficiency and accuracy despite the low spatial and temporal resolution of the data.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-61499-451-0 Medium
Area Expedition Conference CCIA
Notes MILAB Approved no
Call Number Admin @ si @ AgR2015 Serial 2607
 

 
Author Agata Lapedriza; David Masip; D. Sanchez
Title Emotions Classification using Facial Action Units Recognition Type Conference Article
Year 2014 Publication 17th International Conference of the Catalan Association for Artificial Intelligence Abbreviated Journal
Volume 269 Issue Pages 55-64
Keywords
Abstract In this work we build a system for automatic emotion classification from image sequences. We analyze subtle changes in facial expressions by detecting a subset of 12 representative facial action units (AUs). Then, we classify emotions based on the output of these AU classifiers, i.e. the presence/absence of AUs. We base the AU classification upon a set of spatio-temporal geometric and appearance features for facial representation, fusing them within the emotion classifier. A decision tree is trained for emotion classification, making the resulting model easy to interpret by capturing the combinations of AU activations that lead to a particular emotion. On the Cohn-Kanade database, the proposed system classifies 7 emotions with a mean accuracy of nearly 90%, attaining recognition accuracy similar to that of non-interpretable models that are not based on AU detection.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-61499-451-0 Medium
Area Expedition Conference CCIA
Notes OR;MV Approved no
Call Number Admin @ si @ LMS2014 Serial 2622
 

 
Author Fernando Barrera; Felipe Lumbreras; Angel Sappa
Title Multimodal Template Matching based on Gradient and Mutual Information using Scale-Space Type Conference Article
Year 2010 Publication 17th IEEE International Conference on Image Processing Abbreviated Journal
Volume Issue Pages 2749–2752
Keywords
Abstract This paper presents the combined use of gradient and mutual information for infrared and intensity template matching. We propose to join: (i) feature matching in a multiresolution context and (ii) information propagation through scale-space representations. Our method consists in combining mutual information with a gradient-based shape descriptor and propagating them following a coarse-to-fine strategy. The main contributions of this work are: to offer a theoretical formulation towards multimodal stereo matching; to show that gradient and mutual information can be reinforced while they are propagated between consecutive levels; and to show that they are valid cost functions in multimodal template matching. Comparisons are presented showing the improvements and viability of the proposed approach.
Address Hong-Kong
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1522-4880 ISBN 978-1-4244-7992-4 Medium
Area Expedition Conference ICIP
Notes ADAS Approved no
Call Number ADAS @ adas @ BLS2010 Serial 1358
 

 
Author Mohammad Rouhani; Angel Sappa
Title A Fast accurate Implicit Polynomial Fitting Approach Type Conference Article
Year 2010 Publication 17th IEEE International Conference on Image Processing Abbreviated Journal
Volume Issue Pages 1429–1432
Keywords
Abstract This paper presents a novel hybrid approach that combines two families of state-of-the-art fitting algorithms: algebraic-based and geometric-based. It consists of two steps: first, the 3L algorithm is used as an initialization; then, the obtained result is improved through a geometric approach. The adopted geometric approach is based on a distance estimation that avoids a costly search for the real orthogonal distance. Experimental results are presented as well as quantitative comparisons.
Address Hong-Kong
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1522-4880 ISBN 978-1-4244-7992-4 Medium
Area Expedition Conference ICIP
Notes ADAS Approved no
Call Number ADAS @ adas @ RoS2010b Serial 1359
 

 
Author Marc Masana; Joost Van de Weijer; Luis Herranz; Andrew Bagdanov; Jose Manuel Alvarez
Title Domain-adaptive deep network compression Type Conference Article
Year 2017 Publication 17th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Deep Neural Networks trained on large datasets can be easily transferred to new domains with far fewer labeled examples by a process called fine-tuning. This has the advantage that representations learned in the large source domain can be exploited on smaller target domains. However, networks designed to be optimal for the source task are often prohibitively large for the target task. In this work we address the compression of networks after domain transfer. We focus on compression algorithms based on low-rank matrix decomposition. Existing methods base compression solely on learned network weights and ignore the statistics of network activations. We show that domain transfer leads to large shifts in network activations and that it is desirable to take this into account when compressing. We demonstrate that considering activation statistics when compressing weights leads to a rank-constrained regression problem with a closed-form solution. Because our method takes into account the target domain, it can more optimally remove the redundancy in the weights. Experiments show that our Domain Adaptive Low Rank (DALR) method significantly outperforms existing low-rank compression techniques. With our approach, the fc6 layer of VGG19 can be compressed more than 4x more than using truncated SVD alone, with only a minor or no loss in accuracy. When applied to domain-transferred networks it allows for compression down to only 5-20% of the original number of parameters with only a minor drop in performance.
Address Venice; Italy; October 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCV
Notes LAMP; 601.305; 600.106; 600.120 Approved no
Call Number Admin @ si @ Serial 3034
 

 
Author Xialei Liu; Joost Van de Weijer; Andrew Bagdanov
Title RankIQA: Learning from Rankings for No-reference Image Quality Assessment Type Conference Article
Year 2017 Publication 17th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages
Keywords
Abstract We propose a no-reference image quality assessment (NR-IQA) approach that learns from rankings (RankIQA). To address the problem of limited IQA dataset size, we train a Siamese Network to rank images in terms of image quality by using synthetically generated distortions for which relative image quality is known. These ranked image sets can be automatically generated without laborious human labeling. We then use fine-tuning to transfer the knowledge represented in the trained Siamese Network to a traditional CNN that estimates absolute image quality from single images. We demonstrate how our approach can be made significantly more efficient than traditional Siamese Networks by forward propagating a batch of images through a single network and backpropagating gradients derived from all pairs of images in the batch. Experiments on the TID2013 benchmark show that we improve the state-of-the-art by over 5%. Furthermore, on the LIVE benchmark we show that our approach is superior to existing NR-IQA techniques and that we even outperform the state-of-the-art in full-reference IQA (FR-IQA) methods without having to resort to high-quality reference images to infer IQA.
Address Venice; Italy; October 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCV
Notes LAMP; 600.106; 600.109; 600.120 Approved no
Call Number Admin @ si @ LWB2017b Serial 3036
 

 
Author Ekaterina Zaytseva; Santiago Segui; Jordi Vitria
Title Sketchable Histograms of Oriented Gradients for Object Detection Type Conference Article
Year 2012 Publication 17th Iberoamerican Conference on Pattern Recognition Abbreviated Journal
Volume 7441 Issue Pages 374-381
Keywords
Abstract In this paper we investigate a new representation approach for visual object recognition. The new representation, called sketchable-HoG, extends the classical histogram of oriented gradients (HoG) feature by adding two different aspects: the stability of the majority orientation and the continuity of gradient orientations. In this way, the sketchable-HoG locally characterizes the complexity of an object model and introduces global structure information while still keeping simplicity, compactness and robustness. We evaluated the proposed image descriptor on the publicly available Caltech 101 dataset. The obtained results outperform the classical HoG descriptor as well as other descriptors reported in the literature.
Address Buenos Aires, Argentina
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-33274-6 Medium
Area Expedition Conference CIARP
Notes OR; MILAB;MV Approved no
Call Number Admin @ si @ ZSV2012 Serial 2048
 

 
Author Onur Ferhat; Fernando Vilariño
Title A Cheap Portable Eye-Tracker Solution for Common Setups Type Conference Article
Year 2013 Publication 17th European Conference on Eye Movements Abbreviated Journal
Volume Issue Pages
Keywords Low cost; eye-tracker; software; webcam; Raspberry Pi
Abstract We analyze the feasibility of a cheap eye-tracker where the hardware consists of a single webcam and a Raspberry Pi device. Our aim is to discover the limits of such a system and to see whether it provides acceptable performance. We base our work on the open source Opengazer (Zielinski, 2013) and we propose several improvements to create a robust, real-time system. After assessing the accuracy of our eye-tracker in elaborate experiments involving 18 subjects under 4 different system setups, we developed a simple game to see how it performs in practice, and we also installed it on a Raspberry Pi to create a portable stand-alone eye-tracker which achieves 1.62° horizontal accuracy with a 3 fps refresh rate for a building cost of 70 Euros.
Address Lund; Sweden; August 2013
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECEM
Notes MV;SIAI Approved no
Call Number Admin @ si @ FeV2013 Serial 2374
 

 
Author Andrea Gemelli; Sanket Biswas; Enrico Civitelli; Josep Llados; Simone Marinai
Title Doc2Graph: A Task Agnostic Document Understanding Framework Based on Graph Neural Networks Type Conference Article
Year 2022 Publication 17th European Conference on Computer Vision Workshops Abbreviated Journal
Volume 13804 Issue Pages 329–344
Keywords
Abstract Geometric Deep Learning has recently attracted significant interest in a wide range of machine learning fields, including document analysis. The application of Graph Neural Networks (GNNs) has become crucial in various document-related tasks since they can unravel important structural patterns, fundamental in key information extraction processes. Previous works in the literature propose task-driven models and do not take into account the full power of graphs. We propose Doc2Graph, a task-agnostic document understanding framework based on a GNN model, to solve different tasks given different types of documents. We evaluated our approach on two challenging datasets for key information extraction in form understanding, invoice layout analysis and table detection.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN 978-3-031-25068-2 Medium
Area Expedition Conference ECCV-TiE
Notes DAG; 600.162; 600.140; 110.312 Approved no
Call Number Admin @ si @ GBC2022 Serial 3795
 

 
Author Oriol Ramos Terrades; N. Serrano; Albert Gordo; Ernest Valveny; Alfons Juan-Ciscar
Title Interactive-predictive detection of handwritten text blocks Type Conference Article
Year 2010 Publication 17th Document Recognition and Retrieval Conference, part of the IS&T-SPIE Electronic Imaging Symposium Abbreviated Journal
Volume 7534 Issue Pages 75340Q–75340Q–10
Keywords
Abstract A method for text block detection is introduced for old handwritten documents. The proposed method takes advantage of sequential book structure, taking into account layout information from pages previously transcribed. This glance at the past is used to predict the position of text blocks in the current page with the help of conventional layout analysis methods. The method is integrated into the GIDOC prototype: a first attempt to provide integrated support for interactive-predictive page layout analysis, text line detection and handwritten text transcription. Results are given in a transcription task on a 764-page Spanish manuscript from 1891.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference DRR
Notes DAG Approved no
Call Number DAG @ dag @ TSG2010 Serial 1479
 

 
Author Olivier Lefebvre; Pau Riba; Charles Fournier; Alicia Fornes; Josep Llados; Rejean Plamondon; Jules Gagnon-Marchand
Title Monitoring neuromotricity on-line: a cloud computing approach Type Conference Article
Year 2015 Publication 17th Conference of the International Graphonomics Society IGS2015 Abbreviated Journal
Volume Issue Pages
Keywords
Abstract The goal of our experiment is to develop a useful and accessible tool that can be used to evaluate a patient's health by analyzing handwritten strokes. We use a cloud computing approach to analyze stroke data sampled on a commercial tablet running the Android platform, with a remote server performing complex calculations using the Delta and Sigma lognormal algorithms. A Google Drive account is used to store the data and to ease the development of the project. The communication between the tablet, the cloud and the server is encrypted to ensure the confidentiality of biomedical information. Highly parameterized biomedical tests are implemented on the tablet, as well as a free drawing test to evaluate the validity of the data acquired by the first test compared to the second one. A blurred shape model descriptor pattern recognition algorithm is used to classify the data obtained by the free drawing test. The functions presented in this paper are still under development, and other improvements are needed before launching the application in the public domain.
Address Pointe-à-Pitre; Guadeloupe; June 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference IGS
Notes DAG; 600.077 Approved no
Call Number Admin @ si @ LRF2015 Serial 2617
 

 
Author Mohammad Ali Bagheri; Qigang Gao; Sergio Escalera
Title Error Correcting Output Codes for multiclass classification: Application to two image vision problems Type Conference Article
Year 2012 Publication 16th symposium on Artificial Intelligence & Signal Processing Abbreviated Journal
Volume Issue Pages 508-513
Keywords
Abstract Error-correcting output codes (ECOC) represent a powerful framework for dealing with multiclass classification problems by combining binary classifiers. The key factor affecting the performance of ECOC methods is the independence of the binary classifiers, without which the ECOC method would be ineffective. Despite its ability to handle classification problems with a relatively large number of classes, it has been applied to few real-world problems. In this paper, we investigate the behavior of the ECOC approach on two image vision problems: logo recognition and shape classification, using Decision Tree and AdaBoost as the base learners. The results show that the ECOC method can be used to improve classification performance in comparison with classical multiclass approaches.
Address Shiraz, Iran
Corporate Author Thesis
Publisher IEEE Xplore Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-4673-1478-7 Medium
Area Expedition Conference AISP
Notes HuPBA;MILAB Approved no
Call Number Admin @ si @ BGE2012b Serial 2042