|   | 
Details
   web
Records
Author Hongxing Gao; Marçal Rusiñol; Dimosthenis Karatzas; Josep Llados; Tomokazu Sato; Masakazu Iwamura; Koichi Kise
Title Key-region detection for document images -applications to administrative document retrieval Type Conference Article
Year 2013 Publication 12th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume Issue Pages 230-234
Keywords
Abstract In this paper we argue that a key-region detector designed to take into account the special characteristics of document images can result in the detection of less and more meaningful key-regions. We propose a fast key-region detector able to capture aspects of the structural information of the document, and demonstrate its efficiency by comparing against standard detectors in an administrative document retrieval scenario. We show that using the proposed detector results to a smaller number of detected key-regions and higher performance without any drop in speed compared to standard state of the art detectors.
Address Washington; USA; August 2013
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1520-5363 ISBN (up) Medium
Area Expedition Conference ICDAR
Notes DAG; 600.056; 600.045 Approved no
Call Number Admin @ si @ GRK2013b Serial 2293
Permanent link to this record
 

 
Author Andreas Fischer; Volkmar Frinken; Horst Bunke; Ching Y. Suen
Title Improving HMM-Based Keyword Spotting with Character Language Models Type Conference Article
Year 2013 Publication 12th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume Issue Pages 506-510
Keywords
Abstract Facing high error rates and slow recognition speed for full text transcription of unconstrained handwriting images, keyword spotting is a promising alternative to locate specific search terms within scanned document images. We have previously proposed a learning-based method for keyword spotting using character hidden Markov models that showed a high performance when compared with traditional template image matching. In the lexicon-free approach pursued, only the text appearance was taken into account for recognition. In this paper, we integrate character n-gram language models into the spotting system in order to provide an additional language context. On the modern IAM database as well as the historical George Washington database, we demonstrate that character language models significantly improve the spotting performance.
Address Washington; USA; August 2013
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1520-5363 ISBN (up) Medium
Area Expedition Conference ICDAR
Notes DAG; 600.045; 605.203 Approved no
Call Number Admin @ si @ FFB2013 Serial 2295
Permanent link to this record
 

 
Author Fahad Shahbaz Khan; Muhammad Anwer Rao; Joost Van de Weijer; Andrew Bagdanov; Antonio Lopez; Michael Felsberg
Title Coloring Action Recognition in Still Images Type Journal Article
Year 2013 Publication International Journal of Computer Vision Abbreviated Journal IJCV
Volume 105 Issue 3 Pages 205-221
Keywords
Abstract In this article we investigate the problem of human action recognition in static images. By action recognition we intend a class of problems which includes both action classification and action detection (i.e. simultaneous localization and classification). Bag-of-words image representations yield promising results for action classification, and deformable part models perform very well object detection. The representations for action recognition typically use only shape cues and ignore color information. Inspired by the recent success of color in image classification and object detection, we investigate the potential of color for action classification and detection in static images. We perform a comprehensive evaluation of color descriptors and fusion approaches for action recognition. Experiments were conducted on the three datasets most used for benchmarking action recognition in still images: Willow, PASCAL VOC 2010 and Stanford-40. Our experiments demonstrate that incorporating color information considerably improves recognition performance, and that a descriptor based on color names outperforms pure color descriptors. Our experiments demonstrate that late fusion of color and shape information outperforms other approaches on action recognition. Finally, we show that the different color–shape fusion approaches result in complementary information and combining them yields state-of-the-art performance for action classification.
Address
Corporate Author Thesis
Publisher Springer US Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0920-5691 ISBN (up) Medium
Area Expedition Conference
Notes CIC; ADAS; 600.057; 600.048 Approved no
Call Number Admin @ si @ KRW2013 Serial 2285
Permanent link to this record
 

 
Author Jorge Bernal; F. Javier Sanchez; Fernando Vilariño
Title Impact of Image Preprocessing Methods on Polyp Localization in Colonoscopy Frames Type Conference Article
Year 2013 Publication 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society Abbreviated Journal
Volume Issue Pages 7350 - 7354
Keywords
Abstract In this paper we present our image preprocessing methods as a key part of our automatic polyp localization scheme. These methods are used to assess the impact of different endoluminal scene elements when characterizing polyps. More precisely we tackle the influence of specular highlights, blood vessels and black mask surrounding the scene. Experimental results prove that the appropriate handling of these elements leads to a great improvement in polyp localization results.
Address Osaka; Japan; July 2013
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1557-170X ISBN (up) Medium
Area 800 Expedition Conference EMBC
Notes MV; 600.047; 600.060;SIAI Approved no
Call Number Admin @ si @ BSV2013 Serial 2286
Permanent link to this record
 

 
Author Jordi Roca; C. Alejandro Parraga; Maria Vanrell
Title Chromatic settings and the structural color constancy index Type Journal Article
Year 2013 Publication Journal of Vision Abbreviated Journal JV
Volume 13 Issue 4-3 Pages 1-26
Keywords
Abstract Color constancy is usually measured by achromatic setting, asymmetric matching, or color naming paradigms, whose results are interpreted in terms of indexes and models that arguably do not capture the full complexity of the phenomenon. Here we propose a new paradigm, chromatic setting, which allows a more comprehensive characterization of color constancy through the measurement of multiple points in color space under immersive adaptation. We demonstrated its feasibility by assessing the consistency of subjects' responses over time. The paradigm was applied to two-dimensional (2-D) Mondrian stimuli under three different illuminants, and the results were used to fit a set of linear color constancy models. The use of multiple colors improved the precision of more complex linear models compared to the popular diagonal model computed from gray. Our results show that a diagonal plus translation matrix that models mechanisms other than cone gain might be best suited to explain the phenomenon. Additionally, we calculated a number of color constancy indices for several points in color space, and our results suggest that interrelations among colors are not as uniform as previously believed. To account for this variability, we developed a new structural color constancy index that takes into account the magnitude and orientation of the chromatic shift in addition to the interrelations among colors and memory effects.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (up) Medium
Area Expedition Conference
Notes CIC; 600.052; 600.051; 605.203 Approved no
Call Number Admin @ si @ RPV2013 Serial 2288
Permanent link to this record
 

 
Author Naila Murray; Maria Vanrell; Xavier Otazu; C. Alejandro Parraga
Title Low-level SpatioChromatic Grouping for Saliency Estimation Type Journal Article
Year 2013 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI
Volume 35 Issue 11 Pages 2810-2816
Keywords
Abstract We propose a saliency model termed SIM (saliency by induction mechanisms), which is based on a low-level spatiochromatic model that has successfully predicted chromatic induction phenomena. In so doing, we hypothesize that the low-level visual mechanisms that enhance or suppress image detail are also responsible for making some image regions more salient. Moreover, SIM adds geometrical grouplets to enhance complex low-level features such as corners, and suppress relatively simpler features such as edges. Since our model has been fitted on psychophysical chromatic induction data, it is largely nonparametric. SIM outperforms state-of-the-art methods in predicting eye fixations on two datasets and using two metrics.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0162-8828 ISBN (up) Medium
Area Expedition Conference
Notes CIC; 600.051; 600.052; 605.203 Approved no
Call Number Admin @ si @ MVO2013 Serial 2289
Permanent link to this record
 

 
Author Hongxing Gao; Marçal Rusiñol; Dimosthenis Karatzas; Apostolos Antonacopoulos; Josep Llados
Title An interactive appearance-based document retrieval system for historical newspapers Type Conference Article
Year 2013 Publication Proceedings of the International Conference on Computer Vision Theory and Applications Abbreviated Journal
Volume Issue Pages 84-87
Keywords
Abstract In this paper we present a retrieval-based application aimed at assisting a user to semi-automatically segment an incoming flow of historical newspaper images by automatically detecting a particular type of pages based on their appearance. A visual descriptor is used to assess page similarity while a relevance feedback process allow refining the results iteratively. The application is tested on a large dataset of digitised historic newspapers.
Address Barcelona; February 2013
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (up) Medium
Area Expedition Conference VISAPP
Notes DAG; 600.056; 600.045; 605.203 Approved no
Call Number Admin @ si @ GRK2013a Serial 2290
Permanent link to this record
 

 
Author Albert Gordo
Title A Cyclic Page Layout Descriptor for Document Classification & Retrieval Type Report
Year 2009 Publication CVC Technical Report Abbreviated Journal
Volume 128 Issue Pages
Keywords
Abstract
Address
Corporate Author Computer Vision Center Thesis Master's thesis
Publisher Place of Publication Bellaterra, Barcelona Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (up) Medium
Area Expedition Conference
Notes CIC;DAG Approved no
Call Number Admin @ si @ Gor2009 Serial 2387
Permanent link to this record
 

 
Author Jaume Gibert; Ernest Valveny; Horst Bunke
Title Embedding of Graphs with Discrete Attributes Via Label Frequencies Type Journal Article
Year 2013 Publication International Journal of Pattern Recognition and Artificial Intelligence Abbreviated Journal IJPRAI
Volume 27 Issue 3 Pages 1360002-1360029
Keywords Discrete attributed graphs; graph embedding; graph classification
Abstract Graph-based representations of patterns are very flexible and powerful, but they are not easily processed due to the lack of learning algorithms in the domain of graphs. Embedding a graph into a vector space solves this problem since graphs are turned into feature vectors and thus all the statistical learning machinery becomes available for graph input patterns. In this work we present a new way of embedding discrete attributed graphs into vector spaces using node and edge label frequencies. The methodology is experimentally tested on graph classification problems, using patterns of different nature, and it is shown to be competitive to state-of-the-art classification algorithms for graphs, while being computationally much more efficient.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (up) Medium
Area Expedition Conference
Notes DAG Approved no
Call Number Admin @ si @ GVB2013 Serial 2305
Permanent link to this record
 

 
Author Muhammad Anwer Rao
Title Color for Object Detection and Action Recognition Type Book Whole
Year 2013 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Recognizing object categories in real world images is a challenging problem in computer vision. The deformable part based framework is currently the most successful approach for object detection. Generally, HOG are used for image representation within the part-based framework. For action recognition, the bag-of-word framework has shown to provide promising results. Within the bag-of-words framework, local image patches are described by SIFT descriptor. Contrary to object detection and action recognition, combining color and shape has shown to provide the best performance for object and scene recognition.

In the first part of this thesis, we analyze the problem of person detection in still images. Standard person detection approaches rely on intensity based features for image representation while ignoring the color. Channel based descriptors is one of the most commonly used approaches in object recognition. This inspires us to evaluate incorporating color information using the channel based fusion approach for the task of person detection.

In the second part of the thesis, we investigate the problem of object detection in still images. Due to high dimensionality, channel based fusion increases the computational cost. Moreover, channel based fusion has been found to obtain inferior results for object category where one of the visual varies significantly. On the other hand, late fusion is known to provide improved results for a wide range of object categories. A consequence of late fusion strategy is the need of a pure color descriptor. Therefore, we propose to use Color attributes as an explicit color representation for object detection. Color attributes are compact and computationally efficient. Consequently color attributes are combined with traditional shape features providing excellent results for object detection task.

Finally, we focus on the problem of action detection and classification in still images. We investigate the potential of color for action classification and detection in still images. We also evaluate different fusion approaches for combining color and shape information for action recognition. Additionally, an analysis is performed to validate the contribution of color for action recognition. Our results clearly demonstrate that combining color and shape information significantly improve the performance of both action classification and detection in still images.
Address Barcelona
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Antonio Lopez;Joost Van de Weijer
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (up) Medium
Area Expedition Conference
Notes ADAS Approved no
Call Number Admin @ si @ Rao2013 Serial 2281
Permanent link to this record
 

 
Author Javier Marin
Title Pedestrian Detection Based on Local Experts Type Book Whole
Year 2013 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract During the last decade vision-based human detection systems have started to play a key rolein multiple applications linked to driver assistance, surveillance, robot sensing and home automation.
Detecting humans is by far one of the most challenging tasks in Computer Vision.
This is mainly due to the high degree of variability in the human appearanceassociated to
the clothing, pose, shape and size. Besides, other factors such as cluttered scenarios, partial occlusions, or environmental conditions can make the detection task even harder.
Most promising methods of the state-of-the-art rely on discriminative learning paradigms which are fed with positive and negative examples. The training data is one of the most
relevant elements in order to build a robust detector as it has to cope the large variability of the target. In order to create this dataset human supervision is required. The drawback at this point is the arduous effort of annotating as well as looking for such claimed variability.
In this PhD thesis we address two recurrent problems in the literature. In the first stage,we aim to reduce the consuming task of annotating, namely, by using computer graphics.
More concretely, we develop a virtual urban scenario for later generating a pedestrian dataset.
Then, we train a detector using this dataset, and finally we assess if this detector can be successfully applied in a real scenario.
In the second stage, we focus on increasing the robustness of our pedestrian detectors
under partial occlusions. In particular, we present a novel occlusion handling approach to increase the performance of block-based holistic methods under partial occlusions. For this purpose, we make use of local experts via a RandomSubspaceMethod (RSM) to handle these cases. If the method infers a possible partial occlusion, then the RSM, based on performance statistics obtained from partially occluded data, is applied. The last objective of this thesis
is to propose a robust pedestrian detector based on an ensemble of local experts. To achieve this goal, we use the random forest paradigm, where the trees act as ensembles an their nodesare the local experts. In particular, each expert focus on performing a robust classification ofa pedestrian body patch. This approach offers computational efficiency and far less design complexity when compared to other state-of-the-artmethods, while reaching better accuracy
Address Barcelona
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Antonio Lopez;Jaume Amores
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (up) Medium
Area Expedition Conference
Notes ADAS Approved no
Call Number Admin @ si @ Mar2013 Serial 2280
Permanent link to this record
 

 
Author Wenjuan Gong
Title 3D Motion Data aided Human Action Recognition and Pose Estimation Type Book Whole
Year 2013 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract In this work, we explore human action recognition and pose estimation prob-
lems. Different from traditional works of learning from 2D images or video
sequences and their annotated output, we seek to solve the problems with ad-
ditional 3D motion capture information, which helps to fill the gap between 2D
image features and human interpretations.
We first compare two different schools of approaches commonly used for 3D
pose estimation from 2D pose configuration: modeling and learning methods.
By looking into experiments results and considering our problems, we fixed a
learning method as the following approaches to do pose estimation. We then
establish a framework by adding a module of detecting 2D pose configuration
from images with varied background, which widely extend the application of
the approach. We also seek to directly estimate 3D poses from image features,
instead of estimating 2D poses as a intermediate module. We explore a robust
input feature, which combined with the proposed distance measure, provides
a solution for noisy or corrupted inputs. We further utilize the above method
to estimate weak poses,which is a concise representation of the original poses
by using dimension deduction technologies, from image features. Weak pose
space is where we calculate vocabulary and label action types using a bog of
words pipeline. Temporal information of an action is taken into consideration by
considering several consecutive frames as a single unit for computing vocabulary
and histogram assignments.
Address Barcelona
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Jordi Gonzalez;Xavier Roca
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (up) Medium
Area Expedition Conference
Notes ISE Approved no
Call Number Admin @ si @ Gon2013 Serial 2279
Permanent link to this record
 

 
Author Murad Al Haj
Title Looking at Faces: Detection, Tracking and Pose Estimation Type Book Whole
Year 2013 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Humans can effortlessly perceive faces, follow them over space and time, and decode their rich content, such as pose, identity and expression. However, despite many decades of research on automatic facial perception in areas like face detection, expression recognition, pose estimation and face recognition, and despite many successes, a complete solution remains elusive. This thesis is dedicated to three problems in automatic face perception, namely face detection, face tracking and pose estimation.

In face detection, an initial simple model is presented that uses pixel-based heuristics to segment skin locations and hand-crafted rules to determine the locations of the faces present in an image. Different colorspaces are studied to judge whether a colorspace transformation can aid skin color detection. The output of this study is used in the design of a more complex face detector that is able to successfully generalize to different scenarios.

In face tracking, a framework that combines estimation and control in a joint scheme is presented to track a face with a single pan-tilt-zoom camera. While this work is mainly motivated by tracking faces, it can be easily applied atop of any detector to track different objects. The applicability of this method is demonstrated on simulated as well as real-life scenarios.

The last and most important part of this thesis is dedicate to monocular head pose estimation. In this part, a method based on partial least squares (PLS) regression is proposed to estimate pose and solve the alignment problem simultaneously. The contributions of this work are two-fold: 1) demonstrating that the proposed method achieves better than state-of-the-art results on the estimation problem and 2) developing a technique to reduce misalignment based on the learned PLS factors that outperform multiple instance learning (MIL) without the need for any re-training or the inclusion of misaligned samples in the training process, as normally done in MIL.
Address Barcelona
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Jordi Gonzalez;Xavier Roca
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (up) Medium
Area Expedition Conference
Notes ISE Approved no
Call Number Admin @ si @ Haj2013 Serial 2278
Permanent link to this record
 

 
Author Albert Gordo
Title Document Image Representation, Classification and Retrieval in Large-Scale Domains Type Book Whole
Year 2013 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Despite the “paperless office” ideal that started in the decade of the seventies, businesses still strive against an increasing amount of paper documentation. Companies still receive huge amounts of paper documentation that need to be analyzed and processed, mostly in a manual way. A solution for this task consists in, first, automatically scanning the incoming documents. Then, document images can be analyzed and information can be extracted from the data. Documents can also be automatically dispatched to the appropriate workflows, used to retrieve similar documents in the dataset to transfer information, etc.

Due to the nature of this “digital mailroom”, we need document representation methods to be general, i.e., able to cope with very different types of documents. We need the methods to be sound, i.e., able to cope with unexpected types of documents, noise, etc. And, we need to methods to be scalable, i.e., able to cope with thousands or millions of documents that need to be processed, stored, and consulted. Unfortunately, current techniques of document representation, classification and retrieval are not apt for this digital mailroom framework, since they do not fulfill some or all of these requirements.

Through this thesis we focus on the problem of document representation aimed at classification and retrieval tasks under this digital mailroom framework. We first propose a novel document representation based on runlength histograms, and extend it to cope with more complex documents such as multiple-page documents, or documents that contain more sources of information such as extracted OCR text. Then we focus on the scalability requirements and propose a novel binarization method which we dubbed PCAE, as well as two general asymmetric distances between binary embeddings that can significantly improve the retrieval results at a minimal extra computational cost. Finally, we note the importance of supervised learning when performing large-scale retrieval, and study several approaches that can significantly boost the results at no extra cost at query time.
Address Barcelona
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Ernest Valveny;Florent Perronnin
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (up) Medium
Area Expedition Conference
Notes DAG Approved no
Call Number Admin @ si @ Gor2013 Serial 2277
Permanent link to this record
 

 
Author Jose Manuel Alvarez; Theo Gevers; Ferran Diego; Antonio Lopez
Title Road Geometry Classification by Adaptative Shape Models Type Journal Article
Year 2013 Publication IEEE Transactions on Intelligent Transportation Systems Abbreviated Journal TITS
Volume 14 Issue 1 Pages 459-468
Keywords road detection
Abstract Vision-based road detection is important for different applications in transportation, such as autonomous driving, vehicle collision warning, and pedestrian crossing detection. Common approaches to road detection are based on low-level road appearance (e.g., color or texture) and neglect of the scene geometry and context. Hence, using only low-level features makes these algorithms highly depend on structured roads, road homogeneity, and lighting conditions. Therefore, the aim of this paper is to classify road geometries for road detection through the analysis of scene composition and temporal coherence. Road geometry classification is proposed by building corresponding models from training images containing prototypical road geometries. We propose adaptive shape models where spatial pyramids are steered by the inherent spatial structure of road images. To reduce the influence of lighting variations, invariant features are used. Large-scale experiments show that the proposed road geometry classifier yields a high recognition rate of 73.57% ± 13.1, clearly outperforming other state-of-the-art methods. Including road shape information improves road detection results over existing appearance-based methods. Finally, it is shown that invariant features and temporal information provide robustness against disturbing imaging conditions.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1524-9050 ISBN (up) Medium
Area Expedition Conference
Notes ADAS;ISE Approved no
Call Number Admin @ si @ AGD2013;; ADAS @ adas @ Serial 2269
Permanent link to this record