|   | 
Details
   web
Records
Author Andreas Fischer; Volkmar Frinken; Horst Bunke; Ching Y. Suen
Title Improving HMM-Based Keyword Spotting with Character Language Models Type Conference Article
Year 2013 Publication 12th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume Issue Pages 506-510
Keywords
Abstract Facing high error rates and slow recognition speed for full text transcription of unconstrained handwriting images, keyword spotting is a promising alternative to locate specific search terms within scanned document images. We have previously proposed a learning-based method for keyword spotting using character hidden Markov models that showed a high performance when compared with traditional template image matching. In the lexicon-free approach pursued, only the text appearance was taken into account for recognition. In this paper, we integrate character n-gram language models into the spotting system in order to provide an additional language context. On the modern IAM database as well as the historical George Washington database, we demonstrate that character language models significantly improve the spotting performance.
Address Washington; USA; August 2013
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) 1520-5363 ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 600.045; 605.203 Approved no
Call Number Admin @ si @ FFB2013 Serial 2295
Permanent link to this record
 

 
Author Christophe Rigaud; Dimosthenis Karatzas; Joost Van de Weijer; Jean-Christophe Burie; Jean-Marc Ogier
Title An active contour model for speech balloon detection in comics Type Conference Article
Year 2013 Publication 12th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume Issue Pages 1240-1244
Keywords
Abstract Comic books constitute an important cultural heritage asset in many countries. Digitization combined with subsequent comic book understanding would enable a variety of new applications, including content-based retrieval and content retargeting. Document understanding in this domain is challenging as comics are semi-structured documents, combining semantically important graphical and textual parts. Few studies have been done in this direction. In this work we detail a novel approach for closed and non-closed speech balloon localization in scanned comic book pages, an essential step towards a fully automatic comic book understanding. The approach is compared with existing methods for closed balloon localization found in the literature and results are presented.
Address washington; USA; August 2013
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) 1520-5363 ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; CIC; 600.056 Approved no
Call Number Admin @ si @ RKW2013a Serial 2260
Permanent link to this record
 

 
Author Alicia Fornes; Xavier Otazu; Josep Llados
Title Show through cancellation and image enhancement by multiresolution contrast processing Type Conference Article
Year 2013 Publication 12th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume Issue Pages 200-204
Keywords
Abstract Historical documents suffer from different types of degradation and noise such as background variation, uneven illumination or dark spots. In case of double-sided documents, another common problem is that the back side of the document usually interferes with the front side because of the transparency of the document or ink bleeding. This effect is called the show through phenomenon. Many methods are developed to solve these problems, and in the case of show-through, by scanning and matching both the front and back sides of the document. In contrast, our approach is designed to use only one side of the scanned document. We hypothesize that show-trough are low contrast components, while foreground components are high contrast ones. A Multiresolution Contrast (MC) decomposition is presented in order to estimate the contrast of features at different spatial scales. We cancel the show-through phenomenon by thresholding these low contrast components. This decomposition is also able to enhance the image removing shadowed areas by weighting spatial scales. Results show that the enhanced images improve the readability of the documents, allowing scholars both to recover unreadable words and to solve ambiguities.
Address Washington; USA; August 2013
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) 1520-5363 ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 602.006; 600.045; 600.061; 600.052;CIC Approved no
Call Number Admin @ si @ FOL2013 Serial 2241
Permanent link to this record
 

 
Author David Aldavert; Marçal Rusiñol; Ricardo Toledo; Josep Llados
Title Integrating Visual and Textual Cues for Query-by-String Word Spotting Type Conference Article
Year 2013 Publication 12th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume Issue Pages 511 - 515
Keywords
Abstract In this paper, we present a word spotting framework that follows the query-by-string paradigm where word images are represented both by textual and visual representations. The textual representation is formulated in terms of character $n$-grams while the visual one is based on the bag-of-visual-words scheme. These two representations are merged together and projected to a sub-vector space. This transform allows to, given a textual query, retrieve word instances that were only represented by the visual modality. Moreover, this statistical representation can be used together with state-of-the-art indexation structures in order to deal with large-scale scenarios. The proposed method is evaluated using a collection of historical documents outperforming state-of-the-art performances.
Address Washington; USA; August 2013
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) 1520-5363 ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; ADAS; 600.045; 600.055; 600.061 Approved no
Call Number Admin @ si @ ART2013 Serial 2224
Permanent link to this record
 

 
Author Ferran Diego; Joan Serrat; Antonio Lopez
Title Joint spatio-temporal alignment of sequences Type Journal Article
Year 2013 Publication IEEE Transactions on Multimedia Abbreviated Journal TMM
Volume 15 Issue 6 Pages 1377-1387
Keywords video alignment
Abstract Video alignment is important in different areas of computer vision such as wide baseline matching, action recognition, change detection, video copy detection and frame dropping prevention. Current video alignment methods usually deal with a relatively simple case of fixed or rigidly attached cameras or simultaneous acquisition. Therefore, in this paper we propose a joint video alignment for bringing two video sequences into a spatio-temporal alignment. Specifically, the novelty of the paper is to formulate the video alignment to fold the spatial and temporal alignment into a single alignment framework. This simultaneously satisfies a frame-correspondence and frame-alignment similarity; exploiting the knowledge among neighbor frames by a standard pairwise Markov random field (MRF). This new formulation is able to handle the alignment of sequences recorded at different times by independent moving cameras that follows a similar trajectory, and also generalizes the particular cases that of fixed geometric transformation and/or linear temporal mapping. We conduct experiments on different scenarios such as sequences recorded simultaneously or by moving cameras to validate the robustness of the proposed approach. The proposed method provides the highest video alignment accuracy compared to the state-of-the-art methods on sequences recorded from vehicles driving along the same track at different times.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) 1520-9210 ISBN Medium
Area Expedition Conference
Notes ADAS Approved no
Call Number Admin @ si @ DSL2013; ADAS @ adas @ Serial 2228
Permanent link to this record
 

 
Author Jose Manuel Alvarez; Theo Gevers; Ferran Diego; Antonio Lopez
Title Road Geometry Classification by Adaptative Shape Models Type Journal Article
Year 2013 Publication IEEE Transactions on Intelligent Transportation Systems Abbreviated Journal TITS
Volume 14 Issue 1 Pages 459-468
Keywords road detection
Abstract Vision-based road detection is important for different applications in transportation, such as autonomous driving, vehicle collision warning, and pedestrian crossing detection. Common approaches to road detection are based on low-level road appearance (e.g., color or texture) and neglect of the scene geometry and context. Hence, using only low-level features makes these algorithms highly depend on structured roads, road homogeneity, and lighting conditions. Therefore, the aim of this paper is to classify road geometries for road detection through the analysis of scene composition and temporal coherence. Road geometry classification is proposed by building corresponding models from training images containing prototypical road geometries. We propose adaptive shape models where spatial pyramids are steered by the inherent spatial structure of road images. To reduce the influence of lighting variations, invariant features are used. Large-scale experiments show that the proposed road geometry classifier yields a high recognition rate of 73.57% ± 13.1, clearly outperforming other state-of-the-art methods. Including road shape information improves road detection results over existing appearance-based methods. Finally, it is shown that invariant features and temporal information provide robustness against disturbing imaging conditions.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) 1524-9050 ISBN Medium
Area Expedition Conference
Notes ADAS;ISE Approved no
Call Number Admin @ si @ AGD2013;; ADAS @ adas @ Serial 2269
Permanent link to this record
 

 
Author Javier Marin; David Vazquez; Antonio Lopez; Jaume Amores; Bastian Leibe
Title Random Forests of Local Experts for Pedestrian Detection Type Conference Article
Year 2013 Publication 15th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 2592 - 2599
Keywords ADAS; Random Forest; Pedestrian Detection
Abstract Pedestrian detection is one of the most challenging tasks in computer vision, and has received a lot of attention in the last years. Recently, some authors have shown the advantages of using combinations of part/patch-based detectors in order to cope with the large variability of poses and the existence of partial occlusions. In this paper, we propose a pedestrian detection method that efficiently combines multiple local experts by means of a Random Forest ensemble. The proposed method works with rich block-based representations such as HOG and LBP, in such a way that the same features are reused by the multiple local experts, so that no extra computational cost is needed with respect to a holistic method. Furthermore, we demonstrate how to integrate the proposed approach with a cascaded architecture in order to achieve not only high accuracy but also an acceptable efficiency. In particular, the resulting detector operates at five frames per second using a laptop machine. We tested the proposed method with well-known challenging datasets such as Caltech, ETH, Daimler, and INRIA. The method proposed in this work consistently ranks among the top performers in all the datasets, being either the best method or having a small difference with the best one.
Address Sydney; Australia; December 2013
Corporate Author Thesis
Publisher IEEE Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) 1550-5499 ISBN Medium
Area Expedition Conference ICCV
Notes ADAS; 600.057; 600.054 Approved no
Call Number ADAS @ adas @ MVL2013 Serial 2333
Permanent link to this record
 

 
Author Jon Almazan; Albert Gordo; Alicia Fornes; Ernest Valveny
Title Handwritten Word Spotting with Corrected Attributes Type Conference Article
Year 2013 Publication 15th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 1017-1024
Keywords
Abstract We propose an approach to multi-writer word spotting, where the goal is to find a query word in a dataset comprised of document images. We propose an attributes-based approach that leads to a low-dimensional, fixed-length representation of the word images that is fast to compute and, especially, fast to compare. This approach naturally leads to an unified representation of word images and strings, which seamlessly allows one to indistinctly perform query-by-example, where the query is an image, and query-by-string, where the query is a string. We also propose a calibration scheme to correct the attributes scores based on Canonical Correlation Analysis that greatly improves the results on a challenging dataset. We test our approach on two public datasets showing state-of-the-art results.
Address Sydney; Australia; December 2013
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) 1550-5499 ISBN Medium
Area Expedition Conference ICCV
Notes DAG Approved no
Call Number Admin @ si @ AGF2013 Serial 2327
Permanent link to this record
 

 
Author Gemma Roig; Xavier Boix; R. de Nijs; Sebastian Ramos; K. Kühnlenz; Luc Van Gool
Title Active MAP Inference in CRFs for Efficient Semantic Segmentation Type Conference Article
Year 2013 Publication 15th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 2312 - 2319
Keywords Semantic Segmentation
Abstract Most MAP inference algorithms for CRFs optimize an energy function knowing all the potentials. In this paper, we focus on CRFs where the computational cost of instantiating the potentials is orders of magnitude higher than MAP inference. This is often the case in semantic image segmentation, where most potentials are instantiated by slow classifiers fed with costly features. We introduce Active MAP inference 1) to on-the-fly select a subset of potentials to be instantiated in the energy function, leaving the rest of the parameters of the potentials unknown, and 2) to estimate the MAP labeling from such incomplete energy function. Results for semantic segmentation benchmarks, namely PASCAL VOC 2010 [5] and MSRC-21 [19], show that Active MAP inference achieves similar levels of accuracy but with major efficiency gains.
Address Sydney; Australia; December 2013
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) 1550-5499 ISBN Medium
Area Expedition Conference ICCV
Notes ADAS; 600.057 Approved no
Call Number ADAS @ adas @ RBN2013 Serial 2377
Permanent link to this record
 

 
Author Jorge Bernal; F. Javier Sanchez; Fernando Vilariño
Title Impact of Image Preprocessing Methods on Polyp Localization in Colonoscopy Frames Type Conference Article
Year 2013 Publication 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society Abbreviated Journal
Volume Issue Pages 7350 - 7354
Keywords
Abstract In this paper we present our image preprocessing methods as a key part of our automatic polyp localization scheme. These methods are used to assess the impact of different endoluminal scene elements when characterizing polyps. More precisely we tackle the influence of specular highlights, blood vessels and black mask surrounding the scene. Experimental results prove that the appropriate handling of these elements leads to a great improvement in polyp localization results.
Address Osaka; Japan; July 2013
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) 1557-170X ISBN Medium
Area 800 Expedition Conference EMBC
Notes MV; 600.047; 600.060;SIAI Approved no
Call Number Admin @ si @ BSV2013 Serial 2286
Permanent link to this record
 

 
Author Mariella Dimiccoli; Benoît Girard; Alain Berthoz; Daniel Bennequin
Title Striola Magica: a functional explanation of otolith organs Type Journal Article
Year 2013 Publication Journal of Computational Neuroscience Abbreviated Journal JCN
Volume 35 Issue 2 Pages 125-154
Keywords Otolith organs ;Striola; Vestibular pathway
Abstract Otolith end organs of vertebrates sense linear accelerations of the head and gravitation. The hair cells on their epithelia are responsible for transduction. In mammals, the striola, parallel to the line where hair cells reverse their polarization, is a narrow region centered on a curve with curvature and torsion. It has been shown that the striolar region is functionally different from the rest, being involved in a phasic vestibular pathway. We propose a mathematical and computational model that explains the necessity of this amazing geometry for the striola to be able to carry out its function. Our hypothesis, related to the biophysics of the hair cells and to the physiology of their afferent neurons, is that striolar afferents collect information from several type I hair cells to detect the jerk in a large domain of acceleration directions. This predicts a mean number of two calyces for afferent neurons, as measured in rodents. The domain of acceleration directions sensed by our striolar model is compatible with the experimental results obtained on monkeys considering all afferents. Therefore, the main result of our study is that phasic and tonic vestibular afferents cover the same geometrical fields, but at different dynamical and frequency domains.
Address
Corporate Author Thesis
Publisher Springer US Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) 1573-6873. 2013 ISBN Medium
Area Expedition Conference
Notes MILAB Approved no
Call Number Admin @ si @DBG2013 Serial 2787
Permanent link to this record
 

 
Author Ernest Valveny; Oriol Ramos Terrades; Joan Mas; Marçal Rusiñol
Title Interactive Document Retrieval and Classification. Type Book Chapter
Year 2013 Publication Multimodal Interaction in Image and Video Applications Abbreviated Journal
Volume 48 Issue Pages 17-30
Keywords
Abstract In this chapter we describe a system for document retrieval and classification following the interactive-predictive framework. In particular, the system addresses two different scenarios of document analysis: document classification based on visual appearance and logo detection. These two classical problems of document analysis are formulated following the interactive-predictive model, taking the user interaction into account to make easier the process of annotating and labelling the documents. A system implementing this model in a real scenario is presented and analyzed. This system also takes advantage of active learning techniques to speed up the task of labelling the documents.
Address
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor Angel Sappa; Jordi Vitria
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) 1868-4394 ISBN 978-3-642-35931-6 Medium
Area Expedition Conference
Notes DAG Approved no
Call Number Admin @ si @ VRM2013 Serial 2341
Permanent link to this record
 

 
Author Joost Van de Weijer; Fahad Shahbaz Khan; Marc Masana
Title Interactive Visual and Semantic Image Retrieval Type Book Chapter
Year 2013 Publication Multimodal Interaction in Image and Video Applications Abbreviated Journal
Volume 48 Issue Pages 31-35
Keywords
Abstract One direct consequence of recent advances in digital visual data generation and the direct availability of this information through the World-Wide Web, is a urgent demand for efficient image retrieval systems. The objective of image retrieval is to allow users to efficiently browse through this abundance of images. Due to the non-expert nature of the majority of the internet users, such systems should be user friendly, and therefore avoid complex user interfaces. In this chapter we investigate how high-level information provided by recently developed object recognition techniques can improve interactive image retrieval. Wel apply a bagof- word based image representation method to automatically classify images in a number of categories. These additional labels are then applied to improve the image retrieval system. Next to these high-level semantic labels, we also apply a low-level image description to describe the composition and color scheme of the scene. Both descriptions are incorporated in a user feedback image retrieval setting. The main objective is to show that automatic labeling of images with semantic labels can improve image retrieval results.
Address
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor Angel Sappa; Jordi Vitria
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) 1868-4394 ISBN 978-3-642-35931-6 Medium
Area Expedition Conference
Notes CIC; 605.203; 600.048 Approved no
Call Number Admin @ si @ WKC2013 Serial 2284
Permanent link to this record
 

 
Author Abel Gonzalez-Garcia; Robert Benavente; Olivier Penacchio; Javier Vazquez; Maria Vanrell; C. Alejandro Parraga
Title Coloresia: An Interactive Colour Perception Device for the Visually Impaired Type Book Chapter
Year 2013 Publication Multimodal Interaction in Image and Video Applications Abbreviated Journal
Volume 48 Issue Pages 47-66
Keywords
Abstract A significative percentage of the human population suffer from impairments in their capacity to distinguish or even see colours. For them, everyday tasks like navigating through a train or metro network map becomes demanding. We present a novel technique for extracting colour information from everyday natural stimuli and presenting it to visually impaired users as pleasant, non-invasive sound. This technique was implemented inside a Personal Digital Assistant (PDA) portable device. In this implementation, colour information is extracted from the input image and categorised according to how human observers segment the colour space. This information is subsequently converted into sound and sent to the user via speakers or headphones. In the original implementation, it is possible for the user to send its feedback to reconfigure the system, however several features such as these were not implemented because the current technology is limited.We are confident that the full implementation will be possible in the near future as PDA technology improves.
Address
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) 1868-4394 ISBN 978-3-642-35931-6 Medium
Area Expedition Conference
Notes CIC; 600.052; 605.203 Approved no
Call Number Admin @ si @ GBP2013 Serial 2266
Permanent link to this record
 

 
Author Michal Drozdzal; Santiago Segui; Petia Radeva; Carolina Malagelada; Fernando Azpiroz; Jordi Vitria
Title An Application for Efficient Error-Free Labeling of Medical Images Type Book Chapter
Year 2013 Publication Multimodal Interaction in Image and Video Applications Abbreviated Journal
Volume 48 Issue Pages 1-16
Keywords
Abstract In this chapter we describe an application for efficient error-free labeling of medical images. In this scenario, the compilation of a complete training set for building a realistic model of a given class of samples is not an easy task, making the process tedious and time consuming. For this reason, there is a need for interactive labeling applications that minimize the effort of the user while providing error-free labeling. We propose a new algorithm that is based on data similarity in feature space. This method actively explores data in order to find the best label-aligned clustering and exploits it to reduce the labeler effort, that is measured by the number of “clicks. Moreover, error-free labeling is guaranteed by the fact that all data and their labels proposals are visually revised by en expert.
Address
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) 1868-4394 ISBN 978-3-642-35931-6 Medium
Area Expedition Conference
Notes MILAB; OR;MV Approved no
Call Number Admin @ si @ DSR2013 Serial 2235
Permanent link to this record