Records
Author Marçal Rusiñol; J. Chazalon; Jean-Marc Ogier
Title Combining Focus Measure Operators to Predict OCR Accuracy in Mobile-Captured Document Images Type Conference Article
Year 2014 Publication 11th IAPR International Workshop on Document Analysis and Systems Abbreviated Journal
Volume Issue Pages 181 - 185
Keywords
Abstract Mobile document image acquisition is a new trend raising serious issues in business document processing workflows. Such a digitization procedure is unreliable and introduces many distortions, which must be detected as soon as possible, on the mobile device, to avoid paying data transmission fees and losing information because a temporarily available document cannot be re-captured later. In this context, out-of-focus blur is a major issue: users have no direct control over it, and it seriously degrades OCR recognition. In this paper, we concentrate on the estimation of focus quality, to ensure sufficient legibility of a document image for OCR processing. We propose two contributions to improve OCR accuracy prediction for mobile-captured document images. First, we present 24 focus measures, never tested on document images, which are fast to compute and require no training. Second, we show that a combination of those measures enables state-of-the-art performance regarding the correlation with OCR accuracy. The resulting approach is fast, robust, and easy to implement on a mobile device. Experiments are performed on a public dataset, and precise details about image processing are given.
Address Tours; France; April 2014
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-4799-3243-6 Medium
Area Expedition Conference DAS
Notes DAG; 601.223; 600.077 Approved no
Call Number Admin @ si @ RCO2014a Serial 2545
Permanent link to this record
 

 
Author Marçal Rusiñol; J. Chazalon; Jean-Marc Ogier
Title Normalisation et validation d'images de documents capturées en mobilité Type Conference Article
Year 2014 Publication Colloque International Francophone sur l'Écrit et le Document Abbreviated Journal
Volume Issue Pages 109-124
Keywords mobile document image acquisition; perspective correction; illumination correction; quality assessment; focus measure; OCR accuracy prediction
Abstract Mobile document image acquisition introduces many distortions which must be corrected or detected on the device, before the document becomes unavailable or data transmission fees are incurred. In this paper, we propose a system to correct perspective and illumination issues, and to estimate the sharpness of the image for OCR recognition. The correction step relies on fast and accurate border detection followed by illumination normalization. Its evaluation on a private dataset shows a clear improvement in OCR accuracy. The quality assessment step relies on a combination of focus measures. Its evaluation on a public dataset shows that this simple method compares well to state-of-the-art learning-based methods, which cannot be embedded on a mobile device, and outperforms metric-based methods.
Address Nancy; France; March 2014
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CIFED
Notes DAG; 601.223; 600.077 Approved no
Call Number Admin @ si @ RCO2014b Serial 2546
Permanent link to this record
 

 
Author Eloi Puertas; Miguel Angel Bautista; Daniel Sanchez; Sergio Escalera; Oriol Pujol
Title Learning to Segment Humans by Stacking their Body Parts Type Conference Article
Year 2014 Publication ECCV Workshop on ChaLearn Looking at People Abbreviated Journal
Volume 8925 Issue Pages 685-697
Keywords Human body segmentation; Stacked Sequential Learning
Abstract Human segmentation in still images is a complex task due to the wide range of body poses and drastic changes in environmental conditions. Usually, human body segmentation is treated in a two-stage fashion. First, a human body part detection step is performed, and then the part detections are used as prior knowledge to be optimized by segmentation strategies. In this paper, we present a two-stage scheme based on Multi-Scale Stacked Sequential Learning (MSSL). We define an extended feature set by stacking a multi-scale decomposition of body part likelihood maps. These likelihood maps are obtained in a first stage by means of an ECOC ensemble of soft body part detectors. In a second stage, contextual relations of part predictions are learnt by a binary classifier, obtaining an accurate body confidence map. The obtained confidence map is fed to a graph cut optimization procedure to obtain the final segmentation. Results show improved segmentation when MSSL is included in the human segmentation pipeline.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCVW
Notes HuPBA;MILAB Approved no
Call Number Admin @ si @ PBS2014 Serial 2553
Permanent link to this record
 

 
Author Marc Bolaños; Maite Garolera; Petia Radeva
Title Video Segmentation of Life-Logging Videos Type Conference Article
Year 2014 Publication 8th Conference on Articulated Motion and Deformable Objects Abbreviated Journal
Volume 8563 Issue Pages 1-9
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference AMDO
Notes MILAB Approved no
Call Number Admin @ si @ BGR2014 Serial 2558
Permanent link to this record
 

 
Author Francesco Brughi; Debora Gil; Llorenç Badiella; Eva Jove Casabella; Oriol Ramos Terrades
Title Exploring the impact of inter-query variability on the performance of retrieval systems Type Conference Article
Year 2014 Publication 11th International Conference on Image Analysis and Recognition Abbreviated Journal
Volume 8814 Issue Pages 413–420
Keywords
Abstract This paper introduces a framework for evaluating the performance of information retrieval systems. Current evaluation metrics provide an average score that does not consider performance variability across the query set. As a result, conclusions lack any statistical significance, yielding poor inference to cases outside the query set and possibly unfair comparisons. We propose to apply statistical methods in order to obtain a more informative measure for problems in which different query classes can be identified. In this context, we assess the performance variability on two levels: overall variability across the whole query set and specific query class-related variability. To this end, we estimate confidence bands for precision-recall curves, and we apply ANOVA in order to assess the significance of the performance across different query classes.
Address Algarve; Portugal; October 2014
Corporate Author Thesis
Publisher Springer International Publishing Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-319-11757-7 Medium
Area Expedition Conference ICIAR
Notes IAM; DAG; 600.060; 600.061; 600.077; 600.075 Approved no
Call Number Admin @ si @ BGB2014 Serial 2559
Permanent link to this record
 

 
Author Marcelo D. Pistarelli; Angel Sappa; Ricardo Toledo
Title Multispectral Stereo Image Correspondence Type Conference Article
Year 2013 Publication 15th International Conference on Computer Analysis of Images and Patterns Abbreviated Journal
Volume 8048 Issue Pages 217-224
Keywords
Abstract This paper presents a novel multispectral stereo image correspondence approach. It is evaluated using a stereo rig constructed with a visible spectrum camera and a long wave infrared spectrum camera. The novelty of the proposed approach lies in the use of Hough space as a correspondence search domain. In this way it avoids searching for correspondences in the original multispectral image domains, where information is weakly correlated, and a common domain is used instead. The proposed approach is intended to be used in outdoor urban scenarios, where images contain a large number of edges. These edges are used as distinctive characteristics for the matching in the Hough space. Experimental results are provided showing the validity of the proposed approach.
Address York; UK; August 2013
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-40245-6 Medium
Area Expedition Conference CAIP
Notes ADAS; 600.055 Approved no
Call Number Admin @ si @ PST2013 Serial 2561
Permanent link to this record
 

 
Author Gioacchino Vino; Angel Sappa
Title Revisiting Harris Corner Detector Algorithm: a Gradual Thresholding Approach Type Conference Article
Year 2013 Publication 10th International Conference on Image Analysis and Recognition Abbreviated Journal
Volume 7950 Issue Pages 354-363
Keywords
Abstract This paper presents an adaptive thresholding approach intended to increase the number of detected corners, while reducing the number of those corresponding to noisy data. The proposed approach uses the classical Harris corner detector algorithm and overcomes the difficulty of finding a general threshold that works well for all the images in a given data set by proposing a novel adaptive thresholding scheme. Initially, two thresholds are used to discern between strong corners and flat regions. Then, a region-based criterion is used to discriminate between weak corners and noisy points in the midway interval. Experimental results show that the proposed approach has a better capability to reject false corners and, at the same time, to detect weak ones. Comparisons with the state of the art are provided showing the validity of the proposed approach.
Address Póvoa de Varzim; Portugal; June 2013
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-39093-7 Medium
Area Expedition Conference ICIAR
Notes ADAS; 600.055 Approved no
Call Number Admin @ si @ ViS2013 Serial 2562
Permanent link to this record
 

 
Author Alejandro Gonzalez Alzate; Gabriel Villalonga; Jiaolong Xu; David Vazquez; Jaume Amores; Antonio Lopez
Title Multiview Random Forest of Local Experts Combining RGB and LIDAR data for Pedestrian Detection Type Conference Article
Year 2015 Publication IEEE Intelligent Vehicles Symposium IV2015 Abbreviated Journal
Volume Issue Pages 356-361
Keywords Pedestrian Detection
Abstract Despite recent significant advances, pedestrian detection continues to be an extremely challenging problem in real scenarios. In order to develop a detector that successfully operates under these conditions, it becomes critical to leverage multiple cues, multiple imaging modalities and a strong multi-view classifier that accounts for different pedestrian views and poses. In this paper we provide an extensive evaluation that gives insight into how each of these aspects (multi-cue, multimodality and strong multi-view classifier) affects performance both individually and when integrated together. In the multimodality component we explore the fusion of RGB and depth maps obtained by high-definition LIDAR, a type of modality that is only recently starting to receive attention. As our analysis reveals, although all the aforementioned aspects significantly help in improving the performance, the fusion of visible spectrum and depth information boosts the accuracy by a much larger margin. The resulting detector not only ranks among the top performers in the challenging KITTI benchmark, but is also built upon very simple blocks that are easy to implement and computationally efficient. These simple blocks can be easily replaced with more sophisticated ones recently proposed, such as the use of convolutional neural networks for feature representation, to further improve the accuracy.
Address Seoul; Korea; June 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area ACDC Expedition Conference IV
Notes ADAS; 600.076; 600.057; 600.054 Approved no
Call Number ADAS @ adas @ GVX2015 Serial 2625
Permanent link to this record
 

 
Author P. Wang; V. Eglin; C. Garcia; C. Largeron; Josep Llados; Alicia Fornes
Title Représentation par graphe de mots manuscrits dans les images pour la recherche par similarité Type Conference Article
Year 2014 Publication Colloque International Francophone sur l'Écrit et le Document Abbreviated Journal
Volume Issue Pages 233-248
Keywords word spotting; graph-based representation; shape context description; graph edit distance; DTW; block merging; query by example
Abstract Effective information retrieval on handwritten document images has always been a challenging task. In this paper, we propose a novel handwritten word spotting approach based on graph representation. The presented model comprises both topological and morphological signatures of handwriting. Skeleton-based graphs with Shape Context labeled vertices are established for connected components. Each word image is represented as a sequence of graphs. In order to be robust to handwriting variations, an exhaustive merging process based on DTW alignment results is introduced in the similarity measure between word images. With respect to computational complexity, an approximate graph edit distance approach using bipartite matching is employed for graph matching. The experiments on the George Washington dataset and the marriage records from the Barcelona Cathedral dataset demonstrate that the proposed approach outperforms the state-of-the-art structural methods.
Address Nancy; France; March 2014
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CIFED
Notes DAG; 600.061; 602.006; 600.077 Approved no
Call Number Admin @ si @ WEG2014c Serial 2564
Permanent link to this record
 

 
Author Michal Drozdzal; Jordi Vitria; Santiago Segui; Carolina Malagelada; Fernando Azpiroz; Petia Radeva
Title Intestinal event segmentation for endoluminal video analysis Type Conference Article
Year 2014 Publication 21st IEEE International Conference on Image Processing Abbreviated Journal
Volume Issue Pages 3592 - 3596
Keywords
Abstract
Address Paris; France; October 2014
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICIP
Notes MILAB; OR;MV Approved no
Call Number Admin @ si @ DVS2014 Serial 2565
Permanent link to this record
 

 
Author Alicia Fornes; V.C.Kieu; M. Visani; N.Journet; Anjan Dutta
Title The ICDAR/GREC 2013 Music Scores Competition: Staff Removal Type Book Chapter
Year 2014 Publication Graphics Recognition. Current Trends and Challenges Abbreviated Journal
Volume 8746 Issue Pages 207-220
Keywords Competition; Graphics recognition; Music scores; Writer identification; Staff removal
Abstract The first competition on music scores, organized at ICDAR and GREC in 2011, awoke the interest of researchers, who participated in both staff removal and writer identification tasks. In this second edition, we focus on the staff removal task and simulate a real case scenario concerning old and degraded music scores. For this purpose, we have generated a new set of semi-synthetic images using two degradation models that we previously introduced: local noise and 3D distortions. In this extended paper we provide a detailed description of the dataset, degradation models, evaluation metrics, the participants’ methods and the obtained results that could not be presented in the ICDAR and GREC proceedings due to page limitations.
Address
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor B.Lamiroy; J.-M. Ogier
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-662-44853-3 Medium
Area Expedition Conference
Notes DAG; 600.077; 600.061 Approved no
Call Number Admin @ si @ FKV2014 Serial 2581
Permanent link to this record
 

 
Author G.Thorvaldsen; Joana Maria Pujadas-Mora; T.Andersen; L.Eikvil; Josep Llados; Alicia Fornes; Anna Cabre
Title A Tale of two Transcriptions Type Journal
Year 2015 Publication Historical Life Course Studies Abbreviated Journal
Volume 2 Issue Pages 1-19
Keywords Nominative Sources; Census; Vital Records; Computer Vision; Optical Character Recognition; Word Spotting
Abstract non-indexed
This article explains how two projects implement semi-automated transcription routines: for census sheets in Norway and marriage protocols from Barcelona. The Spanish system was created to transcribe the marriage license books from 1451 to 1905 for the Barcelona area; one of the world’s longest series of preserved vital records. Thus, in the Project “Five Centuries of Marriages” (5CofM) at the Autonomous University of Barcelona’s Center for Demographic Studies, the Barcelona Historical Marriage Database has been built. More than 600,000 records were transcribed by 150 transcribers working online. The Norwegian material is cross-sectional as it is the 1891 census, recorded on one sheet per person. This format and the underlining of keywords for several variables made it more feasible to semi-automate data entry than when many persons are listed on the same page. While Optical Character Recognition (OCR) for printed text is scientifically mature, computer vision research is now focused on more difficult problems such as handwriting recognition. In the marriage project, document analysis methods have been proposed to automatically recognize the marriage licenses. Fully automatic recognition is still a challenge, but some promising results have been obtained. In Spain, Norway and elsewhere the source material is available as scanned pictures on the Internet, opening up the possibility for further international cooperation concerning automating the transcription of historic source materials. Like what is being done in projects to digitize printed materials, the optimal solution is likely to be a combination of manual transcription and machine-assisted recognition also for hand-written sources.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 2352-6343 ISBN Medium
Area Expedition Conference
Notes DAG; 600.077; 602.006 Approved no
Call Number Admin @ si @ TPA2015 Serial 2582
Permanent link to this record
 

 
Author Jiaolong Xu; Sebastian Ramos; David Vazquez; Antonio Lopez
Title DA-DPM Pedestrian Detection Type Conference Article
Year 2013 Publication ICCV Workshop on Reconstruction meets Recognition Abbreviated Journal
Volume Issue Pages
Keywords Domain Adaptation; Pedestrian Detection
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCVW-RR
Notes ADAS Approved no
Call Number Admin @ si @ XRV2013 Serial 2569
Permanent link to this record
 

 
Author Gabriel Villalonga; Sebastian Ramos; German Ros; David Vazquez; Antonio Lopez
Title 3d Pedestrian Detection via Random Forest Type Miscellaneous
Year 2014 Publication European Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 231-238
Keywords Pedestrian Detection
Abstract Our demo focuses on showing the extraordinary performance of our novel 3D pedestrian detector along with its simplicity and real-time capabilities. This detector has been designed for autonomous driving applications, but it can also be applied in other scenarios that cover both outdoor and indoor applications.
Our pedestrian detector is based on the combination of a random forest classifier with HOG-LBP features and the inclusion of a preprocessing stage based on 3D scene information in order to precisely determine the image regions where the detector should search for pedestrians. This approach results in a highly accurate system that runs in real time, as required by many computer vision and robotics applications.
Address Zurich; Switzerland; September 2014
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCV-Demo
Notes ADAS; 600.076 Approved no
Call Number Admin @ si @ VRR2014 Serial 2570
Permanent link to this record
 

 
Author Antonio Clavelli
Title A computational model of eye guidance, searching for text in real scene images Type Book Whole
Year 2014 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Searching for text objects in real scene images is an open problem and a very active computer vision research area. Many methods have been proposed that tackle text search either as an extension of methods from the document analysis field or inspired by general-purpose object detection methods. However, the general problem of object search in real scene images remains extremely challenging due to the huge variability in object appearance. This thesis builds on the most recent findings in the visual attention literature, presenting a novel computational model of eye guidance that aims to better describe text object search in real scene images.
First, the relevant state-of-the-art results from the visual attention literature regarding eye movements and visual search are presented. Relevant models of attention are discussed and integrated with recent observations on the role of top-down constraints and the emerging need for a layered model of attention in which saliency is not the only factor guiding attention. Visual attention is then explained by the interaction of several modulating factors, such as objects, value, plans and saliency. Then we introduce our probabilistic formulation of attention deployment in real scenes. The model is based on the rationale that oculomotor control depends on two interacting but distinct processes: an attentional process that assigns value to the sources of information and a motor process that flexibly links information with action.
In such a framework, the choice of where to look next is task-dependent and oriented to classes of objects embedded within pictures of complex scenes. The dependence on task is taken into account by exploiting the value and the reward of gazing at certain image patches or proto-objects that provide a sparse representation of the scene objects.
In the experimental section the model is tested under laboratory conditions, comparing model simulations with data from eye tracking experiments. The comparison is qualitative in terms of observable scan paths and quantitative in terms of statistical similarity of gaze shift amplitude. Experiments are performed using eye tracking data both from a publicly available dataset of faces and text and from newly performed eye-tracking experiments on a dataset of street view pictures containing text. The last part of this thesis is dedicated to studying the extent to which the proposed model can account for human eye movements in a weakly constrained setting. We used a mobile eye tracking device and a purpose-built methodology to compare model-simulated eye data with the human eye data from mobile eye tracking recordings. Such a setting allows the model to be tested under incomplete visual information, reproducing a close to real-life search task.
Address
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Dimosthenis Karatzas;Giuseppe Boccignone;Josep Llados
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-940902-6-4 Medium
Area Expedition Conference
Notes DAG; 600.077 Approved no
Call Number Admin @ si @ Cla2014 Serial 2571
Permanent link to this record