|   | 
Details
   web
Records
Author Marçal Rusiñol; Volkmar Frinken; Dimosthenis Karatzas; Andrew Bagdanov; Josep Llados
Title Multimodal page classification in administrative document image streams Type Journal Article
Year 2014 Publication International Journal on Document Analysis and Recognition Abbreviated Journal IJDAR
Volume 17 Issue 4 Pages 331-341
Keywords Digital mail room; Multimodal page classification; Visual and textual document description
Abstract In this paper, we present a page classification application in a banking workflow. The proposed architecture represents administrative document images by merging visual and textual descriptions. The visual description is based on a hierarchical representation of the pixel intensity distribution. The textual description uses latent semantic analysis to represent document content as a mixture of topics. Several off-the-shelf classifiers and different strategies for combining visual and textual cues have been evaluated. A final step uses an n-gram model of the page stream allowing a finer-grained classification of pages. The proposed method has been tested in a real large-scale environment and we report results on a dataset of 70,000 pages.
Address (up)
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1433-2833 ISBN Medium
Area Expedition Conference
Notes DAG; LAMP; 600.056; 600.061; 601.240; 601.223; 600.077; 600.079 Approved no
Call Number Admin @ si @ RFK2014 Serial 2523
Permanent link to this record
 

 
Author Lorenzo Seidenari; Giuseppe Serra; Andrew Bagdanov; Alberto del Bimbo
Title Local pyramidal descriptors for image recognition Type Journal Article
Year 2014 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI
Volume 36 Issue 5 Pages 1033 - 1040
Keywords Object categorization; local features; kernel methods
Abstract In this paper we present a novel method to improve the flexibility of descriptor matching for image recognition by using local multiresolution
pyramids in feature space. We propose that image patches be represented at multiple levels of descriptor detail and that these levels be defined in terms of local spatial pooling resolution. Preserving multiple levels of detail in local descriptors is a way of hedging one’s bets on which levels will most relevant for matching during learning and recognition. We introduce the Pyramid SIFT (P-SIFT) descriptor and show that its use in four state-of-the-art image recognition pipelines improves accuracy and yields state-of-the-art results. Our technique is applicable independently of spatial pyramid matching and we show that spatial pyramids can be combined with local pyramids to obtain
further improvement.We achieve state-of-the-art results on Caltech-101
(80.1%) and Caltech-256 (52.6%) when compared to other approaches based on SIFT features over intensity images. Our technique is efficient and is extremely easy to integrate into image recognition pipelines.
Address (up)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0162-8828 ISBN Medium
Area Expedition Conference
Notes LAMP; 600.079 Approved no
Call Number Admin @ si @ SSB2014 Serial 2524
Permanent link to this record
 

 
Author Cristhian A. Aguilera-Carrasco
Title Evaluation of feature detectors and descriptors in VISIBLE-LWIR cross-spectral imaging Type Report
Year 2014 Publication CVC Technical Report Abbreviated Journal
Volume 177 Issue Pages
Keywords Multi-spectral; Cross-spectral; Visible-LWIR imaging; Multimodal.
Abstract This thesis evaluates the performance of different state-of-art feature detectors and descriptors algorithms in the Visible-LWIR cross-spectral scenario. The focus is to determine if current detector and descriptor algorithms can be used to match features between the LWIR spectrum and the visible spectrum in applications such as, visual odometry, object recognition, image registration and stereo vision. An outdoor cross-spectral dataset was created to evaluate the suitability of the different algorithms. The results
show that the tested algorithms are not suitable to the task of matching features across different spectra. The repeatability ratio was smaller than the 30 percent in the best case and in general matched features were not accurate located. Additionally, these results also suggest that is necessary to create new algorithms that take into account the nature of the different spectra, describing characteristics that exist in both spectra such as discontinuities.
Address (up)
Corporate Author Thesis Master's thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.076 Approved no
Call Number Admin @ si @Agu2014 Serial 2526
Permanent link to this record
 

 
Author Xim Cerda-Company; C. Alejandro Parraga; Xavier Otazu
Title Which tone-mapping is the best? A comparative study of tone-mapping perceived quality Type Abstract
Year 2014 Publication Perception Abbreviated Journal
Volume 43 Issue Pages 106
Keywords
Abstract Perception 43 ECVP Abstract Supplement
High-dynamic-range (HDR) imaging refers to the methods designed to increase the brightness dynamic range present in standard digital imaging techniques. This increase is achieved by taking the same picture under di erent exposure values and mapping the intensity levels into a single image by way of a tone-mapping operator (TMO). Currently, there is no agreement on how to evaluate the quality
of di erent TMOs. In this work we psychophysically evaluate 15 di erent TMOs obtaining rankings based on the perceived properties of the resulting tone-mapped images. We performed two di erent experiments on a CRT calibrated display using 10 subjects: (1) a study of the internal relationships between grey-levels and (2) a pairwise comparison of the resulting 15 tone-mapped images. In (1) observers internally matched the grey-levels to a reference inside the tone-mapped images and in the real scene. In (2) observers performed a pairwise comparison of the tone-mapped images alongside the real scene. We obtained two rankings of the TMOs according their performance. In (1) the best algorithm
was ICAM by J.Kuang et al (2007) and in (2) the best algorithm was a TMO by Krawczyk et al (2005). Our results also show no correlation between these two rankings.
Address (up)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECVP
Notes NEUROBIT; 600.074 Approved no
Call Number Admin @ si @ CPO2014 Serial 2527
Permanent link to this record
 

 
Author Noha Elfiky; Theo Gevers; Arjan Gijsenij; Jordi Gonzalez
Title Color Constancy using 3D Scene Geometry derived from a Single Image Type Journal Article
Year 2014 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP
Volume 23 Issue 9 Pages 3855-3868
Keywords
Abstract The aim of color constancy is to remove the effect of the color of the light source. As color constancy is inherently an ill-posed problem, most of the existing color constancy algorithms are based on specific imaging assumptions (e.g. grey-world and white patch assumption).
In this paper, 3D geometry models are used to determine which color constancy method to use for the different geometrical regions (depth/layer) found
in images. The aim is to classify images into stages (rough 3D geometry models). According to stage models; images are divided into stage regions using hard and soft segmentation. After that, the best color constancy methods is selected for each geometry depth. To this end, we propose a method to combine color constancy algorithms by investigating the relation between depth, local image statistics and color constancy. Image statistics are then exploited per depth to select the proper color constancy method. Our approach opens the possibility to estimate multiple illuminations by distinguishing
nearby light source from distant illuminations. Experiments on state-of-the-art data sets show that the proposed algorithm outperforms state-of-the-art
single color constancy algorithms with an improvement of almost 50% of median angular error. When using a perfect classifier (i.e, all of the test images are correctly classified into stages); the performance of the proposed method achieves an improvement of 52% of the median angular error compared to the best-performing single color constancy algorithm.
Address (up)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1057-7149 ISBN Medium
Area Expedition Conference
Notes ISE; 600.078 Approved no
Call Number Admin @ si @ EGG2014 Serial 2528
Permanent link to this record
 

 
Author Sergio Escalera; Xavier Baro; Jordi Gonzalez; Miguel Angel Bautista; Meysam Madadi; Miguel Reyes; Victor Ponce; Hugo Jair Escalante; Jaime Shotton; Isabelle Guyon
Title ChaLearn Looking at People Challenge 2014: Dataset and Results Type Conference Article
Year 2014 Publication ECCV Workshop on ChaLearn Looking at People Abbreviated Journal
Volume 8925 Issue Pages 459-473
Keywords Human Pose Recovery; Behavior Analysis; Action and in- teractions; Multi-modal gestures; recognition
Abstract This paper summarizes the ChaLearn Looking at People 2014 challenge data and the results obtained by the participants. The competition was split into three independent tracks: human pose recovery from RGB data, action and interaction recognition from RGB data sequences, and multi-modal gesture recognition from RGB-Depth sequences. For all the tracks, the goal was to perform user-independent recognition in sequences of continuous images using the overlapping Jaccard index as the evaluation measure. In this edition of the ChaLearn challenge, two large novel data sets were made publicly available and the Microsoft Codalab platform were used to manage the competition. Outstanding results were achieved in the three challenge tracks, with accuracy results of 0.20, 0.50, and 0.85 for pose recovery, action/interaction recognition, and multi-modal gesture recognition, respectively.
Address (up)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCVW
Notes HuPBA; ISE; 600.063;MV Approved no
Call Number Admin @ si @ EBG2014 Serial 2529
Permanent link to this record
 

 
Author Francisco Cruz; Oriol Ramos Terrades
Title EM-Based Layout Analysis Method for Structured Documents Type Conference Article
Year 2014 Publication 22nd International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 315-320
Keywords
Abstract In this paper we present a method to perform layout analysis in structured documents. We proposed an EM-based algorithm to fit a set of Gaussian mixtures to the different regions according to the logical distribution along the page. After the convergence, we estimate the final shape of the regions according
to the parameters computed for each component of the mixture. We evaluated our method in the task of record detection in a collection of historical structured documents and performed a comparison with other previous works in this task.
Address (up)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1051-4651 ISBN Medium
Area Expedition Conference ICPR
Notes DAG; 602.006; 600.061; 600.077 Approved no
Call Number Admin @ si @ CrR2014 Serial 2530
Permanent link to this record
 

 
Author Lluis Pere de las Heras; Ernest Valveny; Gemma Sanchez
Title Unsupervised and Notation-Independent Wall Segmentation in Floor Plans Using a Combination of Statistical and Structural Strategies Type Book Chapter
Year 2014 Publication Graphics Recognition. Current Trends and Challenges Abbreviated Journal
Volume 8746 Issue Pages 109-121
Keywords Graphics recognition; Floor plan analysis; Object segmentation
Abstract In this paper we present a wall segmentation approach in floor plans that is able to work independently to the graphical notation, does not need any pre-annotated data for learning, and is able to segment multiple-shaped walls such as beams and curved-walls. This method results from the combination of the wall segmentation approaches [3, 5] presented recently by the authors. Firstly, potential straight wall segments are extracted in an unsupervised way similar to [3], but restricting even more the wall candidates considered in the original approach. Then, based on [5], these segments are used to learn the texture pattern of walls and spot the lost instances. The presented combination of both methods has been tested on 4 available datasets with different notations and compared qualitatively and quantitatively to the state-of-the-art applied on these collections. Additionally, some qualitative results on floor plans directly downloaded from the Internet are reported in the paper. The overall performance of the method demonstrates either its adaptability to different wall notations and shapes, and to document qualities and resolutions.
Address (up)
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-662-44853-3 Medium
Area Expedition Conference
Notes DAG; ADAS; 600.076; 600.077 Approved no
Call Number Admin @ si @ HVS2014 Serial 2535
Permanent link to this record
 

 
Author Lluis Pere de las Heras; David Fernandez; Alicia Fornes; Ernest Valveny; Gemma Sanchez; Josep Llados
Title Runlength Histogram Image Signature for Perceptual Retrieval of Architectural Floor Plans Type Book Chapter
Year 2014 Publication Graphics Recognition. Current Trends and Challenges Abbreviated Journal
Volume 8746 Issue Pages 135-146
Keywords Graphics recognition; Graphics retrieval; Image classification
Abstract This paper proposes a runlength histogram signature as a perceptual descriptor of architectural plans in a retrieval scenario. The style of an architectural drawing is characterized by the perception of lines, shapes and texture. Such visual stimuli are the basis for defining semantic concepts as space properties, symmetry, density, etc. We propose runlength histograms extracted in vertical, horizontal and diagonal directions as a characterization of line and space properties in floorplans, so it can be roughly associated to a description of walls and room structure. A retrieval application illustrates the performance of the proposed approach, where given a plan as a query, similar ones are obtained from a database. A ground truth based on human observation has been constructed to validate the hypothesis. Additional retrieval results on sketched building’s facades are reported qualitatively in this paper. Its good description and its adaptability to two different sketch drawings despite its simplicity shows the interest of the proposed approach and opens a challenging research line in graphics recognition.
Address (up)
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-662-44853-3 Medium
Area Expedition Conference
Notes DAG; ADAS; 600.045; 600.056; 600.061; 600.076; 600.077 Approved no
Call Number Admin @ si @ HFF2014 Serial 2536
Permanent link to this record
 

 
Author Lluis Gomez; Dimosthenis Karatzas
Title Scene Text Recognition: No Country for Old Men? Type Conference Article
Year 2014 Publication 1st International Workshop on Robust Reading Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address (up)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference IWRR
Notes DAG; 600.077 Approved no
Call Number Admin @ si @ GoK2014c Serial 2538
Permanent link to this record
 

 
Author Xavier Perez Sala; Fernando De la Torre; Laura Igual; Sergio Escalera; Cecilio Angulo
Title Subspace Procrustes Analysis Type Conference Article
Year 2014 Publication ECCV Workshop on ChaLearn Looking at People Abbreviated Journal
Volume 8925 Issue Pages 654-668
Keywords
Abstract Procrustes Analysis (PA) has been a popular technique to align and build 2-D statistical models of shapes. Given a set of 2-D shapes PA is applied to remove rigid transformations. Then, a non-rigid 2-D model is computed by modeling (e.g., PCA) the residual. Although PA has been widely used, it has several limitations for modeling 2-D shapes: occluded landmarks and missing data can result in local minima solutions, and there is no guarantee that the 2-D shapes provide a uniform sampling of the 3-D space of rotations for the object. To address previous issues, this paper proposes Subspace PA (SPA). Given several instances of a 3-D object, SPA computes the mean and a 2-D subspace that can simultaneously model all rigid and non-rigid deformations of the 3-D object. We propose a discrete (DSPA) and continuous (CSPA) formulation for SPA, assuming that 3-D samples of an object are provided. DSPA extends the traditional PA, and produces unbiased 2-D models by uniformly sampling di erent views of the 3-D object. CSPA provides a continuous approach to uniformly sample the space of 3-D rotations, being more ecient in space and time. Experiments using SPA to learn 2-D models of bodies from motion capture data illustrate the bene ts of our approach.
Address (up)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCVW
Notes OR; HuPBA;MILAB Approved no
Call Number Admin @ si @ PTI2014 Serial 2539
Permanent link to this record
 

 
Author Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades
Title Spotting Symbol Using Sparsity over Learned Dictionary of Local Descriptors Type Conference Article
Year 2014 Publication 11th IAPR International Workshop on Document Analysis and Systems Abbreviated Journal
Volume Issue Pages 156-160
Keywords
Abstract This paper proposes a new approach to spot symbols into graphical documents using sparse representations. More specifically, a dictionary is learned from a training database of local descriptors defined over the documents. Following their sparse representations, interest points sharing similar properties are used to define interest regions. Using an original adaptation of information retrieval techniques, a vector model for interest regions and for a query symbol is built based on its sparsity in a visual vocabulary where the visual words are columns in the learned dictionary. The matching process is performed comparing the similarity between vector models. Evaluation on SESYD datasets demonstrates that our method is promising.
Address (up)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-4799-3243-6 Medium
Area Expedition Conference DAS
Notes DAG; 600.077 Approved no
Call Number Admin @ si @ DTR2014 Serial 2543
Permanent link to this record
 

 
Author Frederic Sampedro; Anna Domenech; Sergio Escalera
Title Static and dynamic computational cancer spread quantification in whole body FDG-PET/CT scans Type Journal Article
Year 2014 Publication Journal of Medical Imaging and Health Informatics Abbreviated Journal JMIHI
Volume 4 Issue 6 Pages 825-831
Keywords CANCER SPREAD; COMPUTER AIDED DIAGNOSIS; MEDICAL IMAGING; TUMOR QUANTIFICATION
Abstract In this work we address the computational cancer spread quantification scenario in whole body FDG-PET/CT scans. At the static level, this setting can be modeled as a clustering problem on the set of 3D connected components of the whole body PET tumoral segmentation mask carried out by nuclear medicine physicians. At the dynamic level, and ad-hoc algorithm is proposed in order to quantify the cancer spread time evolution which, when combined with other existing indicators, gives rise to the metabolic tumor volume-aggressiveness-spread time evolution chart, a novel tool that we claim that would prove useful in nuclear medicine and oncological clinical or research scenarios. Good performance results of the proposed methodologies both at the clinical and technological level are shown using a dataset of 48 segmented whole body FDG-PET/CT scans.
Address (up)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA;MILAB Approved no
Call Number Admin @ si @ SDE2014b Serial 2548
Permanent link to this record
 

 
Author Frederic Sampedro; Sergio Escalera; Anna Puig
Title Iterative Multiclass Multiscale Stacked Sequential Learning: definition and application to medical volume segmentation Type Journal Article
Year 2014 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 46 Issue Pages 1-10
Keywords Machine learning; Sequential learning; Multi-class problems; Contextual learning; Medical volume segmentation
Abstract In this work we present the iterative multi-class multi-scale stacked sequential learning framework (IMMSSL), a novel learning scheme that is particularly suited for medical volume segmentation applications. This model exploits the inherent voxel contextual information of the structures of interest in order to improve its segmentation performance results. Without any feature set or learning algorithm prior assumption, the proposed scheme directly seeks to learn the contextual properties of a region from the predicted classifications of previous classifiers within an iterative scheme. Performance results regarding segmentation accuracy in three two-class and multi-class medical volume datasets show a significant improvement with respect to state of the art alternatives. Due to its easiness of implementation and its independence of feature space and learning algorithm, the presented machine learning framework could be taken into consideration as a first choice in complex volume segmentation scenarios.
Address (up)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA;MILAB Approved no
Call Number Admin @ si @ SEP2014 Serial 2550
Permanent link to this record
 

 
Author Eloi Puertas; Miguel Angel Bautista; Daniel Sanchez; Sergio Escalera; Oriol Pujol
Title Learning to Segment Humans by Stacking their Body Parts, Type Conference Article
Year 2014 Publication ECCV Workshop on ChaLearn Looking at People Abbreviated Journal
Volume 8925 Issue Pages 685-697
Keywords Human body segmentation; Stacked Sequential Learning
Abstract Human segmentation in still images is a complex task due to the wide range of body poses and drastic changes in environmental conditions. Usually, human body segmentation is treated in a two-stage fashion. First, a human body part detection step is performed, and then, human part detections are used as prior knowledge to be optimized by segmentation strategies. In this paper, we present a two-stage scheme based on Multi-Scale Stacked Sequential Learning (MSSL). We define an extended feature set by stacking a multi-scale decomposition of body
part likelihood maps. These likelihood maps are obtained in a first stage
by means of a ECOC ensemble of soft body part detectors. In a second stage, contextual relations of part predictions are learnt by a binary classifier, obtaining an accurate body confidence map. The obtained confidence map is fed to a graph cut optimization procedure to obtain the final segmentation. Results show improved segmentation when MSSL is included in the human segmentation pipeline.
Address (up)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCVW
Notes HuPBA;MILAB Approved no
Call Number Admin @ si @ PBS2014 Serial 2553
Permanent link to this record