Author Marçal Rusiñol; Volkmar Frinken; Dimosthenis Karatzas; Andrew Bagdanov; Josep Llados
Title Multimodal page classification in administrative document image streams Type Journal Article
Year 2014 Publication International Journal on Document Analysis and Recognition Abbreviated Journal IJDAR
Volume 17 Issue 4 Pages 331-341
Keywords Digital mail room; Multimodal page classification; Visual and textual document description
Abstract In this paper, we present a page classification application in a banking workflow. The proposed architecture represents administrative document images by merging visual and textual descriptions. The visual description is based on a hierarchical representation of the pixel intensity distribution. The textual description uses latent semantic analysis to represent document content as a mixture of topics. Several off-the-shelf classifiers and different strategies for combining visual and textual cues have been evaluated. A final step uses an n-gram model of the page stream allowing a finer-grained classification of pages. The proposed method has been tested in a real large-scale environment and we report results on a dataset of 70,000 pages.
Address
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1433-2833 ISBN Medium
Area Expedition Conference
Notes DAG; LAMP; 600.056; 600.061; 601.240; 601.223; 600.077; 600.079 Approved no
Call Number Admin @ si @ RFK2014 Serial 2523
 

 
Author Lluis Pere de las Heras; Oriol Ramos Terrades; Sergi Robles; Gemma Sanchez
Title CVC-FP and SGT: a new database for structural floor plan analysis and its groundtruthing tool Type Journal Article
Year 2015 Publication International Journal on Document Analysis and Recognition Abbreviated Journal IJDAR
Volume 18 Issue 1 Pages 15-30
Keywords
Abstract Recent results on structured learning methods have shown the impact of structural information in a wide range of pattern recognition tasks. In the field of document image analysis, there is long experience with structural methods for the analysis and information extraction of multiple types of documents. Yet the lack of conveniently annotated, freely accessible databases has hindered progress in some areas, such as technical drawing understanding. In this paper, we present a floor plan database, named CVC-FP, that is annotated for the architectural objects and their structural relations. To construct this database, we have implemented a groundtruthing tool, the SGT tool, that allows this sort of information to be specified in a natural manner. This tool has been made for general-purpose groundtruthing: it allows users to define their own object classes and properties, offers multiple labeling options, supports cooperative work, and provides user and version control. Finally, we have collected some of the recent work on floor plan interpretation and present a quantitative benchmark for this database. Both the CVC-FP database and the SGT tool are freely released to the research community to ease comparisons between methods and boost reproducible research.
Address
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1433-2833 ISBN Medium
Area Expedition Conference
Notes DAG; ADAS; 600.061; 600.076; 600.077 Approved no
Call Number Admin @ si @ HRR2015 Serial 2567
 

 
Author Christophe Rigaud; Clement Guerin; Dimosthenis Karatzas; Jean-Christophe Burie; Jean-Marc Ogier
Title Knowledge-driven understanding of images in comic books Type Journal Article
Year 2015 Publication International Journal on Document Analysis and Recognition Abbreviated Journal IJDAR
Volume 18 Issue 3 Pages 199-221
Keywords Document Understanding; comics analysis; expert system
Abstract Document analysis is an active field of research, which can attain a complete understanding of the semantics of a given document. One example of the document understanding process is enabling a computer to identify the key elements of a comic book story and arrange them according to predefined domain knowledge. In this study, we propose a knowledge-driven system that can interact with bottom-up and top-down information to progressively understand the content of a document. We model the knowledge of both the comic book domain and the image processing domain for information consistency analysis. In addition, different image processing methods are improved or developed to extract panels, balloons, tails, texts, comic characters and their semantic relations in an unsupervised way.
Address
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1433-2833 ISBN Medium
Area Expedition Conference
Notes DAG; 600.056; 600.077 Approved no
Call Number RGK2015 Serial 2595
 

 
Author David Aldavert; Marçal Rusiñol; Ricardo Toledo; Josep Llados
Title A Study of Bag-of-Visual-Words Representations for Handwritten Keyword Spotting Type Journal Article
Year 2015 Publication International Journal on Document Analysis and Recognition Abbreviated Journal IJDAR
Volume 18 Issue 3 Pages 223-234
Keywords Bag-of-Visual-Words; Keyword spotting; Handwritten documents; Performance evaluation
Abstract The Bag-of-Visual-Words (BoVW) framework has gained popularity among the document image analysis community, specifically as a representation of handwritten words for recognition or spotting purposes. Although in the computer vision field the BoVW method has been greatly improved, most of the approaches in the document image analysis domain still rely on the basic implementation of the BoVW method, disregarding these recent refinements. In this paper, we present a review of those improvements and their application to the keyword spotting task. We thoroughly evaluate their impact against a baseline system on the well-known George Washington dataset and compare the obtained results against nine state-of-the-art keyword spotting methods. In addition, we compare both the baseline and improved systems with the methods presented at the Handwritten Keyword Spotting Competition 2014.
Address
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1433-2833 ISBN Medium
Area Expedition Conference
Notes DAG; ADAS; 600.055; 600.061; 601.223; 600.077; 600.097 Approved no
Call Number Admin @ si @ ART2015 Serial 2679
 

 
Author Alejandro Gonzalez Alzate; Zhijie Fang; Yainuvis Socarras; Joan Serrat; David Vazquez; Jiaolong Xu; Antonio Lopez
Title Pedestrian Detection at Day/Night Time with Visible and FIR Cameras: A Comparison Type Journal Article
Year 2016 Publication Sensors Abbreviated Journal SENS
Volume 16 Issue 6 Pages 820
Keywords Pedestrian Detection; FIR
Abstract Despite all the significant advances in pedestrian detection brought by computer vision for driving assistance, it is still a challenging problem. One reason is the extremely varying lighting conditions under which such a detector should operate, namely day and night time. Recent research has shown that the combination of visible and non-visible imaging modalities may increase detection accuracy, where the infrared spectrum plays a critical role. The goal of this paper is to assess the accuracy gain of different pedestrian models (holistic, part-based, patch-based) when training with images in the far infrared spectrum. Specifically, we want to compare detection accuracy on test images recorded at day and nighttime if trained (and tested) using (a) plain color images, (b) just infrared images and (c) both of them. In order to obtain results for the last item we propose an early fusion approach to combine features from both modalities. We base the evaluation on a new dataset we have built for this purpose as well as on the publicly available KAIST multispectral dataset.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1424-8220 ISBN Medium
Area Expedition Conference
Notes ADAS; 600.085; 600.076; 600.082; 601.281 Approved no
Call Number ADAS @ adas @ GFS2016 Serial 2754
 

 
Author Marçal Rusiñol; Lluis Pere de las Heras; Oriol Ramos Terrades
Title Flowchart Recognition for Non-Textual Information Retrieval in Patent Search Type Journal Article
Year 2014 Publication Information Retrieval Abbreviated Journal IR
Volume 17 Issue 5-6 Pages 545-562
Keywords Flowchart recognition; Patent documents; Text/graphics separation; Raster-to-vector conversion; Symbol recognition
Abstract Relatively little research has been done on the topic of patent image retrieval and in general in most of the approaches the retrieval is performed in terms of a similarity measure between the query image and the images in the corpus. However, systems aimed at overcoming the semantic gap between the visual description of patent images and their conveyed concepts would be very helpful for patent professionals. In this paper we present a flowchart recognition method aimed at achieving a structured representation of flowchart images that can be further queried semantically. The proposed method was submitted to the CLEF-IP 2012 flowchart recognition task. We report the obtained results on this dataset.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1386-4564 ISBN Medium
Area Expedition Conference
Notes DAG; 600.077 Approved no
Call Number Admin @ si @ RHR2013 Serial 2342
 

 
Author Bogdan Raducanu; D. Gatica-Perez
Title Inferring competitive role patterns in reality TV show through nonverbal analysis Type Journal Article
Year 2012 Publication Multimedia Tools and Applications Abbreviated Journal MTAP
Volume 56 Issue 1 Pages 207-226
Keywords
Abstract This paper introduces a new facet of social media, namely that depicting social interaction. More concretely, we address this problem from the perspective of nonverbal behavior-based analysis of competitive meetings. For our study, we made use of “The Apprentice” reality TV show, which features a competition for a real, highly paid corporate job. Our analysis is centered around two tasks regarding a person's role in a meeting: predicting the person with the highest status, and predicting the fired candidates. We address this problem by adopting both supervised and unsupervised strategies. The current study was carried out using nonverbal audio cues. Our approach is based only on the nonverbal interaction dynamics during the meeting without relying on the spoken words. The analysis is based on two types of data: individual and relational measures. Results obtained from the analysis of a full season of the show are promising (up to 85.7% of accuracy in the first case and up to 92.8% in the second case). Our approach has been conveniently compared with the Influence Model, demonstrating its superiority.
Address
Corporate Author Thesis
Publisher Elsevier Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1380-7501 ISBN Medium
Area Expedition Conference
Notes OR;MV Approved no
Call Number BCNPCL @ bcnpcl @ RaG2012 Serial 1360
 

 
Author Palaiahnakote Shivakumara; Anjan Dutta; Chew Lim Tan; Umapada Pal
Title Multi-oriented scene text detection in video based on wavelet and angle projection boundary growing Type Journal Article
Year 2014 Publication Multimedia Tools and Applications Abbreviated Journal MTAP
Volume 72 Issue 1 Pages 515-539
Keywords
Abstract In this paper, we address two complex issues: 1) Text frame classification and 2) Multi-oriented text detection in video text frame. We first divide a video frame into 16 blocks and propose a combination of wavelet and median-moments with k-means clustering at the block level to identify probable text blocks. For each probable text block, the method applies the same combination of feature with k-means clustering over a sliding window running through the blocks to identify potential text candidates. We introduce a new idea of symmetry on text candidates in each block based on the observation that pixel distribution in text exhibits a symmetric pattern. The method integrates all blocks containing text candidates in the frame and then all text candidates are mapped on to a Sobel edge map of the original frame to obtain text representatives. To tackle the multi-orientation problem, we present a new method called Angle Projection Boundary Growing (APBG) which is an iterative algorithm and works based on a nearest neighbor concept. APBG is then applied on the text representatives to fix the bounding box for multi-oriented text lines in the video frame. Directional information is used to eliminate false positives. Experimental results on a variety of datasets such as non-horizontal, horizontal, publicly available data (Hua’s data) and ICDAR-03 competition data (camera images) show that the proposed method outperforms existing methods proposed for video and the state of the art methods for scene text as well.
Address
Corporate Author Thesis
Publisher Springer US Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1380-7501 ISBN Medium
Area Expedition Conference
Notes DAG; 600.077 Approved no
Call Number Admin @ si @ SDT2014 Serial 2357
 

 
Author Cesar Isaza; Joaquin Salas; Bogdan Raducanu
Title Rendering ground truth data sets to detect shadows cast by static objects in outdoors Type Journal Article
Year 2014 Publication Multimedia Tools and Applications Abbreviated Journal MTAP
Volume 70 Issue 1 Pages 557-571
Keywords Synthetic ground truth data set; Sun position; Shadow detection; Static objects shadow detection
Abstract In our work, we are particularly interested in studying the shadows cast by static objects in outdoor environments, during daytime. To assess the accuracy of a shadow detection algorithm, we need ground truth information. The collection of such information is a very tedious task because it is a process that requires manual annotation. To overcome this severe limitation, we propose in this paper a methodology to automatically render ground truth using a virtual environment. To increase the degree of realism and usefulness of the simulated environment, we incorporate in the scenario the precise longitude, latitude and elevation of the actual location of the object, as well as the sun’s position for a given time and day. To evaluate our method, we consider a qualitative and a quantitative comparison. In the quantitative one, we analyze the shadow cast by a real object in a particular geographical location and its corresponding rendered model. To evaluate qualitatively the methodology, we use some ground truth images obtained both manually and automatically.
Address
Corporate Author Thesis
Publisher Springer US Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1380-7501 ISBN Medium
Area Expedition Conference
Notes LAMP; Approved no
Call Number Admin @ si @ ISR2014 Serial 2229
 

 
Author Naveen Onkarappa; Angel Sappa
Title Synthetic sequences and ground-truth flow field generation for algorithm validation Type Journal Article
Year 2015 Publication Multimedia Tools and Applications Abbreviated Journal MTAP
Volume 74 Issue 9 Pages 3121-3135
Keywords Ground-truth optical flow; Synthetic sequence; Algorithm validation
Abstract Research in computer vision is advancing by the availability of good datasets that help to improve algorithms, validate results and obtain comparative analysis. The datasets can be real or synthetic. For some of the computer vision problems such as optical flow it is not possible to obtain ground-truth optical flow with high accuracy in natural outdoor real scenarios directly by any sensor, although it is possible to obtain ground-truth data of real scenarios in a laboratory setup with limited motion. In this difficult situation computer graphics offers a viable option for creating realistic virtual scenarios. In the current work we present a framework to design virtual scenes and generate sequences as well as ground-truth flow fields. Particularly, we generate a dataset containing sequences of driving scenarios. The sequences in the dataset vary in different speeds of the on-board vision system, different road textures, complex motion of vehicle and independent moving vehicles in the scene. This dataset enables analyzing and adaptation of existing optical flow methods, and leads to invention of new approaches particularly for driver assistance systems.
Address
Corporate Author Thesis
Publisher Springer US Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1380-7501 ISBN Medium
Area Expedition Conference
Notes ADAS; 600.055; 601.215; 600.076 Approved no
Call Number Admin @ si @ OnS2014b Serial 2472
 

 
Author Svebor Karaman; Andrew Bagdanov; Lea Landucci; Gianpaolo D'Amico; Andrea Ferracani; Daniele Pezzatini; Alberto del Bimbo
Title Personalized multimedia content delivery on an interactive table by passive observation of museum visitors Type Journal Article
Year 2016 Publication Multimedia Tools and Applications Abbreviated Journal MTAP
Volume 75 Issue 7 Pages 3787-3811
Keywords Computer vision; Video surveillance; Cultural heritage; Multimedia museum; Personalization; Natural interaction; Passive profiling
Abstract The amount of multimedia data collected in museum databases is growing fast, while the capacity of museums to display information to visitors is acutely limited by physical space. Museums must seek the perfect balance of information given on individual pieces in order to provide sufficient information to aid visitor understanding while maintaining sparse usage of the walls and guaranteeing high appreciation of the exhibit. Moreover, museums often target the interests of average visitors instead of the entire spectrum of different interests each individual visitor might have. Finally, visiting a museum should not be an experience contained in the physical space of the museum but a door opened onto a broader context of related artworks, authors, artistic trends, etc. In this paper we describe the MNEMOSYNE system that attempts to address these issues through a new multimedia museum experience. Based on passive observation, the system builds a profile of the artworks of interest for each visitor. These profiles of interest are then used to drive an interactive table that personalizes multimedia content delivery. The natural user interface on the interactive table uses the visitor’s profile, an ontology of museum content and a recommendation system to personalize exploration of multimedia content. At the end of their visit, the visitor can take home a personalized summary of their visit on a custom mobile application. In this article we describe in detail each component of our approach as well as the first field trials of our prototype system built and deployed at our permanent exhibition space at LeMurate (http://www.lemurate.comune.fi.it/lemurate/) in Florence together with the first results of the evaluation process during the official installation in the National Museum of Bargello (http://www.uffizi.firenze.it/musei/?m=bargello).
Address
Corporate Author Thesis
Publisher Springer US Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1380-7501 ISBN Medium
Area Expedition Conference
Notes LAMP; 601.240; 600.079 Approved no
Call Number Admin @ si @ KBL2016 Serial 2520
 

 
Author Antonio Lopez; Joan Serrat; Cristina Cañero; Felipe Lumbreras; T. Graf
Title Robust lane markings detection and road geometry computation Type Journal Article
Year 2010 Publication International Journal of Automotive Technology Abbreviated Journal IJAT
Volume 11 Issue 3 Pages 395-407
Keywords lane markings
Abstract Detection of lane markings based on a camera sensor can be a low-cost solution to lane departure and curve-over-speed warnings. A number of methods and implementations have been reported in the literature. However, reliable detection is still an issue because of, for example, cast shadows, worn and occluded markings, and variable ambient lighting conditions. We focus on increasing detection reliability in two ways. First, we employed an image feature other than the commonly used edges: ridges, which we claim addresses this problem better. Second, we adapted RANSAC, a generic robust estimation method, to fit a parametric model of a pair of lane lines to the image features, based on both ridgeness and ridge orientation. In addition, the model was fitted for the left and right lane lines simultaneously to enforce a consistent result. Four measures of interest for driver assistance applications were directly computed from the fitted parametric model at each frame: lane width, lane curvature, and vehicle yaw angle and lateral offset with regard to the lane medial axis. We qualitatively assessed our method in video sequences captured on several road types and under very different lighting conditions. We also quantitatively assessed it on synthetic but realistic video sequences for which road geometry and vehicle trajectory ground truth are known.
Address
Corporate Author Thesis
Publisher The Korean Society of Automotive Engineers Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1229-9138 ISBN Medium
Area Expedition Conference
Notes ADAS Approved no
Call Number ADAS @ adas @ LSC2010 Serial 1300
 

 
Author Mikhail Mozerov; Ignasi Rius; Xavier Roca; Jordi Gonzalez
Title Nonlinear synchronization for automatic learning of 3D pose variability in human motion sequences Type Journal Article
Year 2010 Publication EURASIP Journal on Advances in Signal Processing Abbreviated Journal EURASIPJ
Volume Issue Pages
Keywords
Abstract Article ID 507247
A dense matching algorithm that solves the problem of synchronizing prerecorded human motion sequences, which show different speeds and accelerations, is proposed. The approach is based on minimization of MRF energy and solves the problem by using Dynamic Programming. Additionally, an optimal sequence is automatically selected from the input dataset to be a time-scale pattern for all other sequences. The paper utilizes an action specific model which automatically learns the variability of 3D human postures observed in a set of training sequences. The model is trained using the public CMU motion capture dataset for the walking action, and a mean walking performance is automatically learnt. Additionally, statistics about the observed variability of the postures and motion direction are also computed at each time step. The synchronized motion sequences are used to learn a model of human motion for action recognition and full-body tracking purposes.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1110-8657 ISBN Medium
Area Expedition Conference
Notes ISE Approved no
Call Number ISE @ ise @ MRR2010 Serial 1208
 

 
Author Sergio Escalera; Oriol Pujol; Petia Radeva; Jordi Vitria; Maria Teresa Anguera
Title Automatic Detection of Dominance and Expected Interest Type Journal Article
Year 2010 Publication EURASIP Journal on Advances in Signal Processing Abbreviated Journal EURASIPJ
Volume Issue Pages 12
Keywords
Abstract Article ID 491819
Social Signal Processing is an emergent area of research that focuses on the analysis of social constructs. Dominance and interest are two of these social constructs. Dominance refers to the level of influence a person has in a conversation. Interest, in the context of group interactions, can be defined as the degree of engagement that the members of a group collectively display during their interaction. In this paper, we argue that, using only behavioral motion information, we are able to predict the interest of observers when looking at face-to-face interactions as well as the dominant people. First, we propose a simple set of movement-based features from body, face, and mouth activity in order to define a higher set of interaction indicators. The considered indicators are manually annotated by observers. Based on the opinions obtained, we define an automatic binary dominance detection problem and a multiclass interest quantification problem. The Error-Correcting Output Codes framework is used to learn to rank the perceived observer's interest in face-to-face interactions, while Adaboost is used to solve the dominance detection problem. The automatic system shows good correlation between the automatic categorization results and the manual ranking made by the observers in both dominance and interest detection problems.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1110-8657 ISBN Medium
Area Expedition Conference
Notes OR;MILAB;HUPBA;MV Approved no
Call Number BCNPCL @ bcnpcl @ EPR2010d Serial 1283
 

 
Author Ariel Amato; Mikhail Mozerov; Xavier Roca; Jordi Gonzalez
Title Robust Real-Time Background Subtraction Based on Local Neighborhood Patterns Type Journal Article
Year 2010 Publication EURASIP Journal on Advances in Signal Processing Abbreviated Journal EURASIPJ
Volume Issue Pages 7
Keywords
Abstract Article ID 901205
This paper describes an efficient background subtraction technique for detecting moving objects. The proposed approach is able to overcome difficulties like illumination changes and moving shadows. Our method introduces two discriminative features based on angular and modular patterns, which are formed by similarity measurement between two sets of RGB color vectors: one belonging to the background image and the other to the current image. We show how these patterns are used to improve foreground detection in the presence of moving shadows and in the case when there are strong similarities in color between background and foreground pixels. Experimental results over a collection of public and own datasets of real image sequences demonstrate that the proposed technique achieves a superior performance compared with state-of-the-art methods. Furthermore, both the low computational and space complexities make the presented algorithm feasible for real-time applications.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1110-8657 ISBN Medium
Area Expedition Conference
Notes ISE Approved no
Call Number ISE @ ise @ AMR2010 Serial 1463