Publicacions CVC -- Query Results

	106–120 of 148 records found matching your query:

	Records
	Author	E. Bondi; L. Seidenari; Andrew Bagdanov; Alberto del Bimbo
	Title	Real-time people counting from depth imagery of crowded environments			Type	Conference Article
	Year	2014	Publication	11th IEEE International Conference on Advanced Video and Signal based Surveillance	Abbreviated Journal
	Volume		Issue		Pages	337 - 342
	Keywords
	Abstract	In this paper we describe a system for automatic people counting in crowded environments. The approach we propose is a counting-by-detection method based on depth imagery. It is designed to be deployed as an autonomous appliance for crowd analysis in video surveillance application scenarios. Our system performs foreground/background segmentation on depth image streams in order to coarsely segment persons, then depth information is used to localize head candidates which are then tracked in time on an automatically estimated ground plane. The system runs in real-time, at a frame-rate of about 20 fps. We collected a dataset of RGB-D sequences representing three typical and challenging surveillance scenarios, including crowds, queuing and groups. An extensive comparative evaluation is given between our system and more complex, Latent SVM-based head localization for person counting applications.
	Address	Seoul; Korea; August 2014
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	AVSS
	Notes	LAMP; 600.079			Approved	no
	Call Number	Admin @ si @ BSB2014			Serial	2540



	Author	Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades
	Title	Spotting Symbol Using Sparsity over Learned Dictionary of Local Descriptors			Type	Conference Article
	Year	2014	Publication	11th IAPR International Workshop on Document Analysis and Systems	Abbreviated Journal
	Volume		Issue		Pages	156-160
	Keywords
	Abstract	This paper proposes a new approach to spotting symbols in graphical documents using sparse representations. More specifically, a dictionary is learned from a training database of local descriptors defined over the documents. Following their sparse representations, interest points sharing similar properties are used to define interest regions. Using an original adaptation of information retrieval techniques, a vector model for interest regions and for a query symbol is built based on its sparsity in a visual vocabulary where the visual words are columns in the learned dictionary. The matching process is performed by comparing the similarity between vector models. Evaluation on the SESYD datasets demonstrates that our method is promising.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-1-4799-3243-6	Medium
	Area		Expedition		Conference	DAS
	Notes	DAG; 600.077			Approved	no
	Call Number	Admin @ si @ DTR2014			Serial	2543



	Author	Marçal Rusiñol; J. Chazalon; Jean-Marc Ogier
	Title	Combining Focus Measure Operators to Predict OCR Accuracy in Mobile-Captured Document Images			Type	Conference Article
	Year	2014	Publication	11th IAPR International Workshop on Document Analysis and Systems	Abbreviated Journal
	Volume		Issue		Pages	181 - 185
	Keywords
	Abstract	Mobile document image acquisition is a new trend raising serious issues in business document processing workflows. Such a digitization procedure is unreliable and introduces many distortions which must be detected as soon as possible, on the mobile device, to avoid paying data transmission fees and losing information because a temporarily available document cannot be re-captured later. In this context, out-of-focus blur is a major issue: users have no direct control over it, and it seriously degrades OCR recognition. In this paper, we concentrate on the estimation of focus quality, to ensure sufficient legibility of a document image for OCR processing. We propose two contributions to improve OCR accuracy prediction for mobile-captured document images. First, we present 24 focus measures, never tested on document images, which are fast to compute and require no training. Second, we show that a combination of those measures enables state-of-the-art performance regarding the correlation with OCR accuracy. The resulting approach is fast, robust, and easy to implement on a mobile device. Experiments are performed on a public dataset, and precise details about image processing are given.
	Address	Tours; France; April 2014
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-1-4799-3243-6	Medium
	Area		Expedition		Conference	DAS
	Notes	DAG; 601.223; 600.077			Approved	no
	Call Number	Admin @ si @ RCO2014a			Serial	2545



	Author	Marçal Rusiñol; J. Chazalon; Jean-Marc Ogier
	Title	Normalisation et validation d'images de documents capturées en mobilité			Type	Conference Article
	Year	2014	Publication	Colloque International Francophone sur l'Écrit et le Document	Abbreviated Journal
	Volume		Issue		Pages	109-124
	Keywords	mobile document image acquisition; perspective correction; illumination correction; quality assessment; focus measure; OCR accuracy prediction
	Abstract	Mobile document image acquisition introduces many distortions which must be corrected or detected on the device, before the document becomes unavailable or data transmission fees are incurred. In this paper, we propose a system to correct perspective and illumination issues, and to estimate the sharpness of the image for OCR recognition. The correction step relies on fast and accurate border detection followed by illumination normalization. Its evaluation on a private dataset shows a clear improvement in OCR accuracy. The quality assessment step relies on a combination of focus measures. Its evaluation on a public dataset shows that this simple method compares well to state-of-the-art learning-based methods, which cannot be embedded on a mobile device, and outperforms metric-based methods.
	Address	Nancy; France; March 2014
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CIFED
	Notes	DAG; 601.223; 600.077			Approved	no
	Call Number	Admin @ si @ RCO2014b			Serial	2546



	Author	Frederic Sampedro; Anna Domenech; Sergio Escalera
	Title	Static and dynamic computational cancer spread quantification in whole body FDG-PET/CT scans			Type	Journal Article
	Year	2014	Publication	Journal of Medical Imaging and Health Informatics	Abbreviated Journal	JMIHI
	Volume	4	Issue	6	Pages	825-831
	Keywords	CANCER SPREAD; COMPUTER AIDED DIAGNOSIS; MEDICAL IMAGING; TUMOR QUANTIFICATION
	Abstract	In this work we address the computational cancer spread quantification scenario in whole body FDG-PET/CT scans. At the static level, this setting can be modeled as a clustering problem on the set of 3D connected components of the whole body PET tumoral segmentation mask carried out by nuclear medicine physicians. At the dynamic level, an ad hoc algorithm is proposed in order to quantify the time evolution of the cancer spread which, when combined with other existing indicators, gives rise to the metabolic tumor volume-aggressiveness-spread time evolution chart, a novel tool that we claim would prove useful in nuclear medicine and oncological clinical or research scenarios. Good performance results of the proposed methodologies at both the clinical and technological level are shown using a dataset of 48 segmented whole body FDG-PET/CT scans.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HuPBA;MILAB			Approved	no
	Call Number	Admin @ si @ SDE2014b			Serial	2548



	Author	Frederic Sampedro; Sergio Escalera; Anna Puig
	Title	Iterative Multiclass Multiscale Stacked Sequential Learning: definition and application to medical volume segmentation			Type	Journal Article
	Year	2014	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
	Volume	46	Issue		Pages	1-10
	Keywords	Machine learning; Sequential learning; Multi-class problems; Contextual learning; Medical volume segmentation
	Abstract	In this work we present the iterative multi-class multi-scale stacked sequential learning framework (IMMSSL), a novel learning scheme that is particularly suited for medical volume segmentation applications. This model exploits the inherent voxel contextual information of the structures of interest in order to improve its segmentation performance. Without any prior assumption about the feature set or learning algorithm, the proposed scheme directly seeks to learn the contextual properties of a region from the predicted classifications of previous classifiers within an iterative scheme. Performance results regarding segmentation accuracy on three two-class and multi-class medical volume datasets show a significant improvement with respect to state-of-the-art alternatives. Due to its ease of implementation and its independence from feature space and learning algorithm, the presented machine learning framework could be taken into consideration as a first choice in complex volume segmentation scenarios.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HuPBA;MILAB			Approved	no
	Call Number	Admin @ si @ SEP2014			Serial	2550



	Author	Marc Bolaños; Maite Garolera; Petia Radeva
	Title	Video Segmentation of Life-Logging Videos			Type	Conference Article
	Year	2014	Publication	8th Conference on Articulated Motion and Deformable Objects	Abbreviated Journal
	Volume	8563	Issue		Pages	1-9
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	AMDO
	Notes	MILAB			Approved	no
	Call Number	Admin @ si @ BGR2014			Serial	2558



	Author	Francisco Blanco; Felipe Lumbreras; Joan Serrat; Roswitha Siener; Silvia Serranti; Giuseppe Bonifazi; Montserrat Lopez Mesas; Manuel Valiente
	Title	Taking advantage of Hyperspectral Imaging classification of urinary stones against conventional IR Spectroscopy			Type	Journal Article
	Year	2014	Publication	Journal of Biomedical Optics	Abbreviated Journal	JBiO
	Volume	19	Issue	12	Pages	126004-1 - 126004-9
	Keywords
	Abstract	The analysis of urinary stones is mandatory for the best management of the disease after the stone passage in order to prevent further stone episodes. Thus the use of an appropriate methodology for an individualized stone analysis becomes a key factor for giving the patient the most suitable treatment. A recently developed hyperspectral imaging methodology, based on pixel-to-pixel analysis of near-infrared spectral images, is compared to the reference technique in stone analysis, infrared (IR) spectroscopy. The developed classification model yields a >90% correct classification rate when compared to IR and is able to precisely locate stone components within the structure of the stone with a 15 µm resolution. Because of the minimal sample pretreatment, short analysis time, good performance of the model, and automation of the measurements, which makes them analyst independent, this methodology can be considered for routine analysis in clinical laboratories.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	ADAS; 600.076			Approved	no
	Call Number	Admin @ si @ BLS2014			Serial	2563



	Author	P. Wang; V. Eglin; C. Garcia; C. Largeron; Josep Llados; Alicia Fornes
	Title	Représentation par graphe de mots manuscrits dans les images pour la recherche par similarité			Type	Conference Article
	Year	2014	Publication	Colloque International Francophone sur l'Écrit et le Document	Abbreviated Journal
	Volume		Issue		Pages	233-248
	Keywords	word spotting; graph-based representation; shape context description; graph edit distance; DTW; block merging; query by example
	Abstract	Effective information retrieval on handwritten document images has always been a challenging task. In this paper, we propose a novel handwritten word spotting approach based on graph representation. The presented model comprises both topological and morphological signatures of handwriting. Skeleton-based graphs with Shape Context labeled vertices are established for connected components. Each word image is represented as a sequence of graphs. In order to be robust to handwriting variations, an exhaustive merging process based on DTW alignment results is introduced in the similarity measure between word images. With respect to computational complexity, an approximate graph edit distance approach using bipartite matching is employed for graph matching. The experiments on the George Washington dataset and the marriage records from the Barcelona Cathedral dataset demonstrate that the proposed approach outperforms the state-of-the-art structural methods.
	Address	Nancy; France; March 2014
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CIFED
	Notes	DAG; 600.061; 602.006; 600.077			Approved	no
	Call Number	Admin @ si @ WEG2014c			Serial	2564



	Author	Michal Drozdzal; Jordi Vitria; Santiago Segui; Carolina Malagelada; Fernando Azpiroz; Petia Radeva
	Title	Intestinal event segmentation for endoluminal video analysis			Type	Conference Article
	Year	2014	Publication	21st IEEE International Conference on Image Processing	Abbreviated Journal
	Volume		Issue		Pages	3592 - 3596
	Keywords
	Abstract
	Address	Paris; France; October 2014
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICIP
	Notes	MILAB; OR;MV			Approved	no
	Call Number	Admin @ si @ DVS2014			Serial	2565



	Author	Gabriel Villalonga; Sebastian Ramos; German Ros; David Vazquez; Antonio Lopez
	Title	3D Pedestrian Detection via Random Forest			Type	Miscellaneous
	Year	2014	Publication	European Conference on Computer Vision	Abbreviated Journal
	Volume		Issue		Pages	231-238
	Keywords	Pedestrian Detection
	Abstract	Our demo focuses on showing the extraordinary performance of our novel 3D pedestrian detector along with its simplicity and real-time capabilities. This detector has been designed for autonomous driving applications, but it can also be applied in other scenarios that cover both outdoor and indoor applications. Our pedestrian detector is based on the combination of a random forest classifier with HOG-LBP features and the inclusion of a preprocessing stage based on 3D scene information in order to precisely determine the image regions where the detector should search for pedestrians. This approach results in a highly accurate system that runs in real time, as required by many computer vision and robotics applications.
	Address	Zurich; Switzerland; September 2014
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ECCV-Demo
	Notes	ADAS; 600.076			Approved	no
	Call Number	Admin @ si @ VRR2014			Serial	2570



	Author	Antonio Clavelli
	Title	A computational model of eye guidance, searching for text in real scene images			Type	Book Whole
	Year	2014	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Searching for text objects in real scene images is an open problem and a very active computer vision research area. A large number of methods have been proposed that tackle text search as an extension of methods from the document analysis field or are inspired by general purpose object detection methods. However, the general problem of object search in real scene images remains extremely challenging due to the huge variability in object appearance. This thesis builds on the most recent findings in the visual attention literature, presenting a novel computational model of eye guidance that aims to better describe text object search in real scene images. First, the relevant state-of-the-art results from the visual attention literature regarding eye movements and visual search are presented. Relevant models of attention are discussed and integrated with recent observations on the role of top-down constraints and the emerging need for a layered model of attention in which saliency is not the only factor guiding attention. Visual attention is then explained by the interaction of several modulating factors, such as objects, value, plans and saliency. Then we introduce our probabilistic formulation of attention deployment in real scenes. The model is based on the rationale that oculomotor control depends on two interacting but distinct processes: an attentional process that assigns value to the sources of information and a motor process that flexibly links information with action. In such a framework, the choice of where to look next is task-dependent and oriented to classes of objects embedded within pictures of complex scenes. The dependence on task is taken into account by exploiting the value and the reward of gazing at certain image patches or proto-objects that provide a sparse representation of the scene objects. In the experimental section the model is tested under laboratory conditions, comparing model simulations with data from eye tracking experiments.
The comparison is qualitative in terms of observable scan paths and quantitative in terms of statistical similarity of gaze shift amplitude. Experiments are performed using eye tracking data both from a publicly available dataset of face and text and from newly performed eye-tracking experiments on a dataset of street view pictures containing text. The last part of this thesis is dedicated to studying the extent to which the proposed model can account for human eye movements in a low constrained setting. We used a mobile eye tracking device and an ad hoc methodology to compare model-simulated eye data with the human eye data from mobile eye tracking recordings. Such a setting allows the model to be tested in an incomplete visual information condition, reproducing a close to real-life search task.
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Dimosthenis Karatzas;Giuseppe Boccignone;Josep Llados
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-940902-6-4	Medium
	Area		Expedition		Conference
	Notes	DAG; 600.077			Approved	no
	Call Number	Admin @ si @ Cla2014			Serial	2571



	Author	Jon Almazan
	Title	Learning to Represent Handwritten Shapes and Words for Matching and Recognition			Type	Book Whole
	Year	2014	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Writing is one of the most important forms of communication and for centuries, handwriting has been the most reliable way to preserve knowledge. However, despite the recent development of printing houses and electronic devices, handwriting is still broadly used for taking notes, doing annotations, or sketching ideas. Transferring the ability to understand handwritten text or recognize handwritten shapes to computers has been the goal of many researchers due to its huge importance for many different fields. However, designing good representations to deal with handwritten shapes, e.g. symbols or words, is a very challenging problem due to the large variability of these kinds of shapes. One of the consequences of working with handwritten shapes is that we need representations to be robust, i.e., able to adapt to large intra-class variability. We need representations to be discriminative, i.e., able to learn the differences between classes. And we need representations to be efficient, i.e., able to be rapidly computed and compared. Unfortunately, current techniques of handwritten shape representation for matching and recognition do not fulfill some or all of these requirements. Throughout this thesis we focus on the problem of learning to represent handwritten shapes aimed at retrieval and recognition tasks. Concretely, in the first part of the thesis, we focus on the general problem of representing any kind of handwritten shape. We first present a novel shape descriptor based on a deformable grid that deals with large deformations by adapting to the shape and where the cells of the grid can be used to extract different features. Then, we propose to use this descriptor to learn statistical models, based on the Active Appearance Model, that jointly learn the variability in structure and texture of a given class.
Then, in the second part, we focus on a concrete application, the problem of representing handwritten words, for the tasks of word spotting, where the goal is to find all instances of a query word in a dataset of images, and recognition. First, we address the segmentation-free problem and propose an unsupervised, sliding-window-based approach that achieves state-of-the-art results on two public datasets. Second, we address the more challenging multi-writer problem, where the variability in words exponentially increases. We describe an approach in which both word images and text strings are embedded in a common vectorial subspace, and where those that represent the same word are close together. This is achieved by a combination of label embedding and attributes learning, and a common subspace regression. This leads to a low-dimensional, unified representation of word images and strings, resulting in a method that allows one to perform both image and text searches, as well as image transcription, in a unified framework. We evaluate our methods on different public datasets of both handwritten documents and natural images, showing results comparable to or better than the state of the art on spotting and recognition tasks.
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Ernest Valveny;Alicia Fornes
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.077			Approved	no
	Call Number	Admin @ si @ Alm2014			Serial	2572



	Author	David Fernandez
	Title	Contextual Word Spotting in Historical Handwritten Documents			Type	Book Whole
	Year	2014	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	There are countless collections of historical documents in archives and libraries that contain plenty of valuable information for historians and researchers. The extraction of this information has become a central task among Document Analysis researchers and practitioners. There is an increasing interest in digitally preserving and providing access to these kinds of documents. But digitization alone is not enough for researchers. The extraction and/or indexing of the information in these documents has attracted increasing interest among researchers. In many cases, and in particular in historical manuscripts, the full transcription of these documents is extremely difficult due to their inherent deficiencies: poor physical preservation, different writing styles, obsolete languages, etc. Word spotting has become a popular and efficient alternative to full transcription. It inherently involves a high level of degradation in the images. The search for words is holistically formulated as a visual search for a given query shape in a larger image, instead of recognising the input text and searching for the query word with an ASCII string comparison. But the performance of classical word spotting approaches depends on the degradation level of the images, being unacceptable in many cases. In this thesis we have proposed a novel paradigm called contextual word spotting that uses contextual/semantic information to achieve acceptable results where classical word spotting does not. The contextual word spotting framework proposed in this thesis is a segmentation-based word spotting approach, so an efficient word segmentation is needed. Historical handwritten documents present some common difficulties that can complicate the extraction of the words. We have proposed a line segmentation approach that formulates the problem as finding the central path in the area between two consecutive lines. This is solved as a graph traversal problem.
A path finding algorithm is used to find the optimal path in a previously computed graph between the text lines. Once the text lines are extracted, words are localized inside the text lines using a word segmentation technique from the state of the art. Classical word spotting approaches can be improved using the contextual information of the documents. We have introduced a new framework, oriented to handwritten documents that present a high degree of structure, to extract information making use of context. The framework is an efficient tool for semi-automatic transcription that uses the contextual information to achieve better results than classical word spotting approaches. The contextual information is automatically discovered by recognizing repetitive structures and categorizing all the words according to semantic classes. The most frequent words in each semantic cluster are extracted and the same text is used to transcribe all of them. The experimental results achieved in this thesis outperform classical word spotting approaches, demonstrating the suitability of the proposed ensemble architecture for spotting words in historical handwritten documents using contextual information.
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Josep Llados;Alicia Fornes
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-940902-7-1	Medium
	Area		Expedition		Conference
	Notes	DAG; 600.077			Approved	no
	Call Number	Admin @ si @ Fer2014			Serial	2573



	Author	Lluis Pere de las Heras
	Title	Relational Models for Visual Understanding of Graphical Documents. Application to Architectural Drawings.			Type	Book Whole
	Year	2014	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Graphical documents express complex concepts using a visual language. This language consists of a vocabulary (symbols) and a syntax (structural relations between symbols) that articulate a semantic meaning in a certain context. Therefore, the automatic interpretation by computers of this sort of document entails three main steps: the detection of the symbols, the extraction of the structural relations between these symbols, and the modeling of the knowledge that permits the extraction of the semantics. Different domains in graphical documents include: architectural and engineering drawings, maps, flowcharts, etc. Graphics Recognition in particular and Document Image Analysis in general were born from the industrial need to interpret a massive amount of digitized documents after the emergence of the scanner. Although many years have passed, the graphical document understanding problem still seems to be far from being solved. The main reason is that the vast majority of the systems in the literature focus on very specific problems, where the domain of the document dictates the implementation of the interpretation. As a result, it is difficult to reuse these strategies on different data and in different contexts, thus hindering the natural progress in the field. In this thesis, we face the graphical document understanding problem by proposing several relational models at different levels that are designed from a generic perspective. Firstly, we introduce three different strategies for the detection of symbols. The first method tackles the problem structurally, wherein general knowledge of the domain guides the detection. The second is a statistical method that learns the graphical appearance of the symbols and easily adapts to the big variability of the problem. The third method is a combination of the previous two methods that inherits their respective strengths, i.e. it copes with the big variability and does not need annotated data.
Secondly, we present two relational strategies that tackle the problem of visual context extraction. The first one is a fully bottom-up method that heuristically searches a graph representation for the contextual relations between symbols. In contrast, the second is a syntactic method that probabilistically models the structure of the documents. It automatically learns the model, which guides the inference algorithm to find the best structural representation for a given input. Finally, we construct a knowledge-based model consisting of an ontological definition of the domain and real data. This model makes it possible to perform contextual reasoning and to detect semantic inconsistencies within the data. We evaluate the suitability of the proposed contributions in the framework of floor plan interpretation. Since there is no standard in the modeling of these documents, there exists an enormous notation variability from plan to plan in terms of vocabulary and syntax. Therefore, floor plan interpretation is a relevant task in the graphical document understanding problem. It is also worth mentioning that we make freely available all the resources used in this thesis (the data, the tool used to generate the data, and the evaluation scripts) with the aim of fostering research in the graphical document understanding task.
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Gemma Sanchez
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-940902-8-8	Medium
	Area		Expedition		Conference
	Notes	DAG; 600.077			Approved	no
	Call Number	Admin @ si @ Her2014			Serial	2574