Publicacions CVC -- Query Results

[41–50] << 51 52 53 54 55 56 57 58 59 60 >> [61–70]

Details

	Records
	Author	David Aldavert; Marçal Rusiñol; Ricardo Toledo
	Title	Automatic Static/Variable Content Separation in Administrative Document Images			Type	Conference Article
	Year	2017	Publication	14th International Conference on Document Analysis and Recognition	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	In this paper we present an automatic method for separating static and variable content from administrative document images. An alignment approach is able to unsupervisedly build probabilistic templates from a set of examples of the same document kind. Such templates define which is the likelihood of every pixel of being either static or variable content. In the extraction step, the same alignment technique is used to match an incoming image with the template and to locate the positions where variable fields appear. We validate our approach on the public NIST Structured Tax Forms Dataset.
	Address	Kyoto; Japan; November 2017
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICDAR
	Notes	DAG; 600.084; 600.121			Approved	no
	Call Number	Admin @ si @ ART2017			Serial	3001
Permanent link to this record



	Author	David Aldavert; Marçal Rusiñol
	Title	Manuscript text line detection and segmentation using second-order derivatives analysis			Type	Conference Article
	Year	2018	Publication	13th IAPR International Workshop on Document Analysis Systems	Abbreviated Journal
	Volume		Issue		Pages	293 - 298
	Keywords	text line detection; text line segmentation; text region detection; second-order derivatives
	Abstract	In this paper, we explore the use of second-order derivatives to detect text lines on handwritten document images. Taking advantage that the second derivative gives a minimum response when a dark linear element over a bright background has the same orientation as the filter, we use this operator to create a map with the local orientation and strength of putative text lines in the document. Then, we detect line segments by selecting and merging the filter responses that have a similar orientation and scale. Finally, text lines are found by merging the segments that are within the same text region. The proposed segmentation algorithm, is learning-free while showing a performance similar to the state of the art methods in publicly available datasets.
	Address	Viena; Austria; April 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	DAS
	Notes	DAG; 600.084; 600.129; 302.065; 600.121			Approved	no
	Call Number	Admin @ si @ AlR2018a			Serial	3104
Permanent link to this record



	Author	David Aldavert; Marçal Rusiñol
	Title	Synthetically generated semantic codebook for Bag-of-Visual-Words based word spotting			Type	Conference Article
	Year	2018	Publication	13th IAPR International Workshop on Document Analysis Systems	Abbreviated Journal
	Volume		Issue		Pages	223 - 228
	Keywords	Word Spotting; Bag of Visual Words; Synthetic Codebook; Semantic Information
	Abstract	Word-spotting methods based on the Bag-ofVisual-Words framework have demonstrated a good retrieval performance even when used in a completely unsupervised manner. Although unsupervised approaches are suitable for large document collections due to the cost of acquiring labeled data, these methods also present some drawbacks. For instance, having to train a suitable “codebook” for a certain dataset has a high computational cost. Therefore, in this paper we present a database agnostic codebook which is trained from synthetic data. The aim of the proposed approach is to generate a codebook where the only information required is the type of script used in the document. The use of synthetic data also allows to easily incorporate semantic information in the codebook generation. So, the proposed method is able to determine which set of codewords have a semantic representation of the descriptor feature space. Experimental results show that the resulting codebook attains a state-of-the-art performance while having a more compact representation.
	Address	Viena; Austria; April 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	DAS
	Notes	DAG; 600.084; 600.129; 600.121			Approved	no
	Call Number	Admin @ si @ AlR2018b			Serial	3105
Permanent link to this record



	Author	David Aldavert
	Title	Efficient and Scalable Handwritten Word Spotting on Historical Documents using Bag of Visual Words			Type	Book Whole
	Year	2021	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Word spotting can be defined as the pattern recognition tasked aimed at locating and retrieving a specific keyword within a document image collection without explicitly transcribing the whole corpus. Its use is particularly interesting when applied in scenarios where Optical Character Recognition performs poorly or can not be used at all. This thesis focuses on such a scenario, word spotting on historical handwritten documents that have been written by a single author or by multiple authors with a similar calligraphy. This problem requires a visual signature that is robust to image artifacts, flexible to accommodate script variations and efficient to retrieve information in a rapid manner. For this, we have developed a set of word spotting methods that on their foundation use the well known Bag-of-Visual-Words (BoVW) representation. This representation has gained popularity among the document image analysis community to characterize handwritten words in an unsupervised manner. However, most approaches on this field rely on a basic BoVW configuration and disregard complex encoding and spatial representations. We determine which BoVW configurations provide the best performance boost to a spotting system. Then, we extend the segmentation-based word spotting, where word candidates are given a priori, to segmentation-free spotting. The proposed approach seeds the document images with overlapping word location candidates and characterizes them with a BoVW signature. Retrieval is achieved comparing the query and candidate signatures and returning the locations that provide a higher consensus. This is a simple but powerful approach that requires a more compact signature than in a segmentation-based scenario. We first project the BoVW signature into a reduced semantic topics space and then compress it further using Product Quantizers. The resulting signature only requires a few dozen bytes, allowing us to index thousands of pages on a common desktop computer. The final system still yields a performance comparable to the state-of-the-art despite all the information loss during the compression phases. Afterwards, we also study how to combine different modalities of information in order to create a query-by-X spotting system where, words are indexed using an information modality and queries are retrieved using another. We consider three different information modalities: visual, textual and audio. Our proposal is to create a latent feature space where features which are semantically related are projected onto the same topics. Creating thus a new feature space where information from different modalities can be compared. Later, we consider the codebook generation and descriptor encoding problem. The codebooks used to encode the BoVW signatures are usually created using an unsupervised clustering algorithm and, they require to test multiple parameters to determine which configuration is best for a certain document collection. We propose a semantic clustering algorithm which allows to estimate the best parameter from data. Since gather annotated data is costly, we use synthetically generated word images. The resulting codebook is database agnostic, i. e. a codebook that yields a good performance on document collections that use the same script. We also propose the use of an additional codebook to approximate descriptors and reduce the descriptor encoding complexity to sub-linear. Finally, we focus on the problem of signatures dimensionality. We propose a new symbol probability signature where each bin represents the probability that a certain symbol is present a certain location of the word image. This signature is extremely compact and combined with compression techniques can represent word images with just a few bytes per signature.
	Address	April 2021
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Marçal Rusiñol;Josep Llados
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-122714-5-4	Medium
	Area		Expedition		Conference
	Notes	DAG; 600.121			Approved	no
	Call Number	Admin @ si @ Ald2021			Serial	3601
Permanent link to this record



	Author	D. Perez; L. Tarazon; N. Serrano; F.M. Castro; Oriol Ramos Terrades; A. Juan
	Title	The GERMANA Database			Type	Conference Article
	Year	2009	Publication	10th International Conference on Document Analysis and Recognition	Abbreviated Journal
	Volume		Issue		Pages	301-305
	Keywords
	Abstract	A new handwritten text database, GERMANA, is presented to facilitate empirical comparison of different approaches to text line extraction and off-line handwriting recognition. GERMANA is the result of digitising and annotating a 764-page Spanish manuscript from 1891, in which most pages only contain nearly calligraphed text written on ruled sheets of well-separated lines. To our knowledge, it is the first publicly available database for handwriting research, mostly written in Spanish and comparable in size to standard databases. Due to its sequential book structure, it is also well-suited for realistic assessment of interactive handwriting recognition systems. To provide baseline results for reference in future studies, empirical results are also reported, using standard techniques and tools for preprocessing, feature extraction, HMM-based image modelling, and language modelling.
	Address	Barcelona; Spain
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1520-5363	ISBN	978-1-4244-4500-4	Medium
	Area		Expedition		Conference	ICDAR
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ PTS2009			Serial	1870
Permanent link to this record



	Author	Clement Guerin; Christophe Rigaud; Karell Bertet; Jean-Christophe Burie; Arnaud Revel ; Jean-Marc Ogier
	Title	Réduction de l’espace de recherche pour les personnages de bandes dessinées			Type	Conference Article
	Year	2014	Publication	19th National Congress Reconnaissance de Formes et l'Intelligence Artificielle	Abbreviated Journal
	Volume		Issue		Pages
	Keywords	contextual search; document analysis; comics characters
	Abstract	Les bandes dessinées représentent un patrimoine culturel important dans de nombreux pays et leur numérisation massive offre la possibilité d'effectuer des recherches dans le contenu des images. À ce jour, ce sont principalement les structures des pages et leurs contenus textuels qui ont été étudiés, peu de travaux portent sur le contenu graphique. Nous proposons de nous appuyer sur des éléments déjà étudiés tels que la position des cases et des bulles, pour réduire l'espace de recherche et localiser les personnages en fonction de la queue des bulles. L'évaluation de nos différentes contributions à partir de la base eBDtheque montre un taux de détection des queues de bulle de 81.2%, de localisation des personnages allant jusqu'à 85% et un gain d'espace de recherche de plus de 50%.
	Address	Rouen; Francia; July 2014
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	RFIA
	Notes	DAG; 600.077			Approved	no
	Call Number	Admin @ si @ GRB2014			Serial	2480
Permanent link to this record



	Author	ChunYang; Xu Cheng Yin; Hong Yu; Dimosthenis Karatzas; Yu Cao
	Title	ICDAR2017 Robust Reading Challenge on Text Extraction from Biomedical Literature Figures (DeTEXT)			Type	Conference Article
	Year	2017	Publication	14th International Conference on Document Analysis and Recognition	Abbreviated Journal
	Volume		Issue		Pages	1444-1447
	Keywords
	Abstract	Hundreds of millions of figures are available in the biomedical literature, representing important biomedical experimental evidence. Since text is a rich source of information in figures, automatically extracting such text may assist in the task of mining figure information and understanding biomedical documents. Unlike images in the open domain, biomedical figures present a variety of unique challenges. For example, biomedical figures typically have complex layouts, small font sizes, short text, specific text, complex symbols and irregular text arrangements. This paper presents the final results of the ICDAR 2017 Competition on Text Extraction from Biomedical Literature Figures (ICDAR2017 DeTEXT Competition), which aims at extracting (detecting and recognizing) text from biomedical literature figures. Similar to text extraction from scene images and web pictures, ICDAR2017 DeTEXT Competition includes three major tasks, i.e., text detection, cropped word recognition and end-to-end text recognition. Here, we describe in detail the data set, tasks, evaluation protocols and participants of this competition, and report the performance of the participating methods.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-1-5386-3586-5	Medium
	Area		Expedition		Conference	ICDAR
	Notes	DAG; 600.121			Approved	no
	Call Number	Admin @ si @ YCY2017			Serial	3098
Permanent link to this record



	Author	Christophe Rigaud; Dimosthenis Karatzas; Joost Van de Weijer; Jean-Christophe Burie; Jean-Marc Ogier
	Title	Automatic text localisation in scanned comic books			Type	Conference Article
	Year	2013	Publication	Proceedings of the International Conference on Computer Vision Theory and Applications	Abbreviated Journal
	Volume		Issue		Pages	814-819
	Keywords	Text localization; comics; text/graphic separation; complex background; unstructured document
	Abstract	Comic books constitute an important cultural heritage asset in many countries. Digitization combined with subsequent document understanding enable direct content-based search as opposed to metadata only search (e.g. album title or author name). Few studies have been done in this direction. In this work we detail a novel approach for the automatic text localization in scanned comics book pages, an essential step towards a fully automatic comics book understanding. We focus on speech text as it is semantically important and represents the majority of the text present in comics. The approach is compared with existing methods of text localization found in the literature and results are presented.
	Address	Barcelona; February 2013
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	VISAPP
	Notes	DAG; CIC; 600.056			Approved	no
	Call Number	Admin @ si @ RKW2013b			Serial	2261
Permanent link to this record



	Author	Christophe Rigaud; Dimosthenis Karatzas; Joost Van de Weijer; Jean-Christophe Burie; Jean-Marc Ogier
	Title	An active contour model for speech balloon detection in comics			Type	Conference Article
	Year	2013	Publication	12th International Conference on Document Analysis and Recognition	Abbreviated Journal
	Volume		Issue		Pages	1240-1244
	Keywords
	Abstract	Comic books constitute an important cultural heritage asset in many countries. Digitization combined with subsequent comic book understanding would enable a variety of new applications, including content-based retrieval and content retargeting. Document understanding in this domain is challenging as comics are semi-structured documents, combining semantically important graphical and textual parts. Few studies have been done in this direction. In this work we detail a novel approach for closed and non-closed speech balloon localization in scanned comic book pages, an essential step towards a fully automatic comic book understanding. The approach is compared with existing methods for closed balloon localization found in the literature and results are presented.
	Address	washington; USA; August 2013
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1520-5363	ISBN		Medium
	Area		Expedition		Conference	ICDAR
	Notes	DAG; CIC; 600.056			Approved	no
	Call Number	Admin @ si @ RKW2013a			Serial	2260
Permanent link to this record



	Author	Christophe Rigaud; Dimosthenis Karatzas; Jean-Christophe Burie; Jean-Marc Ogier
	Title	Speech balloon contour classification in comics			Type	Conference Article
	Year	2013	Publication	10th IAPR International Workshop on Graphics Recognition	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Comic books digitization combined with subsequent comic book understanding create a variety of new applications, including mobile reading and data mining. Document understanding in this domain is challenging as comics are semi-structured documents, combining semantically important graphical and textual parts. In this work we detail a novel approach for classifying speech balloon in scanned comics book pages based on their contour time series.
	Address	Bethlehem; PA; USA; August 2013
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	GREC
	Notes	DAG; 600.056			Approved	no
	Call Number	Admin @ si @ RKB2013			Serial	2429
Permanent link to this record

Select All Deselect All

[41–50] << 51 52 53 54 55 56 57 58 59 60 >> [61–70]

List View

Citations

Details

All Found Records Selected Records:

Save Citations: Format:

Export Records: Format: