Publicacions CVC -- Query Results

[21–30] << 31 32 33 34 35 36 37 38 39 40 >> [41–50]

Details

Records
Author	Christophe Rigaud; Clement Guerin; Dimosthenis Karatzas; Jean-Christophe Burie; Jean-Marc Ogier
Title	Knowledge-driven understanding of images in comic books			Type	Journal Article
Year	2015	Publication	International Journal on Document Analysis and Recognition	Abbreviated Journal	IJDAR
Volume	18	Issue	3	Pages	199-221
Keywords	Document Understanding; comics analysis; expert system
Abstract	Document analysis is an active field of research, which can attain a complete understanding of the semantics of a given document. One example of the document understanding process is enabling a computer to identify the key elements of a comic book story and arrange them according to a predefined domain knowledge. In this study, we propose a knowledge-driven system that can interact with bottom-up and top-down information to progressively understand the content of a document. We model the comic book’s and the image processing domains knowledge for information consistency analysis. In addition, different image processing methods are improved or developed to extract panels, balloons, tails, texts, comic characters and their semantic relations in an unsupervised way.
Address
Corporate Author				Thesis
Publisher	Springer Berlin Heidelberg	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1433-2833	ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.056; 600.077			Approved	no
Call Number	RGK2015			Serial	2595
Permanent link to this record



Author	Volkmar Frinken; Andreas Fischer; Markus Baumgartner; Horst Bunke
Title	Keyword spotting for self-training of BLSTM NN based handwriting recognition systems			Type	Journal Article
Year	2014	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	47	Issue	3	Pages	1073-1082
Keywords	Document retrieval; Keyword spotting; Handwriting recognition; Neural networks; Semi-supervised learning
Abstract	The automatic transcription of unconstrained continuous handwritten text requires well trained recognition systems. The semi-supervised paradigm introduces the concept of not only using labeled data but also unlabeled data in the learning process. Unlabeled data can be gathered at little or not cost. Hence it has the potential to reduce the need for labeling training data, a tedious and costly process. Given a weak initial recognizer trained on labeled data, self-training can be used to recognize unlabeled data and add words that were recognized with high confidence to the training set for re-training. This process is not trivial and requires great care as far as selecting the elements that are to be added to the training set is concerned. In this paper, we propose to use a bidirectional long short-term memory neural network handwritten recognition system for keyword spotting in order to select new elements. A set of experiments shows the high potential of self-training for bootstrapping handwriting recognition systems, both for modern and historical handwritings, and demonstrate the benefits of using keyword spotting over previously published self-training schemes.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.077; 602.101			Approved	no
Call Number	Admin @ si @ FFB2014			Serial	2297
Permanent link to this record



Author	Ayan Banerjee; Sanket Biswas; Josep Llados; Umapada Pal
Title	SemiDocSeg: Harnessing Semi-Supervised Learning for Document Layout Analysis			Type	Journal Article
Year	2024	Publication	International Journal on Document Analysis and Recognition	Abbreviated Journal	IJDAR
Volume		Issue		Pages
Keywords	Document layout analysis; Semi-supervised learning; Co-Occurrence matrix; Instance segmentation; Swin transformer
Abstract	Document Layout Analysis (DLA) is the process of automatically identifying and categorizing the structural components (e.g. Text, Figure, Table, etc.) within a document to extract meaningful content and establish the page's layout structure. It is a crucial stage in document parsing, contributing to their comprehension. However, traditional DLA approaches often demand a significant volume of labeled training data, and the labor-intensive task of generating high-quality annotated training data poses a substantial challenge. In order to address this challenge, we proposed a semi-supervised setting that aims to perform learning on limited annotated categories by eliminating exhaustive and expensive mask annotations. The proposed setting is expected to be generalizable to novel categories as it learns the underlying positional information through a support set and class information through Co-Occurrence that can be generalized from annotated categories to novel categories. Here, we first extract features from the input image and support set with a shared multi-scale feature acquisition backbone. Then, the extracted feature representation is fed to the transformer encoder as a query. Later on, we utilize a semantic embedding network before the decoder to capture the underlying semantic relationships and similarities between different instances, enabling the model to make accurate predictions or classifications with only a limited amount of labeled data. Extensive experimentation on competitive benchmarks like PRIMA, DocLayNet, and Historical Japanese (HJ) demonstrate that this generalized setup obtains significant performance compared to the conventional supervised approach.
Address	June 2024
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG			Approved	no
Call Number	Admin @ si @ BBL2024a			Serial	4001
Permanent link to this record



Author	Marçal Rusiñol; Dimosthenis Karatzas; Josep Llados
Title	Automatic Verification of Properly Signed Multi-page Document Images			Type	Conference Article
Year	2015	Publication	Proceedings of the Eleventh International Symposium on Visual Computing	Abbreviated Journal
Volume	9475	Issue		Pages	327-336
Keywords	Document Image; Manual Inspection; Signature Verification; Rejection Criterion; Document Flow
Abstract	In this paper we present an industrial application for the automatic screening of incoming multi-page documents in a banking workflow aimed at determining whether these documents are properly signed or not. The proposed method is divided in three main steps. First individual pages are classified in order to identify the pages that should contain a signature. In a second step, we segment within those key pages the location where the signatures should appear. The last step checks whether the signatures are present or not. Our method is tested in a real large-scale environment and we report the results when checking two different types of real multi-page contracts, having in total more than 14,500 pages.
Address	Las Vegas, Nevada, USA; December 2015
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume	9475	Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ISVC
Notes	DAG; 600.077			Approved	no
Call Number	Admin @ si @			Serial	3189
Permanent link to this record



Author	Hongxing Gao; Marçal Rusiñol; Dimosthenis Karatzas; Josep Llados
Title	Fast Structural Matching for Document Image Retrieval through Spatial Databases			Type	Conference Article
Year	2014	Publication	Document Recognition and Retrieval XXI	Abbreviated Journal
Volume	9021	Issue		Pages
Keywords	Document image retrieval; distance transform; MSER; spatial database
Abstract	The structure of document images plays a signicant role in document analysis thus considerable eorts have been made towards extracting and understanding document structure, usually in the form of layout analysis approaches. In this paper, we rst employ Distance Transform based MSER (DTMSER) to eciently extract stable document structural elements in terms of a dendrogram of key-regions. Then a fast structural matching method is proposed to query the structure of document (dendrogram) based on a spatial database which facilitates the formulation of advanced spatial queries. The experiments demonstrate a signicant improvement in a document retrieval scenario when compared to the use of typical Bag of Words (BoW) and pyramidal BoW descriptors.
Address	Amsterdam; September 2014
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	SPIE-DRR
Notes	DAG; 600.056; 600.061; 600.077			Approved	no
Call Number	Admin @ si @ GRK2014a			Serial	2496
Permanent link to this record



Author	David Fernandez; Josep Llados; Alicia Fornes; R.Manmatha
Title	On Influence of Line Segmentation in Efficient Word Segmentation in Old Manuscripts			Type	Conference Article
Year	2012	Publication	13th International Conference on Frontiers in Handwriting Recognition	Abbreviated Journal
Volume		Issue		Pages	763-768
Keywords	document image processing;handwritten character recognition;history;image segmentation;Spanish document;historical document;line segmentation;old handwritten document;old manuscript;word segmentation;Bifurcation;Dynamic programming;Handwriting recognition;Image segmentation;Measurement;Noise;Skeleton;Segmentation;document analysis;document and text processing;handwriting analysis;heuristics;path-finding
Abstract	he objective of this work is to show the importance of a good line segmentation to obtain better results in the segmentation of words of historical documents. We have used the approach developed by Manmatha and Rothfeder [1] to segment words in old handwritten documents. In their work the lines of the documents are extracted using projections. In this work, we have developed an approach to segment lines more efficiently. The new line segmentation algorithm tackles with skewed, touching and noisy lines, so it is significantly improves word segmentation. Experiments using Spanish documents from the Marriages Database of the Barcelona Cathedral show that this approach reduces the error rate by more than 20%
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-1-4673-2262-1	Medium
Area		Expedition		Conference	ICFHR
Notes	DAG			Approved	no
Call Number	Admin @ si @ FLF2012			Serial	2200
Permanent link to this record



Author	Juan Ignacio Toledo; Sebastian Sudholt; Alicia Fornes; Jordi Cucurull; A. Fink; Josep Llados
Title	Handwritten Word Image Categorization with Convolutional Neural Networks and Spatial Pyramid Pooling			Type	Conference Article
Year	2016	Publication	Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR)	Abbreviated Journal
Volume	10029	Issue		Pages	543-552
Keywords	Document image analysis; Word image categorization; Convolutional neural networks; Named entity detection
Abstract	The extraction of relevant information from historical document collections is one of the key steps in order to make these documents available for access and searches. The usual approach combines transcription and grammars in order to extract semantically meaningful entities. In this paper, we describe a new method to obtain word categories directly from non-preprocessed handwritten word images. The method can be used to directly extract information, being an alternative to the transcription. Thus it can be used as a first step in any kind of syntactical analysis. The approach is based on Convolutional Neural Networks with a Spatial Pyramid Pooling layer to deal with the different shapes of the input images. We performed the experiments on a historical marriage record dataset, obtaining promising results.
Address	Merida; Mexico; December 2016
Corporate Author				Thesis
Publisher	Springer International Publishing	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN	978-3-319-49054-0	Medium
Area		Expedition		Conference	S+SSPR
Notes	DAG; 600.097; 602.006			Approved	no
Call Number	Admin @ si @ TSF2016			Serial	2877
Permanent link to this record



Author	Francisco Alvaro; Francisco Cruz; Joan Andreu Sanchez; Oriol Ramos Terrades; Jose Miguel Benedi
Title	Structure Detection and Segmentation of Documents Using 2D Stochastic Context-Free Grammars			Type	Journal Article
Year	2015	Publication	Neurocomputing	Abbreviated Journal	NEUCOM
Volume	150	Issue	A	Pages	147-154
Keywords	document image analysis; stochastic context-free grammars; text classication features
Abstract	In this paper we dene a bidimensional extension of Stochastic Context-Free Grammars for structure detection and segmentation of images of documents. Two sets of text classication features are used to perform an initial classication of each zone of the page. Then, the document segmentation is obtained as the most likely hypothesis according to a stochastic grammar. We used a dataset of historical marriage license books to validate this approach. We also tested several inference algorithms for Probabilistic Graphical Models and the results showed that the proposed grammatical model outperformed the other methods. Furthermore, grammars also provide the document structure along with its segmentation.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 601.158; 600.077; 600.061			Approved	no
Call Number	Admin @ si @ ACS2015			Serial	2531
Permanent link to this record



Author	Juan Ignacio Toledo; Manuel Carbonell; Alicia Fornes; Josep Llados
Title	Information Extraction from Historical Handwritten Document Images with a Context-aware Neural Model			Type	Journal Article
Year	2019	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	86	Issue		Pages	27-36
Keywords	Document image analysis; Handwritten documents; Named entity recognition; Deep neural networks
Abstract	Many historical manuscripts that hold trustworthy memories of the past societies contain information organized in a structured layout (e.g. census, birth or marriage records). The precious information stored in these documents cannot be effectively used nor accessed without costly annotation efforts. The transcription driven by the semantic categories of words is crucial for the subsequent access. In this paper we describe an approach to extract information from structured historical handwritten text images and build a knowledge representation for the extraction of meaning out of historical data. The method extracts information, such as named entities, without the need of an intermediate transcription step, thanks to the incorporation of context information through language models. Our system has two variants, the first one is based on bigrams, whereas the second one is based on recurrent neural networks. Concretely, our second architecture integrates a Convolutional Neural Network to model visual information from word images together with a Bidirecitonal Long Short Term Memory network to model the relation among the words. This integrated sequential approach is able to extract more information than just the semantic category (e.g. a semantic category can be associated to a person in a record). Our system is generic, it deals with out-of-vocabulary words by design, and it can be applied to structured handwritten texts from different domains. The method has been validated with the ICDAR IEHHR competition protocol, outperforming the existing approaches.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.097; 601.311; 603.057; 600.084; 600.140; 600.121			Approved	no
Call Number	Admin @ si @ TCF2019			Serial	3166
Permanent link to this record



Author	Juan Ignacio Toledo; Jordi Cucurull; Jordi Puiggali; Alicia Fornes; Josep Llados
Title	Document Analysis Techniques for Automatic Electoral Document Processing: A Survey			Type	Conference Article
Year	2015	Publication	E-Voting and Identity, Proceedings of 5th international conference, VoteID 2015	Abbreviated Journal
Volume		Issue		Pages	139-141
Keywords	Document image analysis; Computer vision; Paper ballots; Paper based elections; Optical scan; Tally
Abstract	In this paper, we will discuss the most common challenges in electoral document processing and study the different solutions from the document analysis community that can be applied in each case. We will cover Optical Mark Recognition techniques to detect voter selections in the Australian Ballot, handwritten number recognition for preferential elections and handwriting recognition for write-in areas. We will also propose some particular adjustments that can be made to those general techniques in the specific context of electoral documents.
Address	Bern; Switzerland; September 2015
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	VoteID
Notes	DAG; 600.061; 602.006; 600.077			Approved	no
Call Number	Admin @ si @ TCP2015			Serial	2641
Permanent link to this record



Author	Marçal Rusiñol; Agnes Borras; Josep Llados
Title	Relational Indexing of Vectorial Primitives for Symbol Spotting in Line-Drawing Images			Type	Journal Article
Year	2010	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
Volume	31	Issue	3	Pages	188–201
Keywords	Document image analysis and recognition, Graphics recognition, Symbol spotting ,Vectorial representations, Line-drawings
Abstract	This paper presents a symbol spotting approach for indexing by content a database of line-drawing images. As line-drawings are digital-born documents designed by vectorial softwares, instead of using a pixel-based approach, we present a spotting method based on vector primitives. Graphical symbols are represented by a set of vectorial primitives which are described by an off-the-shelf shape descriptor. A relational indexing strategy aims to retrieve symbol locations into the target documents by using a combined numerical-relational description of 2D structures. The zones which are likely to contain the queried symbol are validated by a Hough-like voting scheme. In addition, a performance evaluation framework for symbol spotting in graphical documents is proposed. The presented methodology has been evaluated with a benchmarking set of architectural documents achieving good performance results.
Address
Corporate Author				Thesis
Publisher	Elsevier	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG			Approved	no
Call Number	DAG @ dag @ RBL2010			Serial	1177
Permanent link to this record



Author	Thanh Ha Do; Oriol Ramos Terrades; Salvatore Tabbone
Title	DSD: document sparse-based denoising algorithm			Type	Journal Article
Year	2019	Publication	Pattern Analysis and Applications	Abbreviated Journal	PAA
Volume	22	Issue	1	Pages	177–186
Keywords	Document denoising; Sparse representations; Sparse dictionary learning; Document degradation models
Abstract	In this paper, we present a sparse-based denoising algorithm for scanned documents. This method can be applied to any kind of scanned documents with satisfactory results. Unlike other approaches, the proposed approach encodes noise documents through sparse representation and visual dictionary learning techniques without any prior noise model. Moreover, we propose a precision parameter estimator. Experiments on several datasets demonstrate the robustness of the proposed approach compared to the state-of-the-art methods on document denoising.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.097; 600.140; 600.121			Approved	no
Call Number	Admin @ si @ DRT2019			Serial	3254
Permanent link to this record



Author	Ruben Tito; Dimosthenis Karatzas; Ernest Valveny
Title	Document Collection Visual Question Answering			Type	Conference Article
Year	2021	Publication	16th International Conference on Document Analysis and Recognition	Abbreviated Journal
Volume	12822	Issue		Pages	778-792
Keywords	Document collection; Visual Question Answering
Abstract	Current tasks and methods in Document Understanding aims to process documents as single elements. However, documents are usually organized in collections (historical records, purchase invoices), that provide context useful for their interpretation. To address this problem, we introduce Document Collection Visual Question Answering (DocCVQA) a new dataset and related task, where questions are posed over a whole collection of document images and the goal is not only to provide the answer to the given question, but also to retrieve the set of documents that contain the information needed to infer the answer. Along with the dataset we propose a new evaluation metric and baselines which provide further insights to the new dataset and task.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICDAR
Notes	DAG; 600.121			Approved	no
Call Number	Admin @ si @ TKV2021			Serial	3622
Permanent link to this record



Author	Francisco Cruz; Oriol Ramos Terrades
Title	A probabilistic framework for handwritten text line segmentation			Type	Miscellaneous
Year	2018	Publication	Arxiv	Abbreviated Journal
Volume		Issue		Pages
Keywords	Document Analysis; Text Line Segmentation; EM algorithm; Probabilistic Graphical Models; Parameter Learning
Abstract	We successfully combine Expectation-Maximization algorithm and variational approaches for parameter learning and computing inference on Markov random fields. This is a general method that can be applied to many computer vision tasks. In this paper, we apply it to handwritten text line segmentation. We conduct several experiments that demonstrate that our method deal with common issues of this task, such as complex document layout or non-latin scripts. The obtained results prove that our method achieve state-of-theart performance on different benchmark datasets without any particular fine tuning step.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.097; 600.121			Approved	no
Call Number	Admin @ si @ CrR2018			Serial	3253
Permanent link to this record



Author	M. Visani; Oriol Ramos Terrades; Salvatore Tabbone
Title	A Protocol to Characterize the Descriptive Power and the Complementarity of Shape Descriptors			Type	Journal Article
Year	2011	Publication	International Journal on Document Analysis and Recognition	Abbreviated Journal	IJDAR
Volume	14	Issue	1	Pages	87-100
Keywords	Document analysis; Shape descriptors; Symbol description; Performance characterization; Complementarity analysis
Abstract	Most document analysis applications rely on the extraction of shape descriptors, which may be grouped into different categories, each category having its own advantages and drawbacks (O.R. Terrades et al. in Proceedings of ICDAR’07, pp. 227–231, 2007). In order to improve the richness of their description, many authors choose to combine multiple descriptors. Yet, most of the authors who propose a new descriptor content themselves with comparing its performance to the performance of a set of single state-of-the-art descriptors in a specific applicative context (e.g. symbol recognition, symbol spotting...). This results in a proliferation of the shape descriptors proposed in the literature. In this article, we propose an innovative protocol, the originality of which is to be as independent of the final application as possible and which relies on new quantitative and qualitative measures. We introduce two types of measures: while the measures of the first type are intended to characterize the descriptive power (in terms of uniqueness, distinctiveness and robustness towards noise) of a descriptor, the second type of measures characterizes the complementarity between multiple descriptors. Characterizing upstream the complementarity of shape descriptors is an alternative to the usual approach where the descriptors to be combined are selected by trial and error, considering the performance characteristics of the overall system. To illustrate the contribution of this protocol, we performed experimental studies using a set of descriptors and a set of symbols which are widely used by the community namely ART and SC descriptors and the GREC 2003 database.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; IF 1.091			Approved	no
Call Number	Admin @ si @VRT2011			Serial	1856
Permanent link to this record