Publicacions CVC -- Query Results


	Records
	Author	Alicia Fornes; Veronica Romero; Arnau Baro; Juan Ignacio Toledo; Joan Andreu Sanchez; Enrique Vidal; Josep Llados
	Title	ICDAR2017 Competition on Information Extraction in Historical Handwritten Records			Type	Conference Article
	Year	2017	Publication	14th International Conference on Document Analysis and Recognition	Abbreviated Journal
	Volume		Issue		Pages	1389-1394
	Keywords
	Abstract	The extraction of relevant information from historical handwritten document collections is one of the key steps in making these manuscripts available for access and search. In this competition, the goal is to detect the named entities and assign each of them a semantic category, thereby simulating the filling in of a knowledge database. This paper describes the dataset, the tasks, the evaluation metrics, the participants' methods and the results.
	Address	Kyoto; Japan; November 2017
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICDAR
	Notes	DAG; 600.097; 601.225; 600.121			Approved	no
	Call Number	Admin @ si @ FRB2017			Serial	3052



	Author	David Fernandez; Pau Riba; Alicia Fornes; Josep Llados
	Title	On the Influence of Key Point Encoding for Handwritten Word Spotting			Type	Conference Article
	Year	2014	Publication	14th International Conference on Frontiers in Handwriting Recognition	Abbreviated Journal
	Volume		Issue		Pages	476 - 481
	Keywords	Local descriptors; Interest points; Handwritten documents; Word spotting; Historical document analysis
	Abstract	In this paper we evaluate the influence of the selection of key points and the associated features on the performance of word spotting processes. In general, features can be extracted from a number of characteristic points like corners, contours, skeletons, maxima, minima, crossings, etc. A number of descriptors exist in the literature using different interest point detectors. However, the intrinsic variability of handwriting strongly affects performance if the interest points are not stable enough. In this paper, we analyze the performance of different descriptors for local interest points. As a benchmarking dataset we have used the Barcelona Marriage Database, which contains handwritten records of marriages over five centuries.
	Address	Crete Island; Greece; September 2014
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	2167-6445	ISBN	978-1-4799-4335-7	Medium
	Area		Expedition		Conference	ICFHR
	Notes	DAG; 600.056; 600.061; 602.006; 600.077			Approved	no
	Call Number	Admin @ si @ FRF2014			Serial	2460



	Author	Andreas Fischer; Ching Y. Suen; Volkmar Frinken; Kaspar Riesen; Horst Bunke
	Title	A Fast Matching Algorithm for Graph-Based Handwriting Recognition			Type	Conference Article
	Year	2013	Publication	9th IAPR – TC15 Workshop on Graph-based Representation in Pattern Recognition	Abbreviated Journal
	Volume	7877	Issue		Pages	194-203
	Keywords
	Abstract	The recognition of unconstrained handwriting images is usually based on vectorial representation and statistical classification. Despite their high representational power, graphs are rarely used in this field due to a lack of efficient graph-based recognition methods. Recently, graph similarity features have been proposed to bridge the gap between structural representation and statistical classification by means of vector space embedding. This approach has shown a high performance in terms of accuracy but had shortcomings in terms of computational speed. The time complexity of the Hungarian algorithm that is used to approximate the edit distance between two handwriting graphs is demanding for a real-world scenario. In this paper, we propose a faster graph matching algorithm which is derived from the Hausdorff distance. On the historical Parzival database it is demonstrated that the proposed method achieves a speedup factor of 12.9 without significant loss in recognition accuracy.
	Address	Vienna; Austria; May 2013
	Corporate Author				Thesis
	Publisher	Springer Berlin Heidelberg	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN	0302-9743	ISBN	978-3-642-38220-8	Medium
	Area		Expedition		Conference	GBR
	Notes	DAG; 600.045; 605.203			Approved	no
	Call Number	Admin @ si @ FSF2013			Serial	2294



	Author	Volkmar Frinken; Francisco Zamora; Salvador España; Maria Jose Castro; Andreas Fischer; Horst Bunke
	Title	Long-Short Term Memory Neural Networks Language Modeling for Handwriting Recognition			Type	Conference Article
	Year	2012	Publication	21st International Conference on Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages	701-704
	Keywords
	Abstract	Unconstrained handwritten text recognition systems maximize the combination of two separate probability scores. The first one is the observation probability that indicates how well the returned word sequence matches the input image. The second score is the probability that reflects how likely a word sequence is according to a language model. Current state-of-the-art recognition systems use statistical language models in the form of bigram word probabilities. This paper proposes to model the target language by means of a recurrent neural network with long-short term memory cells. Because the network is recurrent, the considered context is not limited to a fixed size, especially as the memory cells are designed to deal with long-term dependencies. In a set of experiments conducted on the IAM off-line database we show the superiority of the proposed language model over statistical n-gram models.
	Address	Tsukuba Science City, Japan
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1051-4651	ISBN	978-1-4673-2216-4	Medium
	Area		Expedition		Conference	ICPR
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ FZE2012			Serial	2052



	Author	Hongxing Gao
	Title	Focused Structural Document Image Retrieval in Digital Mailroom Applications			Type	Book Whole
	Year	2015	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	In this work, we develop a generic framework that is able to handle the document retrieval problem in various scenarios such as searching for full page matches or retrieving the counterparts for specific document areas, focusing on their structural similarity or letting their visual resemblance play a dominant role. Based on the spatial indexing technique, we propose to search for matches of local key-region pairs carrying both structural and visual information from the collection, and we present a scheme that allows adjusting the relative contribution of structural and visual similarity. Based on the fact that the structure of documents is tightly linked with the distance among their elements, we first introduce an efficient detector named Distance Transform based Maximally Stable Extremal Regions (DTMSER). We illustrate that this detector is able to efficiently extract the structure of a document image as a dendrogram (hierarchical tree) of multi-scale key-regions that roughly correspond to letters, words and paragraphs. We demonstrate that, without benefiting from the structure information, the key-regions extracted by the DTMSER algorithm achieve better results compared with state-of-the-art methods while employing far fewer key-regions. We subsequently propose a pair-wise Bag of Words (BoW) framework to efficiently embed the explicit structure extracted by the DTMSER algorithm. We represent each document as a list of key-region pairs that correspond to the edges in the dendrogram where the inclusion relationship is encoded. By employing those structural key-region pairs as the pooling elements for generating the histogram of features, the proposed method is able to encode the explicit inclusion relations into a BoW representation. The experimental results illustrate that the pair-wise BoW, powered by the embedded structural information, achieves remarkable improvement over the conventional BoW and spatial pyramidal BoW methods.
To handle various retrieval scenarios in one framework, we propose to directly query a series of key-region pairs, carrying both structure and visual information, from the collection. We introduce spatial indexing techniques to the document retrieval community to speed up the structural relationship computation for key-region pairs. We first test the proposed framework in a full page retrieval scenario where structurally similar matches are expected. In this case, the pair-wise querying method achieves notable improvement over the BoW and spatial pyramidal BoW frameworks. Furthermore, we illustrate that the proposed method is also able to handle focused retrieval situations where the queries are defined as specific partial areas of interest in the images. We examine our method on two types of focused queries: structure-focused and exact queries. The experimental results show that the proposed generic framework obtains nearly perfect precision on both types of focused queries, while it is the first framework able to tackle structure-focused queries, setting a new state of the art in the field. In addition, we introduce a line verification method to check the spatial consistency among the matched key-region pairs. We propose a computationally efficient version of line verification through a two-step implementation. We first compute tentative localizations of the query and subsequently employ them to divide the matched key-region pairs into several groups; line verification is then performed within each group while more precise bounding boxes are computed. We demonstrate that, compared with the standard approach (based on RANSAC), the proposed line verification generally achieves much higher recall with a slight loss of precision on specific queries.
	Address	January 2015
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Josep Llados;Dimosthenis Karatzas;Marçal Rusiñol
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-943427-0-7	Medium
	Area		Expedition		Conference
	Notes	DAG; 600.077			Approved	no
	Call Number	Admin @ si @ Gao2015			Serial	2577



	Author	Andrea Gemelli; Sanket Biswas; Enrico Civitelli; Josep Llados; Simone Marinai
	Title	Doc2Graph: A Task Agnostic Document Understanding Framework Based on Graph Neural Networks			Type	Conference Article
	Year	2022	Publication	17th European Conference on Computer Vision Workshops	Abbreviated Journal
	Volume	13804	Issue		Pages	329–344
	Keywords
	Abstract	Geometric Deep Learning has recently attracted significant interest in a wide range of machine learning fields, including document analysis. The application of Graph Neural Networks (GNNs) has become crucial in various document-related tasks since they can unravel important structural patterns, fundamental in key information extraction processes. Previous works in the literature propose task-driven models and do not take into account the full power of graphs. We propose Doc2Graph, a task-agnostic document understanding framework based on a GNN model, to solve different tasks given different types of documents. We evaluated our approach on two challenging datasets for key information extraction in form understanding, invoice layout analysis and table detection.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-3-031-25068-2	Medium
	Area		Expedition		Conference	ECCV-TiE
	Notes	DAG; 600.162; 600.140; 110.312			Approved	no
	Call Number	Admin @ si @ GBC2022			Serial	3795



	Author	Lluis Gomez; Dena Bazazian; Dimosthenis Karatzas
	Title	Historical review of scene text detection research			Type	Book Chapter
	Year	2020	Publication	Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher	Springer	Place of Publication		Editor	K. Alahari; C.V. Jawahar
	Language		Summary Language		Original Title
	Series Editor		Series Title	Series on Advances in Computer Vision and Pattern Recognition	Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.121			Approved	no
	Call Number	Admin @ si @ GBK2020			Serial	3495



	Author	Leonardo Galteri; Dena Bazazian; Lorenzo Seidenari; Marco Bertini; Andrew Bagdanov; Anguelos Nicolaou; Dimosthenis Karatzas; Alberto del Bimbo
	Title	Reading Text in the Wild from Compressed Images			Type	Conference Article
	Year	2017	Publication	1st International workshop on Egocentric Perception, Interaction and Computing	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Reading text in the wild is gaining attention in the computer vision community. Images captured in the wild are almost always compressed to varying degrees, depending on application context, and this compression introduces artifacts that distort image content into the captured images. In this paper we investigate the impact these compression artifacts have on text localization and recognition in the wild. We also propose a deep Convolutional Neural Network (CNN) that can eliminate text-specific compression artifacts and which leads to an improvement in text recognition. Experimental results on the ICDAR-Challenge4 dataset demonstrate that compression artifacts have a significant impact on text localization and recognition and that our approach yields an improvement in both – especially at high compression rates.
	Address	Venice; Italy; October 2017
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICCV - EPIC
	Notes	DAG; 600.084; 600.121			Approved	no
	Call Number	Admin @ si @ GBS2017			Serial	3006



	Author	Giuseppe De Gregorio; Sanket Biswas; Mohamed Ali Souibgui; Asma Bensalah; Josep Llados; Alicia Fornes; Angelo Marcelli
	Title	A Few Shot Multi-representation Approach for N-Gram Spotting in Historical Manuscripts			Type	Conference Article
	Year	2022	Publication	Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition (ICFHR2022)	Abbreviated Journal
	Volume	13639	Issue		Pages	3-12
	Keywords	N-gram spotting; Few-shot learning; Multimodal understanding; Historical handwritten collections
	Abstract	Despite recent advances in automatic text recognition, the performance remains moderate when it comes to historical manuscripts. This is mainly because of the scarcity of available labelled data to train the data-hungry Handwritten Text Recognition (HTR) models. The Keyword Spotting System (KWS) provides a valid alternative to HTR due to the reduction in error rate, but it is usually limited to a closed reference vocabulary. In this paper, we propose a few-shot learning paradigm for spotting sequences of a few characters (N-gram) that requires a small amount of labelled training data. We show that recognition of important n-grams could reduce the system’s dependency on vocabulary. In this case, an out-of-vocabulary (OOV) word in an input handwritten line image could be a sequence of n-grams that belong to the lexicon. An extensive experimental evaluation of our proposed multi-representation approach was carried out on a subset of Bentham’s historical manuscript collections, obtaining promising results in this direction.
	Address	December 04 – 07, 2022; Hyderabad, India
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICFHR
	Notes	DAG; 600.121; 600.162; 602.230; 600.140			Approved	no
	Call Number	Admin @ si @ GBS2022			Serial	3733



	Author	Lluis Gomez; Ali Furkan Biten; Ruben Tito; Andres Mafla; Marçal Rusiñol; Ernest Valveny; Dimosthenis Karatzas
	Title	Multimodal grid features and cell pointers for scene text visual question answering			Type	Journal Article
	Year	2021	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
	Volume	150	Issue		Pages	242-249
	Keywords
	Abstract	This paper presents a new model for the task of scene text visual question answering. In this task, questions about a given image can only be answered by reading and understanding scene text. Current state-of-the-art models for this task make use of a dual attention mechanism in which one attention module attends to visual features while the other attends to textual features. A possible issue with this is that it makes it difficult for the model to reason jointly about both modalities. To fix this problem we propose a new model that is based on a single attention mechanism that attends to multi-modal features conditioned on the question. The output weights of this attention module over a grid of multi-modal spatial features are interpreted as the probability that a certain spatial location of the image contains the answer text to the given question. Our experiments demonstrate competitive performance on two standard datasets with a model that is faster than previous methods at inference time. Furthermore, we also provide a novel analysis of the ST-VQA dataset based on a human performance study. Supplementary material, code, and data is made available through this link.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.084; 600.121			Approved	no
	Call Number	Admin @ si @ GBT2021			Serial	3620
