Author David Aldavert; Marçal Rusiñol; Ricardo Toledo
  Title Automatic Static/Variable Content Separation in Administrative Document Images Type Conference Article
  Year 2017 Publication 14th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract In this paper we present an automatic method for separating static and variable content in administrative document images. An alignment approach builds probabilistic templates from a set of examples of the same document kind in an unsupervised manner. Such templates define the likelihood of every pixel being either static or variable content. In the extraction step, the same alignment technique is used to match an incoming image with the template and to locate the positions where variable fields appear. We validate our approach on the public NIST Structured Tax Forms Dataset.  
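As a rough illustration of the pixel-wise probabilistic template the abstract describes (a minimal sketch, not the authors' method; the alignment step is assumed already done, and the 0.9 threshold is invented):

```python
import numpy as np

def build_template(aligned_binary_pages):
    """Estimate, per pixel, the probability of ink across pre-aligned
    binarized pages (1 = ink, 0 = background)."""
    stack = np.stack(aligned_binary_pages).astype(np.float32)
    # Pixels inked in (almost) every example are likely static layout;
    # pixels inked only occasionally are likely variable field content.
    return stack.mean(axis=0)

# Hypothetical usage: treat high-probability pixels as static content.
# template = build_template(pages)
# static_mask = template > 0.9
# variable_mask = (template > 0) & (template <= 0.9)
```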
 
  Address Kyoto; Japan; November 2017  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.084; 600.121 Approved no  
  Call Number Admin @ si @ ART2017 Serial 3001  
 

 
Author Leonardo Galteri; Dena Bazazian; Lorenzo Seidenari; Marco Bertini; Andrew Bagdanov; Anguelos Nicolaou; Dimosthenis Karatzas; Alberto del Bimbo
  Title Reading Text in the Wild from Compressed Images Type Conference Article
  Year 2017 Publication 1st International Workshop on Egocentric Perception, Interaction and Computing Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Reading text in the wild is gaining attention in the computer vision community. Images captured in the wild are almost always compressed to varying degrees, depending on the application context, and this compression introduces artifacts that distort the content of the captured images. In this paper we investigate the impact these compression artifacts have on text localization and recognition in the wild. We also propose a deep Convolutional Neural Network (CNN) that can eliminate text-specific compression artifacts and thereby improve text recognition. Experimental results on the ICDAR-Challenge4 dataset demonstrate that compression artifacts have a significant impact on text localization and recognition, and that our approach yields an improvement in both, especially at high compression rates.  
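The abstract does not detail the network; as a hedged sketch of a residual compression-artifact-removal CNN in the general spirit described (layer count and widths are invented for illustration; PyTorch assumed):

```python
import torch
import torch.nn as nn

class ArtifactRemovalCNN(nn.Module):
    """Toy restoration network: predicts a correction that is added
    back to the compressed input (sizes chosen arbitrarily)."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, kernel_size=3, padding=1),
        )

    def forward(self, compressed):
        # Residual learning: the network only models the artifacts.
        return compressed + self.body(compressed)

# restored = ArtifactRemovalCNN()(torch.rand(1, 3, 128, 128))
```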
 
  Address Venice; Italy; October 2017  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICCV - EPIC  
  Notes DAG; 600.084; 600.121 Approved no  
  Call Number Admin @ si @ GBS2017 Serial 3006  
 

 
Author Masakazu Iwamura; Naoyuki Morimoto; Keishi Tainaka; Dena Bazazian; Lluis Gomez; Dimosthenis Karatzas
  Title ICDAR2017 Robust Reading Challenge on Omnidirectional Video Type Conference Article
  Year 2017 Publication 14th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Results of the ICDAR 2017 Robust Reading Challenge on Omnidirectional Video are presented. This competition uses the Downtown Osaka Scene Text (DOST) Dataset, captured in Osaka, Japan with an omnidirectional camera; it therefore consists of sequential images (videos) from different view angles. Treating the sequential images as videos (video mode), two tasks are prepared: localisation and end-to-end recognition. Treating them as a set of still images (still image mode), three tasks are prepared: localisation, cropped word recognition and end-to-end recognition. As the dataset was captured in Japan, it contains Japanese text but also includes text consisting of alphanumeric characters (Latin text). Hence, a submitted result for each task is evaluated in three ways: using Japanese-only ground truth (GT), using Latin-only GT, and using combined GTs of both. By the submission deadline, we had received two submissions in the text localisation task of the still image mode. We intend to continue the competition in open mode. Expecting further submissions, in this report we provide baseline results for all tasks in addition to the submissions from the community.  
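The exact evaluation protocol is not reproduced here; a minimal sketch of the IoU-based box matching commonly used to score localisation tasks of this kind (the greedy matching and the 0.5 threshold are assumptions, not the competition's exact rules):

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def precision_recall(predictions, ground_truth, threshold=0.5):
    """Greedily match each predicted box to at most one GT box."""
    unmatched = list(ground_truth)
    hits = 0
    for p in predictions:
        best = max(unmatched, key=lambda g: iou(p, g), default=None)
        if best is not None and iou(p, best) >= threshold:
            hits += 1
            unmatched.remove(best)
    precision = hits / len(predictions) if predictions else 0.0
    recall = hits / len(ground_truth) if ground_truth else 0.0
    return precision, recall
```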
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.084; 600.121 Approved no  
  Call Number Admin @ si @ IMT2017 Serial 3077  
 

 
Author Dimosthenis Karatzas; Lluis Gomez; Marçal Rusiñol; Anguelos Nicolaou
  Title The Robust Reading Competition Annotation and Evaluation Platform Type Conference Article
  Year 2018 Publication 13th IAPR International Workshop on Document Analysis Systems Abbreviated Journal  
  Volume Issue Pages 61-66  
  Keywords  
  Abstract The ICDAR Robust Reading Competition (RRC), initiated in 2003 and re-established in 2011, has become the de facto evaluation standard for the international community. Concurrent with its second incarnation in 2011, a continuous effort was started to develop an online framework to facilitate the hosting and management of competitions. This short paper briefly outlines the Robust Reading Competition Annotation and Evaluation Platform, the backbone of the Robust Reading Competition, comprising a collection of tools and processes that aim to simplify the management and annotation of data, and to provide online and offline performance evaluation and analysis services.  
 
  Address Vienna; Austria; April 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference DAS  
  Notes DAG; 600.084; 600.121 Approved no  
  Call Number KGR2018 Serial 3103  
 

 
Author Lluis Gomez; Ali Furkan Biten; Ruben Tito; Andres Mafla; Marçal Rusiñol; Ernest Valveny; Dimosthenis Karatzas
  Title Multimodal grid features and cell pointers for scene text visual question answering Type Journal Article
  Year 2021 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 150 Issue Pages 242-249  
  Keywords  
  Abstract This paper presents a new model for the task of scene text visual question answering, in which questions about a given image can only be answered by reading and understanding scene text. Current state-of-the-art models for this task make use of a dual attention mechanism in which one attention module attends to visual features while the other attends to textual features. A possible issue with this is that it makes it difficult for the model to reason jointly about both modalities. To address this problem we propose a new model based on a single attention mechanism that attends to multi-modal features conditioned on the question. The output weights of this attention module over a grid of multi-modal spatial features are interpreted as the probability that a certain spatial location of the image contains the answer text to the given question. Our experiments demonstrate competitive performance on two standard datasets with a model that is faster than previous methods at inference time. Furthermore, we also provide a novel analysis of the ST-VQA dataset based on a human performance study. Supplementary material, code, and data are made available through this link.  
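A minimal sketch of question-conditioned attention over a grid of fused features, in the spirit of what the abstract describes (dimensions and the fusion-by-concatenation scoring are assumptions, not the paper's exact architecture; PyTorch assumed):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GridAttention(nn.Module):
    """Scores every cell of a multi-modal feature grid against a question
    embedding; the softmax output is a distribution over cells."""
    def __init__(self, feat_dim=512, question_dim=256, hidden=256):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(feat_dim + question_dim, hidden),
            nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, grid, question):
        # grid: (B, H*W, feat_dim); question: (B, question_dim)
        q = question.unsqueeze(1).expand(-1, grid.size(1), -1)
        logits = self.score(torch.cat([grid, q], dim=-1)).squeeze(-1)
        # Probability that each spatial cell contains the answer text.
        return F.softmax(logits, dim=-1)

# attn = GridAttention()(torch.rand(2, 49, 512), torch.rand(2, 256))
```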
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.084; 600.121 Approved no  
  Call Number Admin @ si @ GBT2021 Serial 3620  
 

 
Author Marçal Rusiñol; J. Chazalon; Jean-Marc Ogier
  Title Filtrage de descripteurs locaux pour l'amélioration de la détection de documents [Local descriptor filtering to improve document detection] Type Conference Article
  Year 2016 Publication Colloque International Francophone sur l'Écrit et le Document Abbreviated Journal  
  Volume Issue Pages  
  Keywords Local descriptors; mobile capture; document matching; keypoint selection  
  Abstract In this paper we propose an effective method aimed at reducing the number of local descriptors to be indexed in a document matching framework. In an off-line training stage, the matching between the model document and incoming images is computed, retaining the local descriptors from the model that consistently produce good matches. We have evaluated this approach using the ICDAR2015 SmartDOC dataset, which contains nearly 25,000 images of documents captured with a mobile device. We have tested the performance of this filtering step using ORB and SIFT local detectors and descriptors. The results show an important gain both in the quality of the final matching and in time and space requirements.  
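As a rough sketch of the filtering idea, retaining only the model descriptors that match consistently across training captures (opencv-python assumed; the voting scheme and keep_ratio are invented for illustration, not the paper's exact criterion):

```python
import cv2
import numpy as np

def filter_model_descriptors(model_img, training_imgs, keep_ratio=0.5):
    """Keep the model keypoints that match most consistently across
    training captures (toy version of descriptor filtering)."""
    orb = cv2.ORB_create()
    kp_m, des_m = orb.detectAndCompute(model_img, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    votes = np.zeros(len(kp_m))
    for img in training_imgs:
        _, des = orb.detectAndCompute(img, None)
        if des is None:
            continue
        for m in matcher.match(des_m, des):
            votes[m.queryIdx] += 1
    # Retain only the model descriptors with the most stable matches.
    keep = np.argsort(votes)[::-1][: int(len(kp_m) * keep_ratio)]
    return [kp_m[i] for i in keep], des_m[keep]
```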
  Address Toulouse; France; March 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CIFED  
  Notes DAG; 600.084; 600.077 Approved no  
  Call Number Admin @ si @ RCO2016 Serial 2755  
 

 
Author Dimosthenis Karatzas; V. Poulain d'Andecy; Marçal Rusiñol
  Title Human-Document Interaction – a new frontier for document image analysis Type Conference Article
  Year 2016 Publication 12th IAPR Workshop on Document Analysis Systems Abbreviated Journal  
  Volume Issue Pages 369-374  
  Keywords  
  Abstract All indications show that paper documents will not cede in favour of their digital counterparts, but will instead be used increasingly in conjunction with digital information. An open challenge is how to seamlessly link the physical with the digital: how to continue taking advantage of the important affordances of paper, without missing out on digital functionality. This paper presents the authors' experience with developing systems for Human-Document Interaction based on augmented document interfaces and examines new challenges and opportunities arising for the document image analysis field in this area. The system presented combines state-of-the-art camera-based document image analysis techniques with a range of complementary technologies to offer fluid Human-Document Interaction. Both fixed and nomadic setups that have gone through user testing in real-life environments are discussed, and use cases are presented that span the spectrum from business to educational applications.
 
  Address Santorini; Greece; April 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference DAS  
  Notes DAG; 600.084; 600.077 Approved no  
  Call Number KPR2016 Serial 2756  
 

 
Author Marçal Rusiñol; David Aldavert; Ricardo Toledo; Josep Llados
  Title Towards Query-by-Speech Handwritten Keyword Spotting Type Conference Article
  Year 2015 Publication 13th International Conference on Document Analysis and Recognition ICDAR2015 Abbreviated Journal  
  Volume Issue Pages 501-505  
  Keywords  
  Abstract In this paper, we present a new querying paradigm for handwritten keyword spotting. We propose to represent handwritten word images by both visual and audio representations, enabling a query-by-speech keyword spotting system. The two representations are merged together and projected to a common sub-space in the training phase. This transform allows us, given a spoken query, to retrieve word instances that were only represented by the visual modality. In addition, the same method can be used in reverse at no additional cost to produce a handwritten text-to-speech system. We present our first results on this new querying mechanism using synthetic voices over the George Washington dataset.
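The abstract does not name the projection used; as a hedged sketch, one standard way to learn such a common sub-space from paired visual/audio features is canonical correlation analysis (CCA here is an assumption, not necessarily the authors' transform; dimensions are invented):

```python
import numpy as np
from sklearn.cross_decomposition import CCA

# Paired training features: one visual and one audio vector per word.
rng = np.random.default_rng(0)
visual = rng.normal(size=(200, 128))   # e.g. word-image descriptors
audio = rng.normal(size=(200, 64))     # e.g. spoken-word descriptors

cca = CCA(n_components=32)
cca.fit(visual, audio)

# At query time, project a spoken query and the visual index into the
# shared sub-space and rank word images by (e.g.) cosine similarity.
visual_sub, audio_sub = cca.transform(visual, audio)
```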
 
  Address Nancy; France; August 2015  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.084; 600.061; 601.223; 600.077; ADAS Approved no  
  Call Number Admin @ si @ RAT2015b Serial 2682  
 

 
Author J. Chazalon; Marçal Rusiñol; Jean-Marc Ogier; Josep Llados
  Title A Semi-Automatic Groundtruthing Tool for Mobile-Captured Document Segmentation Type Conference Article
  Year 2015 Publication 13th International Conference on Document Analysis and Recognition ICDAR2015 Abbreviated Journal  
  Volume Issue Pages 621-625  
  Keywords  
  Abstract This paper presents a novel way to generate ground-truth data for the evaluation of mobile document capture systems, focusing on the first stage of the image processing pipeline involved: document object detection and segmentation in low-quality preview frames. We introduce and describe a simple, robust and fast technique based on color markers which enables semi-automated annotation of page corners. We also detail a technique for marker removal. The methods and tools presented in the paper were successfully used to annotate, in a few hours, 24,889 frames in 150 video files for the smartDOC competition at ICDAR 2015.
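A hedged sketch of how such color markers might be located by HSV thresholding (the abstract does not give the detection technique; the color ranges and area threshold below are illustrative guesses; opencv-python 4.x assumed):

```python
import cv2
import numpy as np

def find_marker_centers(frame_bgr, hsv_low, hsv_high, min_area=50):
    """Locate colored markers (e.g. page-corner stickers) by HSV
    thresholding and return their centroids."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array(hsv_low), np.array(hsv_high))
    # OpenCV 4 returns (contours, hierarchy).
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    centers = []
    for c in contours:
        if cv2.contourArea(c) < min_area:
            continue
        m = cv2.moments(c)
        centers.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
    return centers

# corners = find_marker_centers(frame, (40, 80, 80), (80, 255, 255))
```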
 
  Address Nancy; France; August 2015  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.084; 600.061; 601.223; 600.077 Approved no  
  Call Number Admin @ si @ CRO2015b Serial 2685  
 

 
Author Y. Patel; Lluis Gomez; Marçal Rusiñol; Dimosthenis Karatzas
  Title Dynamic Lexicon Generation for Natural Scene Images Type Conference Article
  Year 2016 Publication 14th European Conference on Computer Vision Workshops Abbreviated Journal  
  Volume Issue Pages 395-410  
  Keywords scene text; photo OCR; scene understanding; lexicon generation; topic modeling; CNN  
  Abstract Many scene text understanding methods approach the end-to-end recognition problem from a word-spotting perspective and benefit greatly from using small per-image lexicons. Such customized lexicons are normally assumed as given and their source is rarely discussed. In this paper we propose a method that generates contextualized lexicons for scene images using only visual information. For this, we exploit the correlation between visual and textual information in a dataset consisting of images and the textual content associated with them. Using the topic modeling framework to discover a set of latent topics in such a dataset allows us to re-rank a fixed dictionary in a way that prioritizes the words that are more likely to appear in a given image. Moreover, we train a CNN that is able to reproduce those word rankings using only the raw image pixels as input. We demonstrate that the quality of the automatically obtained custom lexicons is superior to a generic frequency-based baseline.
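A minimal sketch of the topic-modeling re-ranking step (scikit-learn's LDA is a stand-in; the actual topic model, corpus and dictionary are not specified here, and the toy documents are invented):

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Toy corpus: textual content associated with training images.
docs = ["pizza restaurant menu pasta", "bus stop street sign",
        "pizza slice street food", "airport departures sign gate"]

vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)

def rerank_dictionary(image_text):
    """Re-rank the vocabulary by its probability under the topic
    mixture inferred for this image's associated text."""
    topics = lda.transform(vectorizer.transform([image_text]))[0]
    word_probs = lda.components_ / lda.components_.sum(axis=1, keepdims=True)
    scores = topics @ word_probs            # mix topics into word scores
    vocab = vectorizer.get_feature_names_out()
    return [vocab[i] for i in np.argsort(scores)[::-1]]

# print(rerank_dictionary("pizza sign"))
```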
 
  Address Amsterdam; The Netherlands; October 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECCVW  
  Notes DAG; 600.084 Approved no  
  Call Number Admin @ si @ PGR2016 Serial 2825  