Publicacions CVC -- Query Results

[31–40] << 41 42 43 44 45 46 47 48 49 50 >> [51–60]

Details

	Records
	Author	Leonardo Galteri; Dena Bazazian; Lorenzo Seidenari; Marco Bertini; Andrew Bagdanov; Anguelos Nicolaou; Dimosthenis Karatzas; Alberto del Bimbo
	Title	Reading Text in the Wild from Compressed Images			Type	Conference Article
	Year	2017	Publication	1st International workshop on Egocentric Perception, Interaction and Computing	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Reading text in the wild is gaining attention in the computer vision community. Images captured in the wild are almost always compressed to varying degrees, depending on application context, and this compression introduces artifacts that distort image content into the captured images. In this paper we investigate the impact these compression artifacts have on text localization and recognition in the wild. We also propose a deep Convolutional Neural Network (CNN) that can eliminate text-specific compression artifacts and which leads to an improvement in text recognition. Experimental results on the ICDAR-Challenge4 dataset demonstrate that compression artifacts have a significant impact on text localization and recognition and that our approach yields an improvement in both – especially at high compression rates.
	Address	Venice; Italy; October 2017
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICCV - EPIC
	Notes	DAG; 600.084; 600.121			Approved	no
	Call Number	Admin @ si @ GBS2017			Serial	3006
Permanent link to this record



	Author	Liu Wenyin; Josep Llados; Jean-Marc Ogier
	Title	Graphics Recognition. Recent Advances and New Opportunities.			Type	Book Whole
	Year	2008	Publication	7th International Workshop, Selected Papers,	Abbreviated Journal
	Volume	5046	Issue		Pages
	Keywords
	Abstract
	Address	Curitiba (Brazil)
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-3-540-88184-1	Medium
	Area		Expedition		Conference	GREC
	Notes	DAG			Approved	no
	Call Number	DAG @ dag @ WLO2008			Serial	1012
Permanent link to this record



	Author	Lluis Gomez
	Title	Perceptual Organization for Text Extraction in Natural Scenes			Type	Report
	Year	2012	Publication	CVC Technical Report	Abbreviated Journal
	Volume	173	Issue		Pages
	Keywords
	Abstract
	Address	Bellaterra
	Corporate Author				Thesis	Master's thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ Gom2012			Serial	2309
Permanent link to this record



	Author	Lluis Gomez
	Title	Exploiting Similarity Hierarchies for Multi-script Scene Text Understanding			Type	Book Whole
	Year	2016	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	This thesis addresses the problem of automatic scene text understanding in unconstrained conditions. In particular, we tackle the tasks of multi-language and arbitrary-oriented text detection, tracking, and script identification in natural scenes. For this we have developed a set of generic methods that build on top of the basic observation that text has always certain key visual and structural characteristics that are independent of the language or script in which it is written. Text instances in any language or script are always formed as groups of similar atomic parts, being them either individual characters, small stroke parts, or even whole words in the case of cursive text. This holistic (sumof-parts) and recursive perspective has lead us to explore different variants of the “segmentation and grouping” paradigm of computer vision. Scene text detection methodologies are usually based in classification of individual regions or patches, using a priory knowledge for a given script or language. Human perception of text, on the other hand, is based on perceptual organization through which text emerges as a perceptually significant group of atomic objects. In this thesis, we argue that the text detection problem must be posed as the detection of meaningful groups of regions. We address the problem of text detection in natural scenes from a hierarchical perspective, making explicit use of the recursive nature of text, aiming directly to the detection of region groupings corresponding to text within a hierarchy produced by an agglomerative similarity clustering process over individual regions. We propose an optimal way to construct such an hierarchy introducing a feature space designed to produce text group hypothese with high recall and a novel stopping rule combining a discriminative classifier and a probabilistic measure of group meaningfulness based in perceptual organization. Within this generic framework, we design a text-specific object proposals algorithm that, contrary to existing generic object proposals methods, aims directly to the detection of text regions groupings. For this, we abandon the rigid definition of “what is text” of traditional specialized text detectors, and move towards more fuzzy perspective of grouping-based object proposals methods. Then, we present a hybrid algorithm for detection and tracking of scene text where the notion of region groupings plays also a central role. By leveraging the structural arrangement of text group components between consecutive frames we can improve the overall tracking performance of the system. Finally, since our generic detection framework is inherently designed for multi-language environments, we focus on the problem of script identification in order to build a multi-language end-toend reading system. Facing this problem with state of the art CNN classifiers is not straightforward, as they fail to address a key characteristic of scene text instances: their extremely variable aspect ratio. Instead of resizing input images to a fixed size as in the typical use of holistic CNN classifiers, we propose a patch-based classification framework in order to preserve discriminative parts of the image that are characteristic of its class. We describe a novel method based on the use of ensembles of conjoined networks to jointly learn discriminative stroke-parts representations and their relative importance in a patch-based classification scheme.
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher		Place of Publication		Editor	Dimosthenis Karatzas
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ Gom2016			Serial	2891
Permanent link to this record



	Author	Lluis Gomez; Ali Furkan Biten; Ruben Tito; Andres Mafla; Marçal Rusiñol; Ernest Valveny; Dimosthenis Karatzas
	Title	Multimodal grid features and cell pointers for scene text visual question answering			Type	Journal Article
	Year	2021	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
	Volume	150	Issue		Pages	242-249
	Keywords
	Abstract	This paper presents a new model for the task of scene text visual question answering. In this task questions about a given image can only be answered by reading and understanding scene text. Current state of the art models for this task make use of a dual attention mechanism in which one attention module attends to visual features while the other attends to textual features. A possible issue with this is that it makes difficult for the model to reason jointly about both modalities. To fix this problem we propose a new model that is based on an single attention mechanism that attends to multi-modal features conditioned to the question. The output weights of this attention module over a grid of multi-modal spatial features are interpreted as the probability that a certain spatial location of the image contains the answer text to the given question. Our experiments demonstrate competitive performance in two standard datasets with a model that is faster than previous methods at inference time. Furthermore, we also provide a novel analysis of the ST-VQA dataset based on a human performance study. Supplementary material, code, and data is made available through this link.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.084; 600.121			Approved	no
	Call Number	Admin @ si @ GBT2021			Serial	3620
Permanent link to this record



	Author	Lluis Gomez; Andres Mafla; Marçal Rusiñol; Dimosthenis Karatzas
	Title	Single Shot Scene Text Retrieval			Type	Conference Article
	Year	2018	Publication	15th European Conference on Computer Vision	Abbreviated Journal
	Volume	11218	Issue		Pages	728-744
	Keywords	Image retrieval; Scene text; Word spotting; Convolutional Neural Networks; Region Proposals Networks; PHOC
	Abstract	Textual information found in scene images provides high level semantic information about the image and its context and it can be leveraged for better scene understanding. In this paper we address the problem of scene text retrieval: given a text query, the system must return all images containing the queried text. The novelty of the proposed model consists in the usage of a single shot CNN architecture that predicts at the same time bounding boxes and a compact text representation of the words in them. In this way, the text based image retrieval task can be casted as a simple nearest neighbor search of the query text representation over the outputs of the CNN over the entire image database. Our experiments demonstrate that the proposed architecture outperforms previous state-of-the-art while it offers a significant increase in processing speed.
	Address	Munich; September 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ECCV
	Notes	DAG; 600.084; 601.338; 600.121; 600.129			Approved	no
	Call Number	Admin @ si @ GMR2018			Serial	3143
Permanent link to this record



	Author	Lluis Gomez; Anguelos Nicolaou; Dimosthenis Karatzas
	Title	Improving patch‐based scene text script identification with ensembles of conjoined networks			Type	Journal Article
	Year	2017	Publication	Pattern Recognition	Abbreviated Journal	PR
	Volume	67	Issue		Pages	85-96
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.084; 600.121; 600.129			Approved	no
	Call Number	Admin @ si @ GNK2017			Serial	2887
Permanent link to this record



	Author	Lluis Gomez; Anguelos Nicolaou; Marçal Rusiñol; Dimosthenis Karatzas
	Title	12 years of ICDAR Robust Reading Competitions: The evolution of reading systems for unconstrained text understanding			Type	Book Chapter
	Year	2020	Publication	Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher	Springer	Place of Publication		Editor	K. Alahari; C.V. Jawahar
	Language		Summary Language		Original Title
	Series Editor		Series Title	Series on Advances in Computer Vision and Pattern Recognition	Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.121			Approved	no
	Call Number	GNR2020			Serial	3494
Permanent link to this record



	Author	Lluis Gomez; Dena Bazazian; Dimosthenis Karatzas
	Title	Historical review of scene text detection research			Type	Book Chapter
	Year	2020	Publication	Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher	Springer	Place of Publication		Editor	K. Alahari; C.V. Jawahar
	Language		Summary Language		Original Title
	Series Editor		Series Title	Series on Advances in Computer Vision and Pattern Recognition	Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.121			Approved	no
	Call Number	Admin @ si @ GBK2020			Serial	3495
Permanent link to this record



	Author	Lluis Gomez; Dimosthenis Karatzas
	Title	Multi-script Text Extraction from Natural Scenes			Type	Conference Article
	Year	2013	Publication	12th International Conference on Document Analysis and Recognition	Abbreviated Journal
	Volume		Issue		Pages	467-471
	Keywords
	Abstract	Scene text extraction methodologies are usually based in classification of individual regions or patches, using a priori knowledge for a given script or language. Human perception of text, on the other hand, is based on perceptual organisation through which text emerges as a perceptually significant group of atomic objects. Therefore humans are able to detect text even in languages and scripts never seen before. In this paper, we argue that the text extraction problem could be posed as the detection of meaningful groups of regions. We present a method built around a perceptual organisation framework that exploits collaboration of proximity and similarity laws to create text-group hypotheses. Experiments demonstrate that our algorithm is competitive with state of the art approaches on a standard dataset covering text in variable orientations and two languages.
	Address	Washington; USA; August 2013
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1520-5363	ISBN		Medium
	Area		Expedition		Conference	ICDAR
	Notes	DAG; 600.056; 601.158; 601.197			Approved	no
	Call Number	Admin @ si @ GoK2013			Serial	2310
Permanent link to this record

Select All Deselect All

[31–40] << 41 42 43 44 45 46 47 48 49 50 >> [51–60]

List View

Citations

Details

All Found Records Selected Records:

Save Citations: Format:

Export Records: Format: