Publicacions CVC -- Query Results

[51–60] << 61 62 63 64 65 66 67 68 69 70 >> [71–74]

Details

	Records
	Author	Mohamed Ali Souibgui; Sanket Biswas; Andres Mafla; Ali Furkan Biten; Alicia Fornes; Yousri Kessentini; Josep Llados; Lluis Gomez; Dimosthenis Karatzas
	Title	Text-DIAE: a self-supervised degradation invariant autoencoder for text recognition and document enhancement			Type	Conference Article
	Year	2023	Publication	Proceedings of the 37th AAAI Conference on Artificial Intelligence	Abbreviated Journal
	Volume	37	Issue	2	Pages
	Keywords	Representation Learning for Vision; CV Applications; CV Language and Vision; ML Unsupervised; Self-Supervised Learning
	Abstract	In this paper, we propose a Text-Degradation Invariant Auto Encoder (Text-DIAE), a self-supervised model designed to tackle two tasks, text recognition (handwritten or scene-text) and document image enhancement. We start by employing a transformer-based architecture that incorporates three pretext tasks as learning objectives to be optimized during pre-training without the usage of labelled data. Each of the pretext objectives is specifically tailored for the final downstream tasks. We conduct several ablation experiments that confirm the design choice of the selected pretext tasks. Importantly, the proposed model does not exhibit limitations of previous state-of-the-art methods based on contrastive losses, while at the same time requiring substantially fewer data samples to converge. Finally, we demonstrate that our method surpasses the state-of-the-art in existing supervised and self-supervised settings in handwritten and scene text recognition and document image enhancement. Our code and trained models will be made publicly available at https://github.com/dali92002/SSL-OCR
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	AAAI
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ SBM2023			Serial	3848
Permanent link to this record



	Author	Mickael Coustaty; Alicia Fornes
	Title	Document Analysis and Recognition – ICDAR 2023 Workshops			Type	Book Whole
	Year	2023	Publication	Document Analysis and Recognition – ICDAR 2023 Workshops	Abbreviated Journal
	Volume	14194	Issue	2	Pages
	Keywords
	Abstract
	Address	San Jose; USA; August 2023
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICDAR
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ CoF2023			Serial	3852
Permanent link to this record



	Author	Khanh Nguyen; Ali Furkan Biten; Andres Mafla; Lluis Gomez; Dimosthenis Karatzas
	Title	Show, Interpret and Tell: Entity-Aware Contextualised Image Captioning in Wikipedia			Type	Conference Article
	Year	2023	Publication	Proceedings of the 37th AAAI Conference on Artificial Intelligence	Abbreviated Journal
	Volume	37	Issue	2	Pages	1940-1948
	Keywords
	Abstract	Humans exploit prior knowledge to describe images, and are able to adapt their explanation to specific contextual information given, even to the extent of inventing plausible explanations when contextual information and images do not match. In this work, we propose the novel task of captioning Wikipedia images by integrating contextual knowledge. Specifically, we produce models that jointly reason over Wikipedia articles, Wikimedia images and their associated descriptions to produce contextualized captions. The same Wikimedia image can be used to illustrate different articles, and the produced caption needs to be adapted to the specific context allowing us to explore the limits of the model to adjust captions to different contextual information. Dealing with out-of-dictionary words and Named Entities is a challenging task in this domain. To address this, we propose a pre-training objective, Masked Named Entity Modeling (MNEM), and show that this pretext task results to significantly improved models. Furthermore, we verify that a model pre-trained in Wikipedia generalizes well to News Captioning datasets. We further define two different test splits according to the difficulty of the captioning task. We offer insights on the role and the importance of each modality and highlight the limitations of our model.
	Address	Washington; USA; February 2023
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	AAAI
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ NBM2023			Serial	3860
Permanent link to this record



	Author	T.Chauhan; E.Perales; Kaida Xiao; E.Hird ; Dimosthenis Karatzas; Sophie Wuerger
	Title	The achromatic locus: Effect of navigation direction in color space			Type	Journal Article
	Year	2014	Publication	Journal of Vision	Abbreviated Journal	VSS
	Volume	14 (1)	Issue	25	Pages	1-11
	Keywords	achromatic; unique hues; color constancy; luminance; color space
	Abstract	5Y Impact Factor: 2.99 / 1st (Ophthalmology) An achromatic stimulus is defined as a patch of light that is devoid of any hue. This is usually achieved by asking observers to adjust the stimulus such that it looks neither red nor green and at the same time neither yellow nor blue. Despite the theoretical and practical importance of the achromatic locus, little is known about the variability in these settings. The main purpose of the current study was to evaluate whether achromatic settings were dependent on the task of the observers, namely the navigation direction in color space. Observers could either adjust the test patch along the two chromatic axes in the CIE uv diagram or, alternatively, navigate along the unique-hue lines. Our main result is that the navigation method affects the reliability of these achromatic settings. Observers are able to make more reliable achromatic settings when adjusting the test patch along the directions defined by the four unique hues as opposed to navigating along the main axes in the commonly used CIE uv chromaticity plane. This result holds across different ambient viewing conditions (Dark, Daylight, Cool White Fluorescent) and different test luminance levels (5, 20, and 50 cd/m2). The reduced variability in the achromatic settings is consistent with the idea that internal color representations are more aligned with the unique-hue lines than the u* and v* axes.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.077			Approved	no
	Call Number	Admin @ si @ CPX2014			Serial	2418
Permanent link to this record



	Author	Josep Llados; Gemma Sanchez
	Title	Graph Matching vs. Graph Parsing in Graphics Recognition: A Combined Approach			Type	Journal
	Year	2004	Publication	International Journal of Pattern Recognition and Artificial Intelligence	Abbreviated Journal	IJPRAI
	Volume	18	Issue	3	Pages	455–473
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; IF: 0.588			Approved	no
	Call Number	DAG @ dag @ LlS2004			Serial	445
Permanent link to this record



	Author	Antonio Lopez; Ernest Valveny; Juan J. Villanueva
	Title	Real-time quality control of surgical material packaging by artificial vision			Type	Journal Article
	Year	2005	Publication	Assembly Automation	Abbreviated Journal
	Volume	25	Issue	3	Pages
	Keywords
	Abstract	IF: 0.061)
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	ADAS;DAG			Approved	no
	Call Number	ADAS @ adas @ LVV2005			Serial	552
Permanent link to this record



	Author	Marçal Rusiñol; Josep Llados; Gemma Sanchez
	Title	Symbol Spotting in Vectorized Technical Drawings Through a Lookup Table of Region Strings			Type	Journal Article
	Year	2010	Publication	Pattern Analysis and Applications	Abbreviated Journal	PAA
	Volume	13	Issue	3	Pages	321-331
	Keywords
	Abstract	In this paper, we address the problem of symbol spotting in technical document images applied to scanned and vectorized line drawings. Like any information spotting architecture, our approach has two components. First, symbols are decomposed in primitives which are compactly represented and second a primitive indexing structure aims to efficiently retrieve similar primitives. Primitives are encoded in terms of attributed strings representing closed regions. Similar strings are clustered in a lookup table so that the set median strings act as indexing keys. A voting scheme formulates hypothesis in certain locations of the line drawing image where there is a high presence of regions similar to the queried ones, and therefore, a high probability to find the queried graphical symbol. The proposed approach is illustrated in a framework consisting in spotting furniture symbols in architectural drawings. It has been proved to work even in the presence of noise and distortion introduced by the scanning and raster-to-vector processes.
	Address
	Corporate Author				Thesis
	Publisher	Springer-Verlag	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1433-7541	ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG			Approved	no
	Call Number	DAG @ dag @ RLS2010			Serial	1165
Permanent link to this record



	Author	Marçal Rusiñol; Agnes Borras; Josep Llados
	Title	Relational Indexing of Vectorial Primitives for Symbol Spotting in Line-Drawing Images			Type	Journal Article
	Year	2010	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
	Volume	31	Issue	3	Pages	188–201
	Keywords	Document image analysis and recognition, Graphics recognition, Symbol spotting ,Vectorial representations, Line-drawings
	Abstract	This paper presents a symbol spotting approach for indexing by content a database of line-drawing images. As line-drawings are digital-born documents designed by vectorial softwares, instead of using a pixel-based approach, we present a spotting method based on vector primitives. Graphical symbols are represented by a set of vectorial primitives which are described by an off-the-shelf shape descriptor. A relational indexing strategy aims to retrieve symbol locations into the target documents by using a combined numerical-relational description of 2D structures. The zones which are likely to contain the queried symbol are validated by a Hough-like voting scheme. In addition, a performance evaluation framework for symbol spotting in graphical documents is proposed. The presented methodology has been evaluated with a benchmarking set of architectural documents achieving good performance results.
	Address
	Corporate Author				Thesis
	Publisher	Elsevier	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG			Approved	no
	Call Number	DAG @ dag @ RBL2010			Serial	1177
Permanent link to this record



	Author	Alicia Fornes; Josep Llados; Gemma Sanchez; Dimosthenis Karatzas
	Title	Rotation Invariant Hand-Drawn Symbol Recognition based on a Dynamic Time Warping Model			Type	Journal Article
	Year	2010	Publication	International Journal on Document Analysis and Recognition	Abbreviated Journal	IJDAR
	Volume	13	Issue	3	Pages	229–241
	Keywords
	Abstract	One of the major difficulties of handwriting symbol recognition is the high variability among symbols because of the different writer styles. In this paper, we introduce a robust approach for describing and recognizing hand-drawn symbols tolerant to these writer style differences. This method, which is invariant to scale and rotation, is based on the dynamic time warping (DTW) algorithm. The symbols are described by vector sequences, a variation of the DTW distance is used for computing the matching distance, and K-Nearest Neighbor is used to classify them. Our approach has been evaluated in two benchmarking scenarios consisting of hand-drawn symbols. Compared with state-of-the-art methods for symbol recognition, our method shows higher tolerance to the irregular deformations induced by hand-drawn strokes.
	Address
	Corporate Author				Thesis
	Publisher	Springer-Verlag	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1433-2833	ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; IF 2009: 1,213			Approved	no
	Call Number	DAG @ dag @ FLS2010a			Serial	1288
Permanent link to this record



	Author	Mathieu Nicolas Delalandre; Ernest Valveny; Tony Pridmore; Dimosthenis Karatzas
	Title	Generation of Synthetic Documents for Performance Evaluation of Symbol Recognition & Spotting Systems			Type	Journal Article
	Year	2010	Publication	International Journal on Document Analysis and Recognition	Abbreviated Journal	IJDAR
	Volume	13	Issue	3	Pages	187-207
	Keywords
	Abstract	This paper deals with the topic of performance evaluation of symbol recognition & spotting systems. We propose here a new approach to the generation of synthetic graphics documents containing non-isolated symbols in a real context. This approach is based on the definition of a set of constraints that permit us to place the symbols on a pre-defined background according to the properties of a particular domain (architecture, electronics, engineering, etc.). In this way, we can obtain a large amount of images resembling real documents by simply defining the set of constraints and providing a few pre-defined backgrounds. As documents are synthetically generated, the groundtruth (the location and the label of every symbol) becomes automatically available. We have applied this approach to the generation of a large database of architectural drawings and electronic diagrams, which shows the flexibility of the system. Performance evaluation experiments of a symbol localization system show that our approach permits to generate documents with different features that are reflected in variation of localization results.
	Address
	Corporate Author				Thesis
	Publisher	Springer-Verlag	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1433-2833	ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG			Approved	no
	Call Number	DAG @ dag @ DVP2010			Serial	1289
Permanent link to this record

Select All Deselect All

[51–60] << 61 62 63 64 65 66 67 68 69 70 >> [71–74]

List View

Citations

Details

All Found Records Selected Records:

Save Citations: Format:

Export Records: Format: