Author Mohamed Ali Souibgui; Sanket Biswas; Andres Mafla; Ali Furkan Biten; Alicia Fornes; Yousri Kessentini; Josep Llados; Lluis Gomez; Dimosthenis Karatzas
  Title Text-DIAE: a self-supervised degradation invariant autoencoder for text recognition and document enhancement Type Conference Article
  Year 2023 Publication Proceedings of the 37th AAAI Conference on Artificial Intelligence Abbreviated Journal  
  Volume 37 Issue 2 Pages  
  Keywords Representation Learning for Vision; CV Applications; CV Language and Vision; ML Unsupervised; Self-Supervised Learning  
  Abstract In this paper, we propose a Text-Degradation Invariant Auto Encoder (Text-DIAE), a self-supervised model designed to tackle two tasks, text recognition (handwritten or scene-text) and document image enhancement. We start by employing a transformer-based architecture that incorporates three pretext tasks as learning objectives to be optimized during pre-training without the usage of labelled data. Each of the pretext objectives is specifically tailored for the final downstream tasks. We conduct several ablation experiments that confirm the design choice of the selected pretext tasks. Importantly, the proposed model does not exhibit limitations of previous state-of-the-art methods based on contrastive losses, while at the same time requiring substantially fewer data samples to converge. Finally, we demonstrate that our method surpasses the state-of-the-art in existing supervised and self-supervised settings in handwritten and scene text recognition and document image enhancement. Our code and trained models will be made publicly available at https://github.com/dali92002/SSL-OCR  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference AAAI  
  Notes DAG Approved no  
  Call Number Admin @ si @ SBM2023 Serial 3848  
 

 
Author Mickael Coustaty; Alicia Fornes
  Title Document Analysis and Recognition – ICDAR 2023 Workshops Type Book Whole
  Year 2023 Publication Document Analysis and Recognition – ICDAR 2023 Workshops Abbreviated Journal  
  Volume 14194 Issue 2 Pages  
  Keywords  
  Abstract  
  Address San Jose; USA; August 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG Approved no  
  Call Number Admin @ si @ CoF2023 Serial 3852  
 

 
Author Khanh Nguyen; Ali Furkan Biten; Andres Mafla; Lluis Gomez; Dimosthenis Karatzas
  Title Show, Interpret and Tell: Entity-Aware Contextualised Image Captioning in Wikipedia Type Conference Article
  Year 2023 Publication Proceedings of the 37th AAAI Conference on Artificial Intelligence Abbreviated Journal  
  Volume 37 Issue 2 Pages 1940-1948  
  Keywords  
  Abstract Humans exploit prior knowledge to describe images, and are able to adapt their explanation to specific contextual information given, even to the extent of inventing plausible explanations when contextual information and images do not match. In this work, we propose the novel task of captioning Wikipedia images by integrating contextual knowledge. Specifically, we produce models that jointly reason over Wikipedia articles, Wikimedia images and their associated descriptions to produce contextualized captions. The same Wikimedia image can be used to illustrate different articles, and the produced caption needs to be adapted to the specific context, allowing us to explore the limits of the model to adjust captions to different contextual information. Dealing with out-of-dictionary words and Named Entities is a challenging task in this domain. To address this, we propose a pre-training objective, Masked Named Entity Modeling (MNEM), and show that this pretext task results in significantly improved models. Furthermore, we verify that a model pre-trained in Wikipedia generalizes well to News Captioning datasets. We further define two different test splits according to the difficulty of the captioning task. We offer insights on the role and the importance of each modality and highlight the limitations of our model.  
  Address Washington; USA; February 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference AAAI  
  Notes DAG Approved no  
  Call Number Admin @ si @ NBM2023 Serial 3860  
 

 
Author T. Chauhan; E. Perales; Kaida Xiao; E. Hird; Dimosthenis Karatzas; Sophie Wuerger
  Title The achromatic locus: Effect of navigation direction in color space Type Journal Article
  Year 2014 Publication Journal of Vision Abbreviated Journal VSS  
  Volume 14 (1) Issue 25 Pages 1-11  
  Keywords achromatic; unique hues; color constancy; luminance; color space  
  Abstract 5Y Impact Factor: 2.99 / 1st (Ophthalmology)
An achromatic stimulus is defined as a patch of light that is devoid of any hue. This is usually achieved by asking observers to adjust the stimulus such that it looks neither red nor green and at the same time neither yellow nor blue. Despite the theoretical and practical importance of the achromatic locus, little is known about the variability in these settings. The main purpose of the current study was to evaluate whether achromatic settings were dependent on the task of the observers, namely the navigation direction in color space. Observers could either adjust the test patch along the two chromatic axes in the CIE u*v* diagram or, alternatively, navigate along the unique-hue lines. Our main result is that the navigation method affects the reliability of these achromatic settings. Observers are able to make more reliable achromatic settings when adjusting the test patch along the directions defined by the four unique hues as opposed to navigating along the main axes in the commonly used CIE u*v* chromaticity plane. This result holds across different ambient viewing conditions (Dark, Daylight, Cool White Fluorescent) and different test luminance levels (5, 20, and 50 cd/m²). The reduced variability in the achromatic settings is consistent with the idea that internal color representations are more aligned with the unique-hue lines than the u* and v* axes.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.077 Approved no  
  Call Number Admin @ si @ CPX2014 Serial 2418  
 

 
Author Josep Llados; Gemma Sanchez
  Title Graph Matching vs. Graph Parsing in Graphics Recognition: A Combined Approach Type Journal
  Year 2004 Publication International Journal of Pattern Recognition and Artificial Intelligence Abbreviated Journal IJPRAI  
  Volume 18 Issue 3 Pages 455–473  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; IF: 0.588 Approved no  
  Call Number DAG @ dag @ LlS2004 Serial 445  
 

 
Author Antonio Lopez; Ernest Valveny; Juan J. Villanueva
  Title Real-time quality control of surgical material packaging by artificial vision Type Journal Article
  Year 2005 Publication Assembly Automation Abbreviated Journal  
  Volume 25 Issue 3 Pages  
  Keywords  
  Abstract IF: 0.061  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS;DAG Approved no  
  Call Number ADAS @ adas @ LVV2005 Serial 552  
 

 
Author Marçal Rusiñol; Josep Llados; Gemma Sanchez
  Title Symbol Spotting in Vectorized Technical Drawings Through a Lookup Table of Region Strings Type Journal Article
  Year 2010 Publication Pattern Analysis and Applications Abbreviated Journal PAA  
  Volume 13 Issue 3 Pages 321-331  
  Keywords  
  Abstract In this paper, we address the problem of symbol spotting in technical document images applied to scanned and vectorized line drawings. Like any information spotting architecture, our approach has two components. First, symbols are decomposed into primitives, which are compactly represented, and second, a primitive indexing structure aims to efficiently retrieve similar primitives. Primitives are encoded in terms of attributed strings representing closed regions. Similar strings are clustered in a lookup table so that the set median strings act as indexing keys. A voting scheme formulates hypotheses in certain locations of the line drawing image where there is a high presence of regions similar to the queried ones, and therefore a high probability of finding the queried graphical symbol. The proposed approach is illustrated in a framework consisting of spotting furniture symbols in architectural drawings. It has been shown to work even in the presence of noise and distortion introduced by the scanning and raster-to-vector processes.  
  Address  
  Corporate Author Thesis  
  Publisher Springer-Verlag Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1433-7541 ISBN Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number DAG @ dag @ RLS2010 Serial 1165  
 

 
Author Marçal Rusiñol; Agnes Borras; Josep Llados
  Title Relational Indexing of Vectorial Primitives for Symbol Spotting in Line-Drawing Images Type Journal Article
  Year 2010 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 31 Issue 3 Pages 188–201  
  Keywords Document image analysis and recognition, Graphics recognition, Symbol spotting, Vectorial representations, Line-drawings  
  Abstract This paper presents a symbol spotting approach for indexing by content a database of line-drawing images. As line-drawings are digital-born documents designed with vector graphics software, instead of using a pixel-based approach, we present a spotting method based on vector primitives. Graphical symbols are represented by a set of vectorial primitives which are described by an off-the-shelf shape descriptor. A relational indexing strategy aims to retrieve symbol locations in the target documents by using a combined numerical-relational description of 2D structures. The zones which are likely to contain the queried symbol are validated by a Hough-like voting scheme. In addition, a performance evaluation framework for symbol spotting in graphical documents is proposed. The presented methodology has been evaluated with a benchmarking set of architectural documents, achieving good performance results.  
  Address  
  Corporate Author Thesis  
  Publisher Elsevier Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number DAG @ dag @ RBL2010 Serial 1177  
 

 
Author Alicia Fornes; Josep Llados; Gemma Sanchez; Dimosthenis Karatzas
  Title Rotation Invariant Hand-Drawn Symbol Recognition based on a Dynamic Time Warping Model Type Journal Article
  Year 2010 Publication International Journal on Document Analysis and Recognition Abbreviated Journal IJDAR  
  Volume 13 Issue 3 Pages 229–241  
  Keywords  
  Abstract One of the major difficulties of handwriting symbol recognition is the high variability among symbols because of the different writer styles. In this paper, we introduce a robust approach for describing and recognizing hand-drawn symbols tolerant to these writer style differences. This method, which is invariant to scale and rotation, is based on the dynamic time warping (DTW) algorithm. The symbols are described by vector sequences, a variation of the DTW distance is used for computing the matching distance, and K-Nearest Neighbor is used to classify them. Our approach has been evaluated in two benchmarking scenarios consisting of hand-drawn symbols. Compared with state-of-the-art methods for symbol recognition, our method shows higher tolerance to the irregular deformations induced by hand-drawn strokes.  
  Address  
  Corporate Author Thesis  
  Publisher Springer-Verlag Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1433-2833 ISBN Medium  
  Area Expedition Conference  
  Notes DAG; IF 2009: 1.213 Approved no  
  Call Number DAG @ dag @ FLS2010a Serial 1288  
 

 
Author Mathieu Nicolas Delalandre; Ernest Valveny; Tony Pridmore; Dimosthenis Karatzas
  Title Generation of Synthetic Documents for Performance Evaluation of Symbol Recognition & Spotting Systems Type Journal Article
  Year 2010 Publication International Journal on Document Analysis and Recognition Abbreviated Journal IJDAR  
  Volume 13 Issue 3 Pages 187-207  
  Keywords  
  Abstract This paper deals with the topic of performance evaluation of symbol recognition & spotting systems. We propose here a new approach to the generation of synthetic graphics documents containing non-isolated symbols in a real context. This approach is based on the definition of a set of constraints that permit us to place the symbols on a pre-defined background according to the properties of a particular domain (architecture, electronics, engineering, etc.). In this way, we can obtain a large number of images resembling real documents by simply defining the set of constraints and providing a few pre-defined backgrounds. As documents are synthetically generated, the groundtruth (the location and the label of every symbol) becomes automatically available. We have applied this approach to the generation of a large database of architectural drawings and electronic diagrams, which shows the flexibility of the system. Performance evaluation experiments of a symbol localization system show that our approach makes it possible to generate documents with different features, which are reflected in the variation of localization results.  
  Address  
  Corporate Author Thesis  
  Publisher Springer-Verlag Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1433-2833 ISBN Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number DAG @ dag @ DVP2010 Serial 1289  