toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author Juan Ignacio Toledo edit  isbn
openurl 
  Title Information Extraction from Heterogeneous Handwritten Documents Type Book Whole
  Year (down) 2019 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract In this thesis we explore information Extraction from totally or partially handwritten documents. Basically we are dealing with two different application scenarios. The first scenario are modern highly structured documents like forms. In this kind of documents, the semantic information is encoded in different fields with a pre-defined location in the document, therefore, information extraction becomes roughly equivalent to transcription. The second application scenario are loosely structured totally handwritten documents, besides transcribing them, we need to assign a semantic label, from a set of known values to the handwritten words.
In both scenarios, transcription is an important part of the information extraction. For that reason in this thesis we present two methods based on Neural Networks, to transcribe handwritten text.In order to tackle the challenge of loosely structured documents, we have produced a benchmark, consisting of a dataset, a defined set of tasks and a metric, that was presented to the community as an international competition. Also, we propose different models based on Convolutional and Recurrent neural networks that are able to transcribe and assign different semantic labels to each handwritten words, that is, able to perform Information Extraction.
 
  Address July 2019  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Alicia Fornes;Josep Llados  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-948531-7-3 Medium  
  Area Expedition Conference  
  Notes DAG; 600.140; 600.121 Approved no  
  Call Number Admin @ si @ Tol2019 Serial 3389  
Permanent link to this record
 

 
Author Juan Ignacio Toledo; Manuel Carbonell; Alicia Fornes; Josep Llados edit  url
openurl 
  Title Information Extraction from Historical Handwritten Document Images with a Context-aware Neural Model Type Journal Article
  Year (down) 2019 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 86 Issue Pages 27-36  
  Keywords Document image analysis; Handwritten documents; Named entity recognition; Deep neural networks  
  Abstract Many historical manuscripts that hold trustworthy memories of the past societies contain information organized in a structured layout (e.g. census, birth or marriage records). The precious information stored in these documents cannot be effectively used nor accessed without costly annotation efforts. The transcription driven by the semantic categories of words is crucial for the subsequent access. In this paper we describe an approach to extract information from structured historical handwritten text images and build a knowledge representation for the extraction of meaning out of historical data. The method extracts information, such as named entities, without the need of an intermediate transcription step, thanks to the incorporation of context information through language models. Our system has two variants, the first one is based on bigrams, whereas the second one is based on recurrent neural networks. Concretely, our second architecture integrates a Convolutional Neural Network to model visual information from word images together with a Bidirecitonal Long Short Term Memory network to model the relation among the words. This integrated sequential approach is able to extract more information than just the semantic category (e.g. a semantic category can be associated to a person in a record). Our system is generic, it deals with out-of-vocabulary words by design, and it can be applied to structured handwritten texts from different domains. The method has been validated with the ICDAR IEHHR competition protocol, outperforming the existing approaches.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.097; 601.311; 603.057; 600.084; 600.140; 600.121 Approved no  
  Call Number Admin @ si @ TCF2019 Serial 3166  
Permanent link to this record
 

 
Author Katerine Diaz; Jesus Martinez del Rincon; Marçal Rusiñol; Aura Hernandez-Sabate edit   pdf
doi  openurl
  Title Feature Extraction by Using Dual-Generalized Discriminative Common Vectors Type Journal Article
  Year (down) 2019 Publication Journal of Mathematical Imaging and Vision Abbreviated Journal JMIV  
  Volume 61 Issue 3 Pages 331-351  
  Keywords Online feature extraction; Generalized discriminative common vectors; Dual learning; Incremental learning; Decremental learning  
  Abstract In this paper, a dual online subspace-based learning method called dual-generalized discriminative common vectors (Dual-GDCV) is presented. The method extends incremental GDCV by exploiting simultaneously both the concepts of incremental and decremental learning for supervised feature extraction and classification. Our methodology is able to update the feature representation space without recalculating the full projection or accessing the previously processed training data. It allows both adding information and removing unnecessary data from a knowledge base in an efficient way, while retaining the previously acquired knowledge. The proposed method has been theoretically proved and empirically validated in six standard face recognition and classification datasets, under two scenarios: (1) removing and adding samples of existent classes, and (2) removing and adding new classes to a classification problem. Results show a considerable computational gain without compromising the accuracy of the model in comparison with both batch methodologies and other state-of-art adaptive methods.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; ADAS; 600.084; 600.118; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ DRR2019 Serial 3172  
Permanent link to this record
 

 
Author Lasse Martensson; Ekta Vats; Anders Hast; Alicia Fornes edit  url
openurl 
  Title In Search of the Scribe: Letter Spotting as a Tool for Identifying Scribes in Large Handwritten Text Corpora Type Journal
  Year (down) 2019 Publication Journal for Information Technology Studies as a Human Science Abbreviated Journal HUMAN IT  
  Volume 14 Issue 2 Pages 95-120  
  Keywords Scribal attribution/ writer identification; digital palaeography; word spotting; mediaeval charters; mediaeval manuscripts  
  Abstract In this article, a form of the so-called word spotting-method is used on a large set of handwritten documents in order to identify those that contain script of similar execution. The point of departure for the investigation is the mediaeval Swedish manuscript Cod. Holm. D 3. The main scribe of this manuscript has yet not been identified in other documents. The current attempt aims at localising other documents that display a large degree of similarity in the characteristics of the script, these being possible candidates for being executed by the same hand. For this purpose, the method of word spotting has been employed, focusing on individual letters, and therefore the process is referred to as letter spotting in the article. In this process, a set of ‘g’:s, ‘h’:s and ‘k’:s have been selected as templates, and then a search has been made for close matches among the mediaeval Swedish charters. The search resulted in a number of charters that displayed great similarities with the manuscript D 3. The used letter spotting method thus proofed to be a very efficient sorting tool localising similar script samples.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.097; 600.140; 600.121 Approved no  
  Call Number Admin @ si @ MVH2019 Serial 3234  
Permanent link to this record
 

 
Author Manuel Carbonell; Joan Mas; Mauricio Villegas; Alicia Fornes; Josep Llados edit   pdf
url  doi
openurl 
  Title End-to-End Handwritten Text Detection and Transcription in Full Pages Type Conference Article
  Year (down) 2019 Publication 2nd International Workshop on Machine Learning Abbreviated Journal  
  Volume 5 Issue Pages 29-34  
  Keywords Handwritten Text Recognition; Layout Analysis; Text segmentation; Deep Neural Networks; Multi-task learning  
  Abstract When transcribing handwritten document images, inaccuracies in the text segmentation step often cause errors in the subsequent transcription step. For this reason, some recent methods propose to perform the recognition at paragraph level. But still, errors in the segmentation of paragraphs can affect
the transcription performance. In this work, we propose an end-to-end framework to transcribe full pages. The joint text detection and transcription allows to remove the layout analysis requirement at test time. The experimental results show that our approach can achieve comparable results to models that assume
segmented paragraphs, and suggest that joining the two tasks brings an improvement over doing the two tasks separately.
 
  Address Sydney; Australia; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR WML  
  Notes DAG; 600.140; 601.311; 600.140 Approved no  
  Call Number Admin @ si @ CMV2019 Serial 3353  
Permanent link to this record
 

 
Author Marçal Rusiñol edit  url
openurl 
  Title Classificació semàntica i visual de documents digitals Type Journal
  Year (down) 2019 Publication Revista de biblioteconomia i documentacio Abbreviated Journal  
  Volume Issue Pages 75-86  
  Keywords  
  Abstract Se analizan los sistemas de procesamiento automático que trabajan sobre documentos digitalizados con el objetivo de describir los contenidos. De esta forma contribuyen a facilitar el acceso, permitir la indización automática y hacer accesibles los documentos a los motores de búsqueda. El objetivo de estas tecnologías es poder entrenar modelos computacionales que sean capaces de clasificar, agrupar o realizar búsquedas sobre documentos digitales. Así, se describen las tareas de clasificación, agrupamiento y búsqueda. Cuando utilizamos tecnologías de inteligencia artificial en los sistemas de
clasificación esperamos que la herramienta nos devuelva etiquetas semánticas; en sistemas de agrupamiento que nos devuelva documentos agrupados en clusters significativos; y en sistemas de búsqueda esperamos que dada una consulta, nos devuelva una lista ordenada de documentos en función de la relevancia. A continuación se da una visión de conjunto de los métodos que nos permiten describir los documentos digitales, tanto de manera visual (cuál es su apariencia), como a partir de sus contenidos semánticos (de qué hablan). En cuanto a la descripción visual de documentos se aborda el estado de la cuestión de las representaciones numéricas de documentos digitalizados
tanto por métodos clásicos como por métodos basados en el aprendizaje profundo (deep learning). Respecto de la descripción semántica de los contenidos se analizan técnicas como el reconocimiento óptico de caracteres (OCR); el cálculo de estadísticas básicas sobre la aparición de las diferentes palabras en un texto (bag-of-words model); y los métodos basados en aprendizaje profundo como el método word2vec, basado en una red neuronal que, dadas unas cuantas palabras de un texto, debe predecir cuál será la
siguiente palabra. Desde el campo de las ingenierías se están transfiriendo conocimientos que se han integrado en productos o servicios en los ámbitos de la archivística, la biblioteconomía, la documentación y las plataformas de gran consumo, sin embargo los algoritmos deben ser lo suficientemente eficientes no sólo para el reconocimiento y transcripción literal sino también para la capacidad de interpretación de los contenidos.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.084; 600.135; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ Rus2019 Serial 3282  
Permanent link to this record
 

 
Author Marçal Rusiñol; Lluis Gomez; A. Landman; M. Silva Constenla; Dimosthenis Karatzas edit   pdf
openurl 
  Title Automatic Structured Text Reading for License Plates and Utility Meters Type Conference Article
  Year (down) 2019 Publication BMVC Workshop on Visual Artificial Intelligence and Entrepreneurship Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Reading text in images has attracted interest from computer vision researchers for
many years. Our technology focuses on the extraction of structured text – such as serial
numbers, machine readings, product codes, etc. – so that it is able to center its attention just on the relevant textual elements. It is conceived to work in an end-to-end fashion, bypassing any explicit text segmentation stage. In this paper we present two different industrial use cases where we have applied our automatic structured text reading technology. In the first one, we demonstrate an outstanding performance when reading license plates compared to the current state of the art. In the second one, we present results on our solution for reading utility meters. The technology is commercialized by a recently created spin-off company, and both solutions are at different stages of integration with final clients.
 
  Address Cardiff; UK; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference BMVC-VAIE19  
  Notes DAG; 600.129 Approved no  
  Call Number Admin @ si @ RGL2019 Serial 3283  
Permanent link to this record
 

 
Author Mohammed Al Rawi; Ernest Valveny edit   pdf
url  doi
openurl 
  Title Compact and Efficient Multitask Learning in Vision, Language and Speech Type Conference Article
  Year (down) 2019 Publication IEEE International Conference on Computer Vision Workshops Abbreviated Journal  
  Volume Issue Pages 2933-2942  
  Keywords  
  Abstract Across-domain multitask learning is a challenging area of computer vision and machine learning due to the intra-similarities among class distributions. Addressing this problem to cope with the human cognition system by considering inter and intra-class categorization and recognition complicates the problem even further. We propose in this work an effective holistic and hierarchical learning by using a text embedding layer on top of a deep learning model. We also propose a novel sensory discriminator approach to resolve the collisions between different tasks and domains. We then train the model concurrently on textual sentiment analysis, speech recognition, image classification, action recognition from video, and handwriting word spotting of two different scripts (Arabic and English). The model we propose successfully learned different tasks across multiple domains.  
  Address Seul; Korea; October 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICCVW  
  Notes DAG; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ RaV2019 Serial 3365  
Permanent link to this record
 

 
Author Mohammed Al Rawi; Ernest Valveny; Dimosthenis Karatzas edit   pdf
url  doi
openurl 
  Title Can One Deep Learning Model Learn Script-Independent Multilingual Word-Spotting? Type Conference Article
  Year (down) 2019 Publication 15th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages 260-267  
  Keywords  
  Abstract Word spotting has gained increased attention lately as it can be used to extract textual information from handwritten documents and scene-text images. Current word spotting approaches are designed to work on a single language and/or script. Building intelligent models that learn script-independent multilingual word-spotting is challenging due to the large variability of multilingual alphabets and symbols. We used ResNet-152 and the Pyramidal Histogram of Characters (PHOC) embedding to build a one-model script-independent multilingual word-spotting and we tested it on Latin, Arabic, and Bangla (Indian) languages. The one-model we propose performs on par with the multi-model language-specific word-spotting system, and thus, reduces the number of models needed for each script and/or language.  
  Address Sydney; Australia; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.129; 600.121 Approved no  
  Call Number Admin @ si @ RVK2019 Serial 3337  
Permanent link to this record
 

 
Author Nibal Nayef; Yash Patel; Michal Busta; Pinaki Nath Chowdhury; Dimosthenis Karatzas; Wafa Khlif; Jiri Matas; Umapada Pal; Jean-Christophe Burie; Cheng-lin Liu; Jean-Marc Ogier edit   pdf
url  doi
openurl 
  Title ICDAR2019 Robust Reading Challenge on Multi-lingual Scene Text Detection and Recognition — RRC-MLT-2019 Type Conference Article
  Year (down) 2019 Publication 15th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages 1582-1587  
  Keywords  
  Abstract With the growing cosmopolitan culture of modern cities, the need of robust Multi-Lingual scene Text (MLT) detection and recognition systems has never been more immense. With the goal to systematically benchmark and push the state-of-the-art forward, the proposed competition builds on top of the RRC-MLT-2017 with an additional end-to-end task, an additional language in the real images dataset, a large scale multi-lingual synthetic dataset to assist the training, and a baseline End-to-End recognition method. The real dataset consists of 20,000 images containing text from 10 languages. The challenge has 4 tasks covering various aspects of multi-lingual scene text: (a) text detection, (b) cropped word script classification, (c) joint text detection and script classification and (d) end-to-end detection and recognition. In total, the competition received 60 submissions from the research and industrial communities. This paper presents the dataset, the tasks and the findings of the presented RRC-MLT-2019 challenge.  
  Address Sydney; Australia; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ NPB2019 Serial 3341  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: