toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author Arnau Baro; Pau Riba; Jorge Calvo-Zaragoza; Alicia Fornes edit  url
openurl 
  Title From Optical Music Recognition to Handwritten Music Recognition: a Baseline Type Journal Article
  Year 2019 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 123 Issue Pages 1-8  
  Keywords  
  Abstract Optical Music Recognition (OMR) is the branch of document image analysis that aims to convert images of musical scores into a computer-readable format. Despite decades of research, the recognition of handwritten music scores, concretely the Western notation, is still an open problem, and the few existing works only focus on a specific stage of OMR. In this work, we propose a full Handwritten Music Recognition (HMR) system based on Convolutional Recurrent Neural Networks, data augmentation and transfer learning, that can serve as a baseline for the research community.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference (up)  
  Notes DAG; 600.097; 601.302; 601.330; 600.140; 600.121 Approved no  
  Call Number Admin @ si @ BRC2019 Serial 3275  
Permanent link to this record
 

 
Author Marçal Rusiñol edit  url
openurl 
  Title Classificació semàntica i visual de documents digitals Type Journal
  Year 2019 Publication Revista de biblioteconomia i documentacio Abbreviated Journal  
  Volume Issue Pages 75-86  
  Keywords  
  Abstract Se analizan los sistemas de procesamiento automático que trabajan sobre documentos digitalizados con el objetivo de describir los contenidos. De esta forma contribuyen a facilitar el acceso, permitir la indización automática y hacer accesibles los documentos a los motores de búsqueda. El objetivo de estas tecnologías es poder entrenar modelos computacionales que sean capaces de clasificar, agrupar o realizar búsquedas sobre documentos digitales. Así, se describen las tareas de clasificación, agrupamiento y búsqueda. Cuando utilizamos tecnologías de inteligencia artificial en los sistemas de
clasificación esperamos que la herramienta nos devuelva etiquetas semánticas; en sistemas de agrupamiento que nos devuelva documentos agrupados en clusters significativos; y en sistemas de búsqueda esperamos que dada una consulta, nos devuelva una lista ordenada de documentos en función de la relevancia. A continuación se da una visión de conjunto de los métodos que nos permiten describir los documentos digitales, tanto de manera visual (cuál es su apariencia), como a partir de sus contenidos semánticos (de qué hablan). En cuanto a la descripción visual de documentos se aborda el estado de la cuestión de las representaciones numéricas de documentos digitalizados
tanto por métodos clásicos como por métodos basados en el aprendizaje profundo (deep learning). Respecto de la descripción semántica de los contenidos se analizan técnicas como el reconocimiento óptico de caracteres (OCR); el cálculo de estadísticas básicas sobre la aparición de las diferentes palabras en un texto (bag-of-words model); y los métodos basados en aprendizaje profundo como el método word2vec, basado en una red neuronal que, dadas unas cuantas palabras de un texto, debe predecir cuál será la
siguiente palabra. Desde el campo de las ingenierías se están transfiriendo conocimientos que se han integrado en productos o servicios en los ámbitos de la archivística, la biblioteconomía, la documentación y las plataformas de gran consumo, sin embargo los algoritmos deben ser lo suficientemente eficientes no sólo para el reconocimiento y transcripción literal sino también para la capacidad de interpretación de los contenidos.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference (up)  
  Notes DAG; 600.084; 600.135; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ Rus2019 Serial 3282  
Permanent link to this record
 

 
Author Dena Bazazian; Raul Gomez; Anguelos Nicolaou; Lluis Gomez; Dimosthenis Karatzas; Andrew Bagdanov edit   pdf
url  openurl
  Title Fast: Facilitated and accurate scene text proposals through fcn guided pruning Type Journal Article
  Year 2019 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 119 Issue Pages 112-120  
  Keywords  
  Abstract Class-specific text proposal algorithms can efficiently reduce the search space for possible text object locations in an image. In this paper we combine the Text Proposals algorithm with Fully Convolutional Networks to efficiently reduce the number of proposals while maintaining the same recall level and thus gaining a significant speed up. Our experiments demonstrate that such text proposal approaches yield significantly higher recall rates than state-of-the-art text localization techniques, while also producing better-quality localizations. Our results on the ICDAR 2015 Robust Reading Competition (Challenge 4) and the COCO-text datasets show that, when combined with strong word classifiers, this recall margin leads to state-of-the-art results in end-to-end scene text recognition.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference (up)  
  Notes DAG; 600.084; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ BGN2019 Serial 3342  
Permanent link to this record
 

 
Author Lei Kang; Pau Riba; Mauricio Villegas; Alicia Fornes; Marçal Rusiñol edit   pdf
url  openurl
  Title Candidate Fusion: Integrating Language Modelling into a Sequence-to-Sequence Handwritten Word Recognition Architecture Type Journal Article
  Year 2021 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 112 Issue Pages 107790  
  Keywords  
  Abstract Sequence-to-sequence models have recently become very popular for tackling
handwritten word recognition problems. However, how to effectively integrate an external language model into such recognizer is still a challenging
problem. The main challenge faced when training a language model is to
deal with the language model corpus which is usually different to the one
used for training the handwritten word recognition system. Thus, the bias
between both word corpora leads to incorrectness on the transcriptions, providing similar or even worse performances on the recognition task. In this
work, we introduce Candidate Fusion, a novel way to integrate an external
language model to a sequence-to-sequence architecture. Moreover, it provides suggestions from an external language knowledge, as a new input to
the sequence-to-sequence recognizer. Hence, Candidate Fusion provides two
improvements. On the one hand, the sequence-to-sequence recognizer has
the flexibility not only to combine the information from itself and the language model, but also to choose the importance of the information provided
by the language model. On the other hand, the external language model
has the ability to adapt itself to the training corpus and even learn the
most commonly errors produced from the recognizer. Finally, by conducting
comprehensive experiments, the Candidate Fusion proves to outperform the
state-of-the-art language models for handwritten word recognition tasks.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference (up)  
  Notes DAG; 600.140; 601.302; 601.312; 600.121 Approved no  
  Call Number Admin @ si @ KRV2021 Serial 3343  
Permanent link to this record
 

 
Author Beata Megyesi; Bernhard Esslinger; Alicia Fornes; Nils Kopal; Benedek Lang; George Lasry; Karl de Leeuw; Eva Pettersson; Arno Wacker; Michelle Waldispuhl edit  url
openurl 
  Title Decryption of historical manuscripts: the DECRYPT project Type Journal Article
  Year 2020 Publication Cryptologia Abbreviated Journal CRYPT  
  Volume 44 Issue 6 Pages 545-559  
  Keywords automatic decryption; cipher collection; historical cryptology; image transcription  
  Abstract Many historians and linguists are working individually and in an uncoordinated fashion on the identification and decryption of historical ciphers. This is a time-consuming process as they often work without access to automatic methods and processes that can accelerate the decipherment. At the same time, computer scientists and cryptologists are developing algorithms to decrypt various cipher types without having access to a large number of original ciphertexts. In this paper, we describe the DECRYPT project aiming at the creation of resources and tools for historical cryptology by bringing the expertise of various disciplines together for collecting data, exchanging methods for faster progress to transcribe, decrypt and contextualize historical encrypted manuscripts. We present our goals and work-in progress of a general approach for analyzing historical encrypted manuscripts using standardized methods and a new set of state-of-the-art tools. We release the data and tools as open-source hoping that all mentioned disciplines would benefit and contribute to the research infrastructure of historical cryptology.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference (up)  
  Notes DAG; 600.140; 600.121 Approved no  
  Call Number Admin @ si @ MEF2020 Serial 3347  
Permanent link to this record
 

 
Author Anjan Dutta; Pau Riba; Josep Llados; Alicia Fornes edit   pdf
url  openurl
  Title Hierarchical Stochastic Graphlet Embedding for Graph-based Pattern Recognition Type Journal Article
  Year 2020 Publication Neural Computing and Applications Abbreviated Journal NEUCOMA  
  Volume 32 Issue Pages 11579–11596  
  Keywords  
  Abstract Despite being very successful within the pattern recognition and machine learning community, graph-based methods are often unusable because of the lack of mathematical operations defined in graph domain. Graph embedding, which maps graphs to a vectorial space, has been proposed as a way to tackle these difficulties enabling the use of standard machine learning techniques. However, it is well known that graph embedding functions usually suffer from the loss of structural information. In this paper, we consider the hierarchical structure of a graph as a way to mitigate this loss of information. The hierarchical structure is constructed by topologically clustering the graph nodes and considering each cluster as a node in the upper hierarchical level. Once this hierarchical structure is constructed, we consider several configurations to define the mapping into a vector space given a classical graph embedding, in particular, we propose to make use of the stochastic graphlet embedding (SGE). Broadly speaking, SGE produces a distribution of uniformly sampled low-to-high-order graphlets as a way to embed graphs into the vector space. In what follows, the coarse-to-fine structure of a graph hierarchy and the statistics fetched by the SGE complements each other and includes important structural information with varied contexts. Altogether, these two techniques substantially cope with the usual information loss involved in graph embedding techniques, obtaining a more robust graph representation. This fact has been corroborated through a detailed experimental evaluation on various benchmark graph datasets, where we outperform the state-of-the-art methods.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference (up)  
  Notes DAG; 600.140; 600.121; 600.141 Approved no  
  Call Number Admin @ si @ DRL2020 Serial 3348  
Permanent link to this record
 

 
Author Pau Riba; Josep Llados; Alicia Fornes edit  url
openurl 
  Title Hierarchical graphs for coarse-to-fine error tolerant matching Type Journal Article
  Year 2020 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 134 Issue Pages 116-124  
  Keywords Hierarchical graph representation; Coarse-to-fine graph matching; Graph-based retrieval  
  Abstract During the last years, graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their ability to capture both structural and appearance-based information. Thus, they provide a greater representational power than classical statistical frameworks. However, graph-based representations leads to high computational complexities usually dealt by graph embeddings or approximated matching techniques. Despite their representational power, they are very sensitive to noise and small variations of the input image. With the aim to cope with the time complexity and the variability present in the generated graphs, in this paper we propose to construct a novel hierarchical graph representation. Graph clustering techniques adapted from social media analysis have been used in order to contract a graph at different abstraction levels while keeping information about the topology. Abstract nodes attributes summarise information about the contracted graph partition. For the proposed representations, a coarse-to-fine matching technique is defined. Hence, small graphs are used as a filtering before more accurate matching methods are applied. This approach has been validated in real scenarios such as classification of colour images or retrieval of handwritten words (i.e. word spotting).  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference (up)  
  Notes DAG; 600.097; 601.302; 603.057; 600.140; 600.121 Approved no  
  Call Number Admin @ si @ RLF2020 Serial 3349  
Permanent link to this record
 

 
Author Alicia Fornes; Josep Llados; Joana Maria Pujadas-Mora edit  url
isbn  openurl
  Title Browsing of the Social Network of the Past: Information Extraction from Population Manuscript Images Type Book Chapter
  Year 2020 Publication Handwritten Historical Document Analysis, Recognition, and Retrieval – State of the Art and Future Trends Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher World Scientific Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-981-120-323-7 Medium  
  Area Expedition Conference (up)  
  Notes DAG; 600.140; 600.121 Approved no  
  Call Number Admin @ si @ FLP2020 Serial 3350  
Permanent link to this record
 

 
Author Joana Maria Pujadas-Mora; Alicia Fornes; Josep Llados; Gabriel Brea-Martinez; Miquel Valls-Figols edit  url
doi  isbn
openurl 
  Title The Baix Llobregat (BALL) Demographic Database, between Historical Demography and Computer Vision (nineteenth–twentieth centuries Type Book Chapter
  Year 2019 Publication Nominative Data in Demographic Research in the East and the West: monograph Abbreviated Journal  
  Volume Issue Pages 29-61  
  Keywords  
  Abstract The Baix Llobregat (BALL) Demographic Database is an ongoing database project containing individual census data from the Catalan region of Baix Llobregat (Spain) during the nineteenth and twentieth centuries. The BALL Database is built within the project ‘NETWORKS: Technology and citizen innovation for building historical social networks to understand the demographic past’ directed by Alícia Fornés from the Center for Computer Vision and Joana Maria Pujadas-Mora from the Center for Demographic Studies, both at the Universitat Autònoma de Barcelona, funded by the Recercaixa program (2017–2019).
Its webpage is http://dag.cvc.uab.es/xarxes/.The aim of the project is to develop technologies facilitating massive digitalization of demographic sources, and more specifically the padrones (local censuses), in order to reconstruct historical ‘social’ networks employing computer vision technology. Such virtual networks can be created thanks to the linkage of nominative records compiled in the local censuses across time and space. Thus, digitized versions of individual and family lifespans are established, and individuals and families can be located spatially.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-5-7996-2656-3 Medium  
  Area Expedition Conference (up)  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ PFL2019 Serial 3351  
Permanent link to this record
 

 
Author Arka Ujal Dey; Suman Ghosh; Ernest Valveny; Gaurav Harit edit   pdf
url  doi
openurl 
  Title Beyond Visual Semantics: Exploring the Role of Scene Text in Image Understanding Type Journal Article
  Year 2021 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 149 Issue Pages 164-171  
  Keywords  
  Abstract Images with visual and scene text content are ubiquitous in everyday life. However, current image interpretation systems are mostly limited to using only the visual features, neglecting to leverage the scene text content. In this paper, we propose to jointly use scene text and visual channels for robust semantic interpretation of images. We do not only extract and encode visual and scene text cues, but also model their interplay to generate a contextual joint embedding with richer semantics. The contextual embedding thus generated is applied to retrieval and classification tasks on multimedia images, with scene text content, to demonstrate its effectiveness. In the retrieval framework, we augment our learned text-visual semantic representation with scene text cues, to mitigate vocabulary misses that may have occurred during the semantic embedding. To deal with irrelevant or erroneous recognition of scene text, we also apply query-based attention to our text channel. We show how the multi-channel approach, involving visual semantics and scene text, improves upon state of the art.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference (up)  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ DGV2021 Serial 3364  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: