toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author (up) Alicia Fornes; Josep Llados; Oriol Ramos Terrades; Marçal Rusiñol edit   pdf
openurl 
  Title La Visió per Computador com a Eina per a la Interpretació Automàtica de Fonts Documentals Type Journal
  Year 2016 Publication Lligall, Revista Catalana d'Arxivística Abbreviated Journal  
  Volume 39 Issue Pages 20-46  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.097 Approved no  
  Call Number Admin @ si @ FLR2016 Serial 2897  
Permanent link to this record
 

 
Author (up) Alicia Fornes; Sergio Escalera; Josep Llados; Ernest Valveny edit  url
doi  isbn
openurl 
  Title Symbol Classification using Dynamic Aligned Shape Descriptor Type Conference Article
  Year 2010 Publication 20th International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 1957–1960  
  Keywords  
  Abstract Shape representation is a difficult task because of several symbol distortions, such as occlusions, elastic deformations, gaps or noise. In this paper, we propose a new descriptor and distance computation for coping with the problem of symbol recognition in the domain of Graphical Document Image Analysis. The proposed D-Shape descriptor encodes the arrangement information of object parts in a circular structure, allowing different levels of distortion. The classification is performed using a cyclic Dynamic Time Warping based method, allowing distortions and rotation. The methodology has been validated on different data sets, showing very high recognition rates.  
  Address Istanbul (Turkey)  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1051-4651 ISBN 978-1-4244-7542-1 Medium  
  Area Expedition Conference ICPR  
  Notes DAG; HUPBA; MILAB Approved no  
  Call Number BCNPCL @ bcnpcl @ FEL2010 Serial 1421  
Permanent link to this record
 

 
Author (up) Alicia Fornes; Sergio Escalera; Josep Llados; Gemma Sanchez edit  openurl
  Title Symbol Recognition by Multi-class Blurred Shape Models Type Conference Article
  Year 2007 Publication Seventh IAPR International Workshop on Graphics Recognition Abbreviated Journal  
  Volume Issue Pages 11–13  
  Keywords  
  Abstract  
  Address Curitiba (Brazil)  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference GREC  
  Notes DAG; MILAB; HUPBA Approved no  
  Call Number BCNPCL @ bcnpcl @ FEL2007b Serial 910  
Permanent link to this record
 

 
Author (up) Alicia Fornes; Sergio Escalera; Josep Llados; Gemma Sanchez; Joan Mas edit  openurl
  Title Hand Drawn Symbol Recognition by Blurred Shape Model Descriptor and a Multiclass Classifier Type Book Chapter
  Year 2008 Publication Graphics Recognition: Recent Advances and New Opportunities Abbreviated Journal  
  Volume 5046 Issue Pages 30–40  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor W. Liu, J. Llados, J.M. Ogier  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; HUPBA; MILAB Approved no  
  Call Number BCNPCL @ bcnpcl @ FEL2008 Serial 989  
Permanent link to this record
 

 
Author (up) Alicia Fornes; Sergio Escalera; Josep Llados; Gemma Sanchez; Petia Radeva; Oriol Pujol edit  openurl
  Title Handwritten Symbol Recognition by a Boosted Blurred Shape Model with Error Correction Type Book Chapter
  Year 2007 Publication 3rd Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA 2007), J. Marti et al. (Eds.) LNCS 4477:13–21 Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address Girona (Spain)  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB;DAG;HUPBA Approved no  
  Call Number BCNPCL @ bcnpcl @ FEL2007a Serial 775  
Permanent link to this record
 

 
Author (up) Alicia Fornes; V.C.Kieu; M. Visani; N.Journet; Anjan Dutta edit  doi
isbn  openurl
  Title The ICDAR/GREC 2013 Music Scores Competition: Staff Removal Type Book Chapter
  Year 2014 Publication Graphics Recognition. Current Trends and Challenges Abbreviated Journal  
  Volume 8746 Issue Pages 207-220  
  Keywords Competition; Graphics recognition; Music scores; Writer identification; Staff removal  
  Abstract The first competition on music scores that was organized at ICDAR and GREC in 2011 awoke the interest of researchers, who participated in both staff removal and writer identification tasks. In this second edition, we focus on the staff removal task and simulate a real case scenario concerning old and degraded music scores. For this purpose, we have generated a new set of semi-synthetic images using two degradation models that we previously introduced: local noise and 3D distortions. In this extended paper we provide an extended description of the dataset, degradation models, evaluation metrics, the participant’s methods and the obtained results that could not be presented at ICDAR and GREC proceedings due to page limitations.  
  Address  
  Corporate Author Thesis  
  Publisher Springer Berlin Heidelberg Place of Publication Editor B.Lamiroy; J.-M. Ogier  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN 0302-9743 ISBN 978-3-662-44853-3 Medium  
  Area Expedition Conference  
  Notes DAG; 600.077; 600.061 Approved no  
  Call Number Admin @ si @ FKV2014 Serial 2581  
Permanent link to this record
 

 
Author (up) Alicia Fornes; Veronica Romero; Arnau Baro; Juan Ignacio Toledo; Joan Andreu Sanchez; Enrique Vidal; Josep Llados edit   pdf
openurl 
  Title ICDAR2017 Competition on Information Extraction in Historical Handwritten Records Type Conference Article
  Year 2017 Publication 14th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages 1389-1394  
  Keywords  
  Abstract The extraction of relevant information from historical handwritten document collections is one of the key steps in order to make these manuscripts available for access and searches. In this competition, the goal is to detect the named entities and assign each of them a semantic category, and therefore, to simulate the filling in of a knowledge database. This paper describes the dataset, the tasks, the evaluation metrics, the participants methods and the results.  
  Address Kyoto; Japan; November 2017  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.097; 601.225; 600.121 Approved no  
  Call Number Admin @ si @ FRB2017 Serial 3052  
Permanent link to this record
 

 
Author (up) Alicia Fornes; Volkmar Frinken; Andreas Fischer; Jon Almazan; G. Jackson; Horst Bunke edit  doi
isbn  openurl
  Title A Keyword Spotting Approach Using Blurred Shape Model-Based Descriptors Type Conference Article
  Year 2011 Publication Proceedings of the 2011 Workshop on Historical Document Imaging and Processing Abbreviated Journal  
  Volume Issue Pages 83-90  
  Keywords  
  Abstract The automatic processing of handwritten historical documents is considered a hard problem in pattern recognition. In addition to the challenges given by modern handwritten data, a lack of training data as well as effects caused by the degradation of documents can be observed. In this scenario, keyword spotting arises to be a viable solution to make documents amenable for searching and browsing. For this task we propose the adaptation of shape descriptors used in symbol recognition. By treating each word image as a shape, it can be represented using the Blurred Shape Model and the De-formable Blurred Shape Model. Experiments on the George Washington database demonstrate that this approach is able to outperform the commonly used Dynamic Time Warping approach.  
  Address  
  Corporate Author Thesis  
  Publisher ACM Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-1-4503-0916-5 Medium  
  Area Expedition Conference HIP  
  Notes DAG Approved no  
  Call Number Admin @ si @ FFF2011a Serial 1823  
Permanent link to this record
 

 
Author (up) Alicia Fornes; Xavier Otazu; Josep Llados edit   pdf
doi  openurl
  Title Show through cancellation and image enhancement by multiresolution contrast processing Type Conference Article
  Year 2013 Publication 12th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages 200-204  
  Keywords  
  Abstract Historical documents suffer from different types of degradation and noise such as background variation, uneven illumination or dark spots. In case of double-sided documents, another common problem is that the back side of the document usually interferes with the front side because of the transparency of the document or ink bleeding. This effect is called the show through phenomenon. Many methods are developed to solve these problems, and in the case of show-through, by scanning and matching both the front and back sides of the document. In contrast, our approach is designed to use only one side of the scanned document. We hypothesize that show-trough are low contrast components, while foreground components are high contrast ones. A Multiresolution Contrast (MC) decomposition is presented in order to estimate the contrast of features at different spatial scales. We cancel the show-through phenomenon by thresholding these low contrast components. This decomposition is also able to enhance the image removing shadowed areas by weighting spatial scales. Results show that the enhanced images improve the readability of the documents, allowing scholars both to recover unreadable words and to solve ambiguities.  
  Address Washington; USA; August 2013  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1520-5363 ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 602.006; 600.045; 600.061; 600.052;CIC Approved no  
  Call Number Admin @ si @ FOL2013 Serial 2241  
Permanent link to this record
 

 
Author (up) Alina Matei; Andreea Glavan; Petia Radeva; Estefania Talavera edit  url
doi  openurl
  Title Towards Eating Habits Discovery in Egocentric Photo-Streams Type Journal Article
  Year 2021 Publication IEEE Access Abbreviated Journal ACCESS  
  Volume 9 Issue Pages 17495-17506  
  Keywords  
  Abstract Eating habits are learned throughout the early stages of our lives. However, it is not easy to be aware of how our food-related routine affects our healthy living. In this work, we address the unsupervised discovery of nutritional habits from egocentric photo-streams. We build a food-related behavioral pattern discovery model, which discloses nutritional routines from the activities performed throughout the days. To do so, we rely on Dynamic-Time-Warping for the evaluation of similarity among the collected days. Within this framework, we present a simple, but robust and fast novel classification pipeline that outperforms the state-of-the-art on food-related image classification with a weighted accuracy and F-score of 70% and 63%, respectively. Later, we identify days composed of nutritional activities that do not describe the habits of the person as anomalies in the daily life of the user with the Isolation Forest method. Furthermore, we show an application for the identification of food-related scenes when the camera wearer eats in isolation. Results have shown the good performance of the proposed model and its relevance to visualize the nutritional habits of individuals.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; no proj Approved no  
  Call Number Admin @ si @ MGR2021 Serial 3637  
Permanent link to this record
 

 
Author (up) Alloy Das; Sanket Biswas; Ayan Banerjee; Josep Llados; Umapada Pal; Saumik Bhattacharya edit   pdf
url  openurl
  Title Harnessing the Power of Multi-Lingual Datasets for Pre-training: Towards Enhancing Text Spotting Performance Type Conference Article
  Year 2024 Publication Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages 718-728  
  Keywords  
  Abstract The adaptation capability to a wide range of domains is crucial for scene text spotting models when deployed to real-world conditions. However, existing state-of-the-art (SOTA) approaches usually incorporate scene text detection and recognition simply by pretraining on natural scene text datasets, which do not directly exploit the intermediate feature representations between multiple domains. Here, we investigate the problem of domain-adaptive scene text spotting, i.e., training a model on multi-domain source data such that it can directly adapt to target domains rather than being specialized for a specific domain or scenario. Further, we investigate a transformer baseline called Swin-TESTR to focus on solving scene-text spotting for both regular and arbitrary-shaped scene text along with an exhaustive evaluation. The results clearly demonstrate the potential of intermediate representations to achieve significant performance on text spotting benchmarks across multiple domains (e.g. language, synth-to-real, and documents). both in terms of accuracy and efficiency.  
  Address Waikoloa; Hawai; USA; January 2024  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes DAG Approved no  
  Call Number Admin @ si @ DBB2024 Serial 3986  
Permanent link to this record
 

 
Author (up) Alloy Das; Sanket Biswas; Umapada Pal; Josep Llados edit   pdf
url  openurl
  Title Diving into the Depths of Spotting Text in Multi-Domain Noisy Scenes Type Conference Article
  Year 2024 Publication IEEE International Conference on Robotics and Automation in PACIFICO Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract When used in a real-world noisy environment, the capacity to generalize to multiple domains is essential for any autonomous scene text spotting system. However, existing state-of-the-art methods employ pretraining and fine-tuning strategies on natural scene datasets, which do not exploit the feature interaction across other complex domains. In this work, we explore and investigate the problem of domain-agnostic scene text spotting, i.e., training a model on multi-domain source data such that it can directly generalize to target domains rather than being specialized for a specific domain or scenario. In this regard, we present the community a text spotting validation benchmark called Under-Water Text (UWT) for noisy underwater scenes to establish an important case study. Moreover, we also design an efficient super-resolution based end-to-end transformer baseline called DA-TextSpotter which achieves comparable or superior performance over existing text spotting architectures for both regular and arbitrary-shaped scene text spotting benchmarks in terms of both accuracy and model efficiency. The dataset, code and pre-trained models will be released upon acceptance.  
  Address Yokohama; Japan; May 2024  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICRA  
  Notes DAG Approved no  
  Call Number Admin @ si @ DBP2024 Serial 3979  
Permanent link to this record
 

 
Author (up) Alvaro Cepero; Albert Clapes; Sergio Escalera edit   pdf
doi  openurl
  Title Automatic non-verbal communication skills analysis: a quantitative evaluation Type Journal Article
  Year 2015 Publication AI Communications Abbreviated Journal AIC  
  Volume 28 Issue 1 Pages 87-101  
  Keywords Social signal processing; human behavior analysis; multi-modal data description; multi-modal data fusion; non-verbal communication analysis; e-Learning  
  Abstract The oral communication competence is defined on the top of the most relevant skills for one's professional and personal life. Because of the importance of communication in our activities of daily living, it is crucial to study methods to evaluate and provide the necessary feedback that can be used in order to improve these communication capabilities and, therefore, learn how to express ourselves better. In this work, we propose a system capable of evaluating quantitatively the quality of oral presentations in an automatic fashion. The system is based on a multi-modal RGB, depth, and audio data description and a fusion approach in order to recognize behavioral cues and train classifiers able to eventually predict communication quality levels. The performance of the proposed system is tested on a novel dataset containing Bachelor thesis' real defenses, presentations from an 8th semester Bachelor courses, and Master courses' presentations at Universitat de Barcelona. Using as groundtruth the marks assigned by actual instructors, our system achieves high performance categorizing and ranking presentations by their quality, and also making real-valued mark predictions.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 0921-7126 ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA;MILAB Approved no  
  Call Number Admin @ si @ CCE2015 Serial 2549  
Permanent link to this record
 

 
Author (up) Alvaro Cepero; Albert Clapes; Sergio Escalera edit   pdf
openurl 
  Title Quantitative analysis of non-verbal communication for competence analysis Type Conference Article
  Year 2013 Publication 16th Catalan Conference on Artificial Intelligence Abbreviated Journal  
  Volume 256 Issue Pages 105-114  
  Keywords  
  Abstract  
  Address Vic; October 2013  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CCIA  
  Notes HUPBA;MILAB Approved no  
  Call Number Admin @ si @ CCE2013 Serial 2324  
Permanent link to this record
 

 
Author (up) Alvaro Peris; Marc Bolaños; Petia Radeva; Francisco Casacuberta edit   pdf
openurl 
  Title Video Description Using Bidirectional Recurrent Neural Networks Type Conference Article
  Year 2016 Publication 25th International Conference on Artificial Neural Networks Abbreviated Journal  
  Volume 2 Issue Pages 3-11  
  Keywords Video description; Neural Machine Translation; Birectional Recurrent Neural Networks; LSTM; Convolutional Neural Networks  
  Abstract Although traditionally used in the machine translation field, the encoder-decoder framework has been recently applied for the generation of video and image descriptions. The combination of Convolutional and Recurrent Neural Networks in these models has proven to outperform the previous state of the art, obtaining more accurate video descriptions. In this work we propose pushing further this model by introducing two contributions into the encoding stage. First, producing richer image representations by combining object and location information from Convolutional Neural Networks and second, introducing Bidirectional Recurrent Neural Networks for capturing both forward and backward temporal relationships in the input frames.  
  Address Barcelona; September 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICANN  
  Notes MILAB; Approved no  
  Call Number Admin @ si @ PBR2016 Serial 2833  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: