toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author Alicia Fornes; Veronica Romero; Arnau Baro; Juan Ignacio Toledo; Joan Andreu Sanchez; Enrique Vidal; Josep Llados edit   pdf
openurl 
  Title ICDAR2017 Competition on Information Extraction in Historical Handwritten Records Type Conference Article
  Year 2017 Publication 14th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages 1389-1394  
  Keywords  
  Abstract (up) The extraction of relevant information from historical handwritten document collections is one of the key steps in order to make these manuscripts available for access and searches. In this competition, the goal is to detect the named entities and assign each of them a semantic category, and therefore, to simulate the filling in of a knowledge database. This paper describes the dataset, the tasks, the evaluation metrics, the participants methods and the results.  
  Address Kyoto; Japan; November 2017  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.097; 601.225; 600.121 Approved no  
  Call Number Admin @ si @ FRB2017 Serial 3052  
Permanent link to this record
 

 
Author Dimosthenis Karatzas; Lluis Gomez; Marçal Rusiñol edit   pdf
openurl 
  Title The Robust Reading Competition Annotation and Evaluation Platform Type Conference Article
  Year 2017 Publication 1st International Workshop on Open Services and Tools for Document Analysis Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (up) The ICDAR Robust Reading Competition (RRC), initiated in 2003 and re-established in 2011, has become the defacto evaluation standard for the international community. Concurrent with its second incarnation in 2011, a continuous effort started to develop an online framework to facilitate the hosting and management of competitions. This short paper briefly outlines the Robust Reading Competition Annotation and Evaluation Platform, the backbone of the Robust Reading Competition, comprising a collection of tools and processes that aim to simplify the management and annotation
of data, and to provide online and offline performance evaluation and analysis services
 
  Address Kyoto; Japan; November 2017  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR-OST  
  Notes DAG; 600.084; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ KGR2017 Serial 3063  
Permanent link to this record
 

 
Author Marco Pedersoli edit  openurl
  Title Hierarchical Multiresolution Models for fast Object Detection Type Book Whole
  Year 2012 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (up) The ability to automatically detect and recognize objects in unconstrained images is becoming more and more critical: from security systems and autonomous robots, to smart phones and augmented reality, intelligent devices need to understand the meaning of images as a composition of semantic objects. This Thesis tackles the problem of fast object detection based on template models. Detection consists of searching for an object in an image by evaluating the similarity between a template model and an image region at each possible location and scale. In this work, we show that using a template model representation based on a multiple resolution hierarchy is an optimal choice that can lead to excellent detection accuracy and fast computation. We implement two different approaches that make use of a hierarchy of multiresolution models: a multiresolution cascade and a coarse-to-fine search. Also, we extend the coarse-to-fine search by introducing a deformable part-based model that achieves state-of-the-art results together with a very reduced computational cost. Finally, we specialize our approach to the challenging task of pedestrian detection from moving vehicles and show that the overall quality of the system outperforms previous works in terms of speed and accuracy.  
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Jordi Gonzalez;Xavier Roca  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number Admin @ si @ Ped2012 Serial 2203  
Permanent link to this record
 

 
Author Patricia Suarez; Angel Sappa; Boris X. Vintimilla edit   pdf
doi  openurl
  Title Cross-Spectral Image Patch Similarity using Convolutional Neural Network Type Conference Article
  Year 2017 Publication IEEE International Workshop of Electronics, Control, Measurement, Signals and their application to Mechatronics Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (up) The ability to compare image regions (patches) has been the basis of many approaches to core computer vision problems, including object, texture and scene categorization. Hence, developing representations for image patches have been of interest in several works. The current work focuses on learning similarity between cross-spectral image patches with a 2 channel convolutional neural network (CNN) model. The proposed approach is an adaptation of a previous work, trying to obtain similar results than the state of the art but with a lowcost hardware. Hence, obtained results are compared with both
classical approaches, showing improvements, and a state of the art CNN based approach.
 
  Address San Sebastian; Spain; May 2017  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECMSM  
  Notes ADAS; 600.086; 600.118 Approved no  
  Call Number Admin @ si @ SSV2017a Serial 2916  
Permanent link to this record
 

 
Author Jiaolong Xu; Sebastian Ramos; David Vazquez; Antonio Lopez edit   pdf
doi  openurl
  Title Domain Adaptation of Deformable Part-Based Models Type Journal Article
  Year 2014 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI  
  Volume 36 Issue 12 Pages 2367-2380  
  Keywords Domain Adaptation; Pedestrian Detection  
  Abstract (up) The accuracy of object classifiers can significantly drop when the training data (source domain) and the application scenario (target domain) have inherent differences. Therefore, adapting the classifiers to the scenario in which they must operate is of paramount importance. We present novel domain adaptation (DA) methods for object detection. As proof of concept, we focus on adapting the state-of-the-art deformable part-based model (DPM) for pedestrian detection. We introduce an adaptive structural SVM (A-SSVM) that adapts a pre-learned classifier between different domains. By taking into account the inherent structure in feature space (e.g., the parts in a DPM), we propose a structure-aware A-SSVM (SA-SSVM). Neither A-SSVM nor SA-SSVM needs to revisit the source-domain training data to perform the adaptation. Rather, a low number of target-domain training examples (e.g., pedestrians) are used. To address the scenario where there are no target-domain annotated samples, we propose a self-adaptive DPM based on a self-paced learning (SPL) strategy and a Gaussian Process Regression (GPR). Two types of adaptation tasks are assessed: from both synthetic pedestrians and general persons (PASCAL VOC) to pedestrians imaged from an on-board camera. Results show that our proposals avoid accuracy drops as high as 15 points when comparing adapted and non-adapted detectors.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 0162-8828 ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; 600.057; 600.054; 601.217; 600.076 Approved no  
  Call Number ADAS @ adas @ XRV2014b Serial 2436  
Permanent link to this record
 

 
Author Alloy Das; Sanket Biswas; Ayan Banerjee; Josep Llados; Umapada Pal; Saumik Bhattacharya edit   pdf
url  openurl
  Title Harnessing the Power of Multi-Lingual Datasets for Pre-training: Towards Enhancing Text Spotting Performance Type Conference Article
  Year 2024 Publication Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages 718-728  
  Keywords  
  Abstract (up) The adaptation capability to a wide range of domains is crucial for scene text spotting models when deployed to real-world conditions. However, existing state-of-the-art (SOTA) approaches usually incorporate scene text detection and recognition simply by pretraining on natural scene text datasets, which do not directly exploit the intermediate feature representations between multiple domains. Here, we investigate the problem of domain-adaptive scene text spotting, i.e., training a model on multi-domain source data such that it can directly adapt to target domains rather than being specialized for a specific domain or scenario. Further, we investigate a transformer baseline called Swin-TESTR to focus on solving scene-text spotting for both regular and arbitrary-shaped scene text along with an exhaustive evaluation. The results clearly demonstrate the potential of intermediate representations to achieve significant performance on text spotting benchmarks across multiple domains (e.g. language, synth-to-real, and documents). both in terms of accuracy and efficiency.  
  Address Waikoloa; Hawai; USA; January 2024  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes DAG Approved no  
  Call Number Admin @ si @ DBB2024 Serial 3986  
Permanent link to this record
 

 
Author Arturo Fuentes; F. Javier Sanchez; Thomas Voncina; Jorge Bernal edit  doi
openurl 
  Title LAMV: Learning to Predict Where Spectators Look in Live Music Performances Type Conference Article
  Year 2021 Publication 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications Abbreviated Journal  
  Volume 5 Issue Pages 500-507  
  Keywords  
  Abstract (up) The advent of artificial intelligence has supposed an evolution on how different daily work tasks are performed. The analysis of cultural content has seen a huge boost by the development of computer-assisted methods that allows easy and transparent data access. In our case, we deal with the automation of the production of live shows, like music concerts, aiming to develop a system that can indicate the producer which camera to show based on what each of them is showing. In this context, we consider that is essential to understand where spectators look and what they are interested in so the computational method can learn from this information. The work that we present here shows the results of a first preliminary study in which we compare areas of interest defined by human beings and those indicated by an automatic system. Our system is based on the extraction of motion textures from dynamic Spatio-Temporal Volumes (STV) and then analyzing the patterns by means of texture analysis techniques. We validate our approach over several video sequences that have been labeled by 16 different experts. Our method is able to match those relevant areas identified by the experts, achieving recall scores higher than 80% when a distance of 80 pixels between method and ground truth is considered. Current performance shows promise when detecting abnormal peaks and movement trends.  
  Address Virtual; February 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference VISIGRAPP  
  Notes MV; ISE; 600.119; Approved no  
  Call Number Admin @ si @ FSV2021 Serial 3570  
Permanent link to this record
 

 
Author Lei Kang; Pau Riba; Marçal Rusiñol; Alicia Fornes; Mauricio Villegas edit   file
url  doi
openurl 
  Title Pay Attention to What You Read: Non-recurrent Handwritten Text-Line Recognition Type Journal Article
  Year 2022 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 129 Issue Pages 108766  
  Keywords  
  Abstract (up) The advent of recurrent neural networks for handwriting recognition marked an important milestone reaching impressive recognition accuracies despite the great variability that we observe across different writing styles. Sequential architectures are a perfect fit to model text lines, not only because of the inherent temporal aspect of text, but also to learn probability distributions over sequences of characters and words. However, using such recurrent paradigms comes at a cost at training stage, since their sequential pipelines prevent parallelization. In this work, we introduce a non-recurrent approach to recognize handwritten text by the use of transformer models. We propose a novel method that bypasses any recurrence. By using multi-head self-attention layers both at the visual and textual stages, we are able to tackle character recognition as well as to learn language-related dependencies of the character sequences to be decoded. Our model is unconstrained to any predefined vocabulary, being able to recognize out-of-vocabulary words, i.e. words that do not appear in the training vocabulary. We significantly advance over prior art and demonstrate that satisfactory recognition accuracies are yielded even in few-shot learning scenarios.  
  Address Sept. 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.121; 600.162 Approved no  
  Call Number Admin @ si @ KRR2022 Serial 3556  
Permanent link to this record
 

 
Author Aura Hernandez-Sabate; Debora Gil; Petia Radeva; E.N.Nofrerias edit   pdf
doi  openurl
  Title Anisotropic processing of image structures for adventitia detection in intravascular ultrasound images Type Conference Article
  Year 2004 Publication Proc. Computers in Cardiology Abbreviated Journal  
  Volume 31 Issue Pages 229-232  
  Keywords  
  Abstract (up) The adventitia layer appears as a weak edge in IVUS images with a non-uniform grey level, which difficulties its detection. In order to enhance edges, we apply an anisotropic filter that homogenizes the grey level along the image significant structures (ridges, valleys and edges). A standard edge detector applied to the filtered image yields a set of candidate points prone to be unconnected. The final model is obtained by interpolating the former line segments along the tangent direction to the level curves of the filtered image with an anisotropic contour closing technique based on functional extension principles  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Chicago (USA) Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes IAM; MILAB Approved no  
  Call Number IAM @ iam @ HGR2004 Serial 1555  
Permanent link to this record
 

 
Author Noha Elfiky; Theo Gevers; Arjan Gijsenij; Jordi Gonzalez edit   pdf
doi  openurl
  Title Color Constancy using 3D Scene Geometry derived from a Single Image Type Journal Article
  Year 2014 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume 23 Issue 9 Pages 3855-3868  
  Keywords  
  Abstract (up) The aim of color constancy is to remove the effect of the color of the light source. As color constancy is inherently an ill-posed problem, most of the existing color constancy algorithms are based on specific imaging assumptions (e.g. grey-world and white patch assumption).
In this paper, 3D geometry models are used to determine which color constancy method to use for the different geometrical regions (depth/layer) found
in images. The aim is to classify images into stages (rough 3D geometry models). According to stage models; images are divided into stage regions using hard and soft segmentation. After that, the best color constancy methods is selected for each geometry depth. To this end, we propose a method to combine color constancy algorithms by investigating the relation between depth, local image statistics and color constancy. Image statistics are then exploited per depth to select the proper color constancy method. Our approach opens the possibility to estimate multiple illuminations by distinguishing
nearby light source from distant illuminations. Experiments on state-of-the-art data sets show that the proposed algorithm outperforms state-of-the-art
single color constancy algorithms with an improvement of almost 50% of median angular error. When using a perfect classifier (i.e, all of the test images are correctly classified into stages); the performance of the proposed method achieves an improvement of 52% of the median angular error compared to the best-performing single color constancy algorithm.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1057-7149 ISBN Medium  
  Area Expedition Conference  
  Notes ISE; 600.078 Approved no  
  Call Number Admin @ si @ EGG2014 Serial 2528  
Permanent link to this record
 

 
Author Susana Alvarez; Maria Vanrell edit   pdf
url  doi
openurl 
  Title Texton theory revisited: a bag-of-words approach to combine textons Type Journal Article
  Year 2012 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 45 Issue 12 Pages 4312-4325  
  Keywords  
  Abstract (up) The aim of this paper is to revisit an old theory of texture perception and
update its computational implementation by extending it to colour. With this in mind we try to capture the optimality of perceptual systems. This is achieved in the proposed approach by sharing well-known early stages of the visual processes and extracting low-dimensional features that perfectly encode adequate properties for a large variety of textures without needing further learning stages. We propose several descriptors in a bag-of-words framework that are derived from different quantisation models on to the feature spaces. Our perceptual features are directly given by the shape and colour attributes of image blobs, which are the textons. In this way we avoid learning visual words and directly build the vocabularies on these lowdimensionaltexton spaces. Main differences between proposed descriptors rely on how co-occurrence of blob attributes is represented in the vocabularies. Our approach overcomes current state-of-art in colour texture description which is proved in several experiments on large texture datasets.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 0031-3203 ISBN Medium  
  Area Expedition Conference  
  Notes CIC Approved no  
  Call Number Admin @ si @ AlV2012a Serial 2130  
Permanent link to this record
 

 
Author Jaime Moreno; Xavier Otazu; Maria Vanrell edit  isbn
openurl 
  Title Local Perceptual Weighting in JPEG2000 for Color Images Type Conference Article
  Year 2010 Publication 5th European Conference on Colour in Graphics, Imaging and Vision and 12th International Symposium on Multispectral Colour Science Abbreviated Journal  
  Volume Issue Pages 255–260  
  Keywords  
  Abstract (up) The aim of this work is to explain how to apply perceptual concepts to define a perceptual pre-quantizer and to improve JPEG2000 compressor. The approach consists in quantizing wavelet transform coefficients using some of the human visual system behavior properties. Noise is fatal to image compression performance, because it can be both annoying for the observer and consumes excessive bandwidth when the imagery is transmitted. Perceptual pre-quantization reduces unperceivable details and thus improve both visual impression and transmission properties. The comparison between JPEG2000 without and with perceptual pre-quantization shows that the latter is not favorable in PSNR, but the recovered image is more compressed at the same or even better visual quality measured with a weighted PSNR. Perceptual criteria were taken from the CIWaM (Chromatic Induction Wavelet Model).  
  Address Joensuu, Finland  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 9781617388897 Medium  
  Area Expedition Conference CGIV/MCS  
  Notes CIC Approved no  
  Call Number CAT @ cat @ MOV2010a Serial 1307  
Permanent link to this record
 

 
Author Jaime Moreno; Xavier Otazu; Maria Vanrell edit  openurl
  Title Contribution of CIWaM in JPEG2000 Quantization for Color Images Type Conference Article
  Year 2010 Publication Proceedings of The CREATE 2010 Conference Abbreviated Journal  
  Volume Issue Pages 132–136  
  Keywords  
  Abstract (up) The aim of this work is to explain how to apply perceptual concepts to define a perceptual pre-quantizer and to improve JPEG2000 compressor. The approach consists in quantizing wavelet transform coefficients using some of the human visual system behavior properties. Noise is fatal to image compression performance, because it can be both annoying for the observer and consumes excessive bandwidth when the imagery is transmitted. Perceptual pre-quantization reduces unperceivable details and thus improve both visual impression and transmission properties. The comparison between JPEG2000 without and with perceptual pre-quantization shows that the latter is not favorable in PSNR, but the recovered image is more compressed at the same or even better visual quality measured with a weighted PSNR. Perceptual criteria were taken from the CIWaM(ChromaticInductionWaveletModel).  
  Address Gjovik (Norway)  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CREATE  
  Notes CIC Approved no  
  Call Number CAT @ cat @ MOV2010b Serial 1308  
Permanent link to this record
 

 
Author Alicia Fornes; Josep Llados; Gemma Sanchez; Horst Bunke edit  doi
openurl 
  Title Writer Identification in Old Handwritten Music Scores Type Book Chapter
  Year 2012 Publication Pattern Recognition and Signal Processing in Archaeometry: Mathematical and Computational Solutions for Archaeology Abbreviated Journal  
  Volume Issue Pages 27-63  
  Keywords  
  Abstract (up) The aim of writer identification is determining the writer of a piece of handwriting from a set of writers. In this paper we present a system for writer identification in old handwritten music scores. Even though an important amount of compositions contains handwritten text in the music scores, the aim of our work is to use only music notation to determine the author. The steps of the system proposed are the following. First of all, the music sheet is preprocessed and normalized for obtaining a single binarized music line, without the staff lines. Afterwards, 100 features are extracted for every music line, which are subsequently used in a k-NN classifier that compares every feature vector with prototypes stored in a database. By applying feature selection and extraction methods on the original feature set, the performance is increased. The proposed method has been tested on a database of old music scores from the 17th to 19th centuries, achieving a recognition rate of about 95%.  
  Address  
  Corporate Author Thesis  
  Publisher IGI-Global Place of Publication Editor Copnstantin Papaodysseus  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number Admin @ si @ FLS2012 Serial 1828  
Permanent link to this record
 

 
Author Alicia Fornes; Josep Llados; Gemma Sanchez; Xavier Otazu; Horst Bunke edit  doi
openurl 
  Title A Combination of Features for Symbol-Independent Writer Identification in Old Music Scores Type Journal Article
  Year 2010 Publication International Journal on Document Analysis and Recognition Abbreviated Journal IJDAR  
  Volume 13 Issue 4 Pages 243-259  
  Keywords  
  Abstract (up) The aim of writer identification is determining the writer of a piece of handwriting from a set of writers. In this paper, we present an architecture for writer identification in old handwritten music scores. Even though an important amount of music compositions contain handwritten text, the aim of our work is to use only music notation to determine the author. The main contribution is therefore the use of features extracted from graphical alphabets. Our proposal consists in combining the identification results of two different approaches, based on line and textural features. The steps of the ensemble architecture are the following. First of all, the music sheet is preprocessed for removing the staff lines. Then, music lines and texture images are generated for computing line features and textural features. Finally, the classification results are combined for identifying the writer. The proposed method has been tested on a database of old music scores from the seventeenth to nineteenth centuries, achieving a recognition rate of about 92% with 20 writers.  
  Address  
  Corporate Author Thesis  
  Publisher Springer-Verlag Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1433-2833 ISBN Medium  
  Area Expedition Conference  
  Notes DAG; CAT;CIC Approved no  
  Call Number FLS2010b Serial 1319  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: