|   | 
Details
   web
Records
Author Josep Brugues Pujolras; Lluis Gomez; Dimosthenis Karatzas
Title A Multilingual Approach to Scene Text Visual Question Answering Type (down) Conference Article
Year 2022 Publication Document Analysis Systems.15th IAPR International Workshop, (DAS2022) Abbreviated Journal
Volume Issue Pages 65-79
Keywords Scene text; Visual question answering; Multilingual word embeddings; Vision and language; Deep learning
Abstract Scene Text Visual Question Answering (ST-VQA) has recently emerged as a hot research topic in Computer Vision. Current ST-VQA models have a big potential for many types of applications but lack the ability to perform well on more than one language at a time due to the lack of multilingual data, as well as the use of monolingual word embeddings for training. In this work, we explore the possibility to obtain bilingual and multilingual VQA models. In that regard, we use an already established VQA model that uses monolingual word embeddings as part of its pipeline and substitute them by FastText and BPEmb multilingual word embeddings that have been aligned to English. Our experiments demonstrate that it is possible to obtain bilingual and multilingual VQA models with a minimal loss in performance in languages not used during training, as well as a multilingual model trained in multiple languages that match the performance of the respective monolingual baselines.
Address La Rochelle, France; May 22–25, 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference DAS
Notes DAG; 611.004; 600.155; 601.002 Approved no
Call Number Admin @ si @ BGK2022b Serial 3695
Permanent link to this record
 

 
Author Bojana Gajic; Ariel Amato; Ramon Baldrich; Joost Van de Weijer; Carlo Gatta
Title Area Under the ROC Curve Maximization for Metric Learning Type (down) Conference Article
Year 2022 Publication CVPR 2022 Workshop on Efficien Deep Learning for Computer Vision (ECV 2022, 5th Edition) Abbreviated Journal
Volume Issue Pages
Keywords Training; Computer vision; Conferences; Area measurement; Benchmark testing; Pattern recognition
Abstract Most popular metric learning losses have no direct relation with the evaluation metrics that are subsequently applied to evaluate their performance. We hypothesize that training a metric learning model by maximizing the area under the ROC curve (which is a typical performance measure of recognition systems) can induce an implicit ranking suitable for retrieval problems. This hypothesis is supported by previous work that proved that a curve dominates in ROC space if and only if it dominates in Precision-Recall space. To test this hypothesis, we design and maximize an approximated, derivable relaxation of the area under the ROC curve. The proposed AUC loss achieves state-of-the-art results on two large scale retrieval benchmark datasets (Stanford Online Products and DeepFashion In-Shop). Moreover, the AUC loss achieves comparable performance to more complex, domain specific, state-of-the-art methods for vehicle re-identification.
Address New Orleans, USA; 20 June 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPRW
Notes CIC; LAMP; Approved no
Call Number Admin @ si @ GAB2022 Serial 3700
Permanent link to this record
 

 
Author Javad Zolfaghari Bengar; Joost Van de Weijer; Laura Lopez-Fuentes; Bogdan Raducanu
Title Class-Balanced Active Learning for Image Classification Type (down) Conference Article
Year 2022 Publication Winter Conference on Applications of Computer Vision Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Active learning aims to reduce the labeling effort that is required to train algorithms by learning an acquisition function selecting the most relevant data for which a label should be requested from a large unlabeled data pool. Active learning is generally studied on balanced datasets where an equal amount of images per class is available. However, real-world datasets suffer from severe imbalanced classes, the so called long-tail distribution. We argue that this further complicates the active learning process, since the imbalanced data pool can result in suboptimal classifiers. To address this problem in the context of active learning, we proposed a general optimization framework that explicitly takes class-balancing into account. Results on three datasets showed that the method is general (it can be combined with most existing active learning algorithms) and can be effectively applied to boost the performance of both informative and representative-based active learning methods. In addition, we showed that also on balanced datasets
our method 1 generally results in a performance gain.
Address Virtual; Waikoloa; Hawai; USA; January 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference WACV
Notes LAMP; 602.200; 600.147; 600.120 Approved no
Call Number Admin @ si @ ZWL2022 Serial 3703
Permanent link to this record
 

 
Author Alex Gomez-Villa; Bartlomiej Twardowski; Lu Yu; Andrew Bagdanov; Joost Van de Weijer
Title Continually Learning Self-Supervised Representations With Projected Functional Regularization Type (down) Conference Article
Year 2022 Publication CVPR 2022 Workshop on Continual Learning (CLVision, 3rd Edition) Abbreviated Journal
Volume Issue Pages 3866-3876
Keywords Computer vision; Conferences; Self-supervised learning; Image representation; Pattern recognition
Abstract Recent self-supervised learning methods are able to learn high-quality image representations and are closing the gap with supervised approaches. However, these methods are unable to acquire new knowledge incrementally – they are, in fact, mostly used only as a pre-training phase over IID data. In this work we investigate self-supervised methods in continual learning regimes without any replay
mechanism. We show that naive functional regularization,also known as feature distillation, leads to lower plasticity and limits continual learning performance. Instead, we propose Projected Functional Regularization in which a separate temporal projection network ensures that the newly learned feature space preserves information of the previous one, while at the same time allowing for the learning of new features. This prevents forgetting while maintaining the plasticity of the learner. Comparison with other incremental learning approaches applied to self-supervision demonstrates that our method obtains competitive performance in
different scenarios and on multiple datasets.
Address New Orleans, USA; 20 June 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPRW
Notes LAMP: 600.147; 600.120 Approved no
Call Number Admin @ si @ GTY2022 Serial 3704
Permanent link to this record
 

 
Author Gemma Rotger; Francesc Moreno-Noguer; Felipe Lumbreras; Antonio Agudo
Title Single view facial hair 3D reconstruction Type (down) Conference Article
Year 2019 Publication 9th Iberian Conference on Pattern Recognition and Image Analysis Abbreviated Journal
Volume 11867 Issue Pages 423-436
Keywords 3D Vision; Shape Reconstruction; Facial Hair Modeling
Abstract n this work, we introduce a novel energy-based framework that addresses the challenging problem of 3D reconstruction of facial hair from a single RGB image. To this end, we identify hair pixels over the image via texture analysis and then determine individual hair fibers that are modeled by means of a parametric hair model based on 3D helixes. We propose to minimize an energy composed of several terms, in order to adapt the hair parameters that better fit the image detections. The final hairs respond to the resulting fibers after a post-processing step where we encourage further realism. The resulting approach generates realistic facial hair fibers from solely an RGB image without assuming any training data nor user interaction. We provide an experimental evaluation on real-world pictures where several facial hair styles and image conditions are observed, showing consistent results and establishing a comparison with respect to competing approaches.
Address Madrid; July 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference IbPRIA
Notes ADAS; 600.086; 600.130; 600.122 Approved no
Call Number Admin @ si @ Serial 3707
Permanent link to this record
 

 
Author Bojana Gajic; Ramon Baldrich
Title Cross-domain fashion image retrieval Type (down) Conference Article
Year 2018 Publication CVPR 2018 Workshop on Women in Computer Vision (WiCV 2018, 4th Edition) Abbreviated Journal
Volume Issue Pages 19500-19502
Keywords
Abstract Cross domain image retrieval is a challenging task that implies matching images from one domain to their pairs from another domain. In this paper we focus on fashion image retrieval, which involves matching an image of a fashion item taken by users, to the images of the same item taken in controlled condition, usually by professional photographer. When facing this problem, we have different products
in train and test time, and we use triplet loss to train the network. We stress the importance of proper training of simple architecture, as well as adapting general models to the specific task.
Address Salt Lake City, USA; 22 June 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPRW
Notes CIC; 600.087 Approved no
Call Number Admin @ si @ Serial 3709
Permanent link to this record
 

 
Author Bojana Gajic; Eduard Vazquez; Ramon Baldrich
Title Evaluation of Deep Image Descriptors for Texture Retrieval Type (down) Conference Article
Year 2017 Publication Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2017) Abbreviated Journal
Volume Issue Pages 251-257
Keywords Texture Representation; Texture Retrieval; Convolutional Neural Networks; Psychophysical Evaluation
Abstract The increasing complexity learnt in the layers of a Convolutional Neural Network has proven to be of great help for the task of classification. The topic has received great attention in recently published literature.
Nonetheless, just a handful of works study low-level representations, commonly associated with lower layers. In this paper, we explore recent findings which conclude, counterintuitively, the last layer of the VGG convolutional network is the best to describe a low-level property such as texture. To shed some light on this issue, we are proposing a psychophysical experiment to evaluate the adequacy of different layers of the VGG network for texture retrieval. Results obtained suggest that, whereas the last convolutional layer is a good choice for a specific task of classification, it might not be the best choice as a texture descriptor, showing a very poor performance on texture retrieval. Intermediate layers show the best performance, showing a good combination of basic filters, as in the primary visual cortex, and also a degree of higher level information to describe more complex textures.
Address Porto, Portugal; 27 February – 1 March 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference VISIGRAPP
Notes CIC; 600.087 Approved no
Call Number Admin @ si @ Serial 3710
Permanent link to this record
 

 
Author Jose Elias Yauri; Aura Hernandez-Sabate; Pau Folch; Debora Gil
Title Mental Workload Detection Based on EEG Analysis Type (down) Conference Article
Year 2021 Publication Artificial Intelligent Research and Development. Proceedings 23rd International Conference of the Catalan Association for Artificial Intelligence. Abbreviated Journal
Volume 339 Issue Pages 268-277
Keywords Cognitive states; Mental workload; EEG analysis; Neural Networks.
Abstract The study of mental workload becomes essential for human work efficiency, health conditions and to avoid accidents, since workload compromises both performance and awareness. Although workload has been widely studied using several physiological measures, minimising the sensor network as much as possible remains both a challenge and a requirement.
Electroencephalogram (EEG) signals have shown a high correlation to specific cognitive and mental states like workload. However, there is not enough evidence in the literature to validate how well models generalize in case of new subjects performing tasks of a workload similar to the ones included during model’s training.
In this paper we propose a binary neural network to classify EEG features across different mental workloads. Two workloads, low and medium, are induced using two variants of the N-Back Test. The proposed model was validated in a dataset collected from 16 subjects and shown a high level of generalization capability: model reported an average recall of 81.81% in a leave-one-out subject evaluation.
Address Virtual; October 20-22 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CCIA
Notes IAM; 600.139; 600.118; 600.145 Approved no
Call Number Admin @ si @ Serial 3723
Permanent link to this record
 

 
Author Giuseppe De Gregorio; Sanket Biswas; Mohamed Ali Souibgui; Asma Bensalah; Josep Llados; Alicia Fornes; Angelo Marcelli
Title A Few Shot Multi-representation Approach for N-Gram Spotting in Historical Manuscripts Type (down) Conference Article
Year 2022 Publication Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition (ICFHR2022) Abbreviated Journal
Volume 13639 Issue Pages 3-12
Keywords N-gram spotting; Few-shot learning; Multimodal understanding; Historical handwritten collections
Abstract Despite recent advances in automatic text recognition, the performance remains moderate when it comes to historical manuscripts. This is mainly because of the scarcity of available labelled data to train the data-hungry Handwritten Text Recognition (HTR) models. The Keyword Spotting System (KWS) provides a valid alternative to HTR due to the reduction in error rate, but it is usually limited to a closed reference vocabulary. In this paper, we propose a few-shot learning paradigm for spotting sequences of a few characters (N-gram) that requires a small amount of labelled training data. We exhibit that recognition of important n-grams could reduce the system’s dependency on vocabulary. In this case, an out-of-vocabulary (OOV) word in an input handwritten line image could be a sequence of n-grams that belong to the lexicon. An extensive experimental evaluation of our proposed multi-representation approach was carried out on a subset of Bentham’s historical manuscript collections to obtain some really promising results in this direction.
Address December 04 – 07, 2022; Hyderabad, India
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICFHR
Notes DAG; 600.121; 600.162; 602.230; 600.140 Approved no
Call Number Admin @ si @ GBS2022 Serial 3733
Permanent link to this record
 

 
Author Arnau Baro; Carles Badal; Pau Torras; Alicia Fornes
Title Handwritten Historical Music Recognition through Sequence-to-Sequence with Attention Mechanism Type (down) Conference Article
Year 2022 Publication 3rd International Workshop on Reading Music Systems (WoRMS2021) Abbreviated Journal
Volume Issue Pages 55-59
Keywords Optical Music Recognition; Digits; Image Classification
Abstract Despite decades of research in Optical Music Recognition (OMR), the recognition of old handwritten music scores remains a challenge because of the variabilities in the handwriting styles, paper degradation, lack of standard notation, etc. Therefore, the research in OMR systems adapted to the particularities of old manuscripts is crucial to accelerate the conversion of music scores existing in archives into digital libraries, fostering the dissemination and preservation of our music heritage. In this paper we explore the adaptation of sequence-to-sequence models with attention mechanism (used in translation and handwritten text recognition) and the generation of specific synthetic data for recognizing old music scores. The experimental validation demonstrates that our approach is promising, especially when compared with long short-term memory neural networks.
Address July 23, 2021, Alicante (Spain)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference WoRMS
Notes DAG; 600.121; 600.162; 602.230; 600.140 Approved no
Call Number Admin @ si @ BBT2022 Serial 3734
Permanent link to this record
 

 
Author Pau Torras; Arnau Baro; Alicia Fornes; Lei Kang
Title Improving Handwritten Music Recognition through Language Model Integration Type (down) Conference Article
Year 2022 Publication 4th International Workshop on Reading Music Systems (WoRMS2022) Abbreviated Journal
Volume Issue Pages 42-46
Keywords optical music recognition; historical sources; diversity; music theory; digital humanities
Abstract Handwritten Music Recognition, especially in the historical domain, is an inherently challenging endeavour; paper degradation artefacts and the ambiguous nature of handwriting make recognising such scores an error-prone process, even for the current state-of-the-art Sequence to Sequence models. In this work we propose a way of reducing the production of statistically implausible output sequences by fusing a Language Model into a recognition Sequence to Sequence model. The idea is leveraging visually-conditioned and context-conditioned output distributions in order to automatically find and correct any mistakes that would otherwise break context significantly. We have found this approach to improve recognition results to 25.15 SER (%) from a previous best of 31.79 SER (%) in the literature.
Address November 18, 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference WoRMS
Notes DAG; 600.121; 600.162; 602.230 Approved no
Call Number Admin @ si @ TBF2022 Serial 3735
Permanent link to this record
 

 
Author Asma Bensalah; Alicia Fornes; Cristina Carmona_Duarte; Josep Llados
Title Easing Automatic Neurorehabilitation via Classification and Smoothness Analysis Type (down) Conference Article
Year 2022 Publication Intertwining Graphonomics with Human Movements. 20th International Conference of the International Graphonomics Society, IGS 2022 Abbreviated Journal
Volume 13424 Issue Pages 336-348
Keywords Neurorehabilitation; Upper-lim; Movement classification; Movement smoothness; Deep learning; Jerk
Abstract Assessing the quality of movements for post-stroke patients during the rehabilitation phase is vital given that there is no standard stroke rehabilitation plan for all the patients. In fact, it depends basically on the patient’s functional independence and its progress along the rehabilitation sessions. To tackle this challenge and make neurorehabilitation more agile, we propose an automatic assessment pipeline that starts by recognising patients’ movements by means of a shallow deep learning architecture, then measuring the movement quality using jerk measure and related measures. A particularity of this work is that the dataset used is clinically relevant, since it represents movements inspired from Fugl-Meyer a well common upper-limb clinical stroke assessment scale for stroke patients. We show that it is possible to detect the contrast between healthy and patients movements in terms of smoothness, besides achieving conclusions about the patients’ progress during the rehabilitation sessions that correspond to the clinicians’ findings about each case.
Address June 7-9, 2022, Las Palmas de Gran Canaria, Spain
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference IGS
Notes DAG; 600.121; 600.162; 602.230; 600.140 Approved no
Call Number Admin @ si @ BFC2022 Serial 3738
Permanent link to this record
 

 
Author Alicia Fornes; Asma Bensalah; Cristina Carmona_Duarte; Jialuo Chen; Miguel A. Ferrer; Andreas Fischer; Josep Llados; Cristina Martin; Eloy Opisso; Rejean Plamondon; Anna Scius-Bertrand; Josep Maria Tormos
Title The RPM3D Project: 3D Kinematics for Remote Patient Monitoring Type (down) Conference Article
Year 2022 Publication Intertwining Graphonomics with Human Movements. 20th International Conference of the International Graphonomics Society, IGS 2022 Abbreviated Journal
Volume 13424 Issue Pages 217-226
Keywords Healthcare applications; Kinematic; Theory of Rapid Human Movements; Human activity recognition; Stroke rehabilitation; 3D kinematics
Abstract This project explores the feasibility of remote patient monitoring based on the analysis of 3D movements captured with smartwatches. We base our analysis on the Kinematic Theory of Rapid Human Movement. We have validated our research in a real case scenario for stroke rehabilitation at the Guttmann Institute (https://www.guttmann.com/en/) (neurorehabilitation hospital), showing promising results. Our work could have a great impact in remote healthcare applications, improving the medical efficiency and reducing the healthcare costs. Future steps include more clinical validation, developing multi-modal analysis architectures (analysing data from sensors, images, audio, etc.), and exploring the application of our technology to monitor other neurodegenerative diseases.
Address June 7-9, 2022, Las Palmas de Gran Canaria, Spain
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference IGS
Notes DAG; 600.121; 600.162; 602.230; 600.140 Approved no
Call Number Admin @ si @ FBC2022 Serial 3739
Permanent link to this record
 

 
Author Arnau Baro; Pau Riba; Alicia Fornes
Title Musigraph: Optical Music Recognition Through Object Detection and Graph Neural Network Type (down) Conference Article
Year 2022 Publication Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition (ICFHR2022) Abbreviated Journal
Volume 13639 Issue Pages 171-184
Keywords Object detection; Optical music recognition; Graph neural network
Abstract During the last decades, the performance of optical music recognition has been increasingly improving. However, and despite the 2-dimensional nature of music notation (e.g. notes have rhythm and pitch), most works treat musical scores as a sequence of symbols in one dimension, which make their recognition still a challenge. Thus, in this work we explore the use of graph neural networks for musical score recognition. First, because graphs are suited for n-dimensional representations, and second, because the combination of graphs with deep learning has shown a great performance in similar applications. Our methodology consists of: First, we will detect each isolated/atomic symbols (those that can not be decomposed in more graphical primitives) and the primitives that form a musical symbol. Then, we will build the graph taking as root node the notehead and as leaves those primitives or symbols that modify the note’s rhythm (stem, beam, flag) or pitch (flat, sharp, natural). Finally, the graph is translated into a human-readable character sequence for a final transcription and evaluation. Our method has been tested on more than five thousand measures, showing promising results.
Address December 04 – 07, 2022; Hyderabad, India
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICFHR
Notes DAG; 600.162; 600.140; 602.230 Approved no
Call Number Admin @ si @ BRF2022b Serial 3740
Permanent link to this record
 

 
Author Carlos Boned Riera; Oriol Ramos Terrades
Title Discriminative Neural Variational Model for Unbalanced Classification Tasks in Knowledge Graph Type (down) Conference Article
Year 2022 Publication 26th International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 2186-2191
Keywords Measurement; Couplings; Semantics; Ear; Benchmark testing; Data models; Pattern recognition
Abstract Nowadays the paradigm of link discovery problems has shown significant improvements on Knowledge Graphs. However, method performances are harmed by the unbalanced nature of this classification problem, since many methods are easily biased to not find proper links. In this paper we present a discriminative neural variational auto-encoder model, called DNVAE from now on, in which we have introduced latent variables to serve as embedding vectors. As a result, the learnt generative model approximate better the underlying distribution and, at the same time, it better differentiate the type of relations in the knowledge graph. We have evaluated this approach on benchmark knowledge graph and Census records. Results in this last data set are quite impressive since we reach the highest possible score in the evaluation metrics. However, further experiments are still needed to deeper evaluate the performance of the method in more challenging tasks.
Address Montreal; Quebec; Canada; August 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPR
Notes DAG; 600.121; 600.162 Approved no
Call Number Admin @ si @ BoR2022 Serial 3741
Permanent link to this record