toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links (down)
Author Akhil Gurram; Antonio Lopez edit   pdf
url  openurl
  Title On the Metrics for Evaluating Monocular Depth Estimation Type Miscellaneous
  Year 2023 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Monocular Depth Estimation (MDE) is performed to produce 3D information that can be used in downstream tasks such as those related to on-board perception for Autonomous Vehicles (AVs) or driver assistance. Therefore, a relevant arising question is whether the standard metrics for MDE assessment are a good indicator of the accuracy of future MDE-based driving-related perception tasks. We address this question in this paper. In particular, we take the task of 3D object detection on point clouds as a proxy of on-board perception. We train and test state-of-the-art 3D object detectors using 3D point clouds coming from MDE models. We confront the ranking of object detection results with the ranking given by the depth estimation metrics of the MDE models. We conclude that, indeed, MDE evaluation metrics give rise to a ranking of methods that reflects relatively well the 3D object detection results we may expect. Among the different metrics, the absolute relative (abs-rel) error seems to be the best for that purpose.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ GuL2023 Serial 3867  
Permanent link to this record
 

 
Author Mert Kilickaya; Joost van de Weijer; Yuki M. Asano edit   pdf
url  openurl
  Title Towards Label-Efficient Incremental Learning: A Survey Type Miscellaneous
  Year 2023 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The current dominant paradigm when building a machine learning model is to iterate over a dataset over and over until convergence. Such an approach is non-incremental, as it assumes access to all images of all categories at once. However, for many applications, non-incremental learning is unrealistic. To that end, researchers study incremental learning, where a learner is required to adapt to an incoming stream of data with a varying distribution while preventing forgetting of past knowledge. Significant progress has been made, however, the vast majority of works focus on the fully supervised setting, making these algorithms label-hungry thus limiting their real-life deployment. To that end, in this paper, we make the first attempt to survey recently growing interest in label-efficient incremental learning. We identify three subdivisions, namely semi-, few-shot- and self-supervised learning to reduce labeling efforts. Finally, we identify novel directions that can further enhance label-efficiency and improve incremental learning scalability. Project website: this https URL.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP Approved no  
  Call Number Admin @ si @ KWA2023 Serial 3994  
Permanent link to this record
 

 
Author Hao Fang; Ajian Liu; Jun Wan; Sergio Escalera; Chenxu Zhao; Xu Zhang; Stan Z Li; Zhen Lei edit   pdf
url  openurl
  Title Surveillance Face Anti-spoofing Type Journal Article
  Year 2024 Publication IEEE Transactions on Information Forensics and Security Abbreviated Journal TIFS  
  Volume 19 Issue Pages 1535-1546  
  Keywords  
  Abstract Face Anti-spoofing (FAS) is essential to secure face recognition systems from various physical attacks. However, recent research generally focuses on short-distance applications (i.e., phone unlocking) while lacking consideration of long-distance scenes (i.e., surveillance security checks). In order to promote relevant research and fill this gap in the community, we collect a large-scale Surveillance High-Fidelity Mask (SuHiFiMask) dataset captured under 40 surveillance scenes, which has 101 subjects from different age groups with 232 3D attacks (high-fidelity masks), 200 2D attacks (posters, portraits, and screens), and 2 adversarial attacks. In this scene, low image resolution and noise interference are new challenges faced in surveillance FAS. Together with the SuHiFiMask dataset, we propose a Contrastive Quality-Invariance Learning (CQIL) network to alleviate the performance degradation caused by image quality from three aspects: (1) An Image Quality Variable module (IQV) is introduced to recover image information associated with discrimination by combining the super-resolution network. (2) Using generated sample pairs to simulate quality variance distributions to help contrastive learning strategies obtain robust feature representation under quality variation. (3) A Separate Quality Network (SQN) is designed to learn discriminative features independent of image quality. Finally, a large number of experiments verify the quality of the SuHiFiMask dataset and the superiority of the proposed CQIL.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA Approved no  
  Call Number Admin @ si @ FLW2024 Serial 3869  
Permanent link to this record
 

 
Author Arnau Baro; Carles Badal; Pau Torras; Alicia Fornes edit   pdf
url  openurl
  Title Handwritten Historical Music Recognition through Sequence-to-Sequence with Attention Mechanism Type Conference Article
  Year 2022 Publication 3rd International Workshop on Reading Music Systems (WoRMS2021) Abbreviated Journal  
  Volume Issue Pages 55-59  
  Keywords Optical Music Recognition; Digits; Image Classification  
  Abstract Despite decades of research in Optical Music Recognition (OMR), the recognition of old handwritten music scores remains a challenge because of the variabilities in the handwriting styles, paper degradation, lack of standard notation, etc. Therefore, the research in OMR systems adapted to the particularities of old manuscripts is crucial to accelerate the conversion of music scores existing in archives into digital libraries, fostering the dissemination and preservation of our music heritage. In this paper we explore the adaptation of sequence-to-sequence models with attention mechanism (used in translation and handwritten text recognition) and the generation of specific synthetic data for recognizing old music scores. The experimental validation demonstrates that our approach is promising, especially when compared with long short-term memory neural networks.  
  Address July 23, 2021, Alicante (Spain)  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WoRMS  
  Notes DAG; 600.121; 600.162; 602.230; 600.140 Approved no  
  Call Number Admin @ si @ BBT2022 Serial 3734  
Permanent link to this record
 

 
Author Marco Cotogni; Fei Yang; Claudio Cusano; Andrew Bagdanov; Joost Van de Weijer edit   pdf
url  openurl
  Title Exemplar-free Continual Learning of Vision Transformers via Gated Class-Attention and Cascaded Feature Drift Compensation Type Miscellaneous
  Year 2023 Publication ARXIV Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract We propose a new method for exemplar-free class incremental training of ViTs. The main challenge of exemplar-free continual learning is maintaining plasticity of the learner without causing catastrophic forgetting of previously learned tasks. This is often achieved via exemplar replay which can help recalibrate previous task classifiers to the feature drift which occurs when learning new tasks. Exemplar replay, however, comes at the cost of retaining samples from previous tasks which for many applications may not be possible. To address the problem of continual ViT training, we first propose gated class-attention to minimize the drift in the final ViT transformer block. This mask-based gating is applied to class-attention mechanism of the last transformer block and strongly regulates the weights crucial for previous tasks. Importantly, gated class-attention does not require the task-ID during inference, which distinguishes it from other parameter isolation methods. Secondly, we propose a new method of feature drift compensation that accommodates feature drift in the backbone when learning new tasks. The combination of gated class-attention and cascaded feature drift compensation allows for plasticity towards new tasks while limiting forgetting of previous ones. Extensive experiments performed on CIFAR-100, Tiny-ImageNet and ImageNet100 demonstrate that our exemplar-free method obtains competitive results when compared to rehearsal based ViT methods.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP Approved no  
  Call Number Admin @ si @ CYC2023 Serial 3981  
Permanent link to this record
 

 
Author Soumya Jahagirdar; Minesh Mathew; Dimosthenis Karatzas; CV Jawahar edit   pdf
url  openurl
  Title Watching the News: Towards VideoQA Models that can Read Type Conference Article
  Year 2023 Publication Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Video Question Answering methods focus on commonsense reasoning and visual cognition of objects or persons and their interactions over time. Current VideoQA approaches ignore the textual information present in the video. Instead, we argue that textual information is complementary to the action and provides essential contextualisation cues to the reasoning process. To this end, we propose a novel VideoQA task that requires reading and understanding the text in the video. To explore this direction, we focus on news videos and require QA systems to comprehend and answer questions about the topics presented by combining visual and textual cues in the video. We introduce the ``NewsVideoQA'' dataset that comprises more than 8,600 QA pairs on 3,000+ news videos obtained from diverse news channels from around the world. We demonstrate the limitations of current Scene Text VQA and VideoQA methods and propose ways to incorporate scene text information into VideoQA methods.  
  Address Waikoloa; Hawai; USA; January 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes DAG Approved no  
  Call Number Admin @ si @ JMK2023 Serial 3899  
Permanent link to this record
 

 
Author Dipam Goswami; J Schuster; Joost Van de Weijer; Didier Stricker edit   pdf
url  openurl
  Title Attribution-aware Weight Transfer: A Warm-Start Initialization for Class-Incremental Semantic Segmentation Type Conference Article
  Year 2023 Publication Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages 3195-3204  
  Keywords  
  Abstract Attribution-aware Weight Transfer: A Warm-Start Initialization for Class-Incremental Semantic Segmentation. D Goswami, R Schuster, J van de Weijer, D Stricker. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023, pp. 3195-3204  
  Address Waikoloa; Hawai; USA; January 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes LAMP Approved no  
  Call Number Admin @ si @ GSW2023 Serial 3901  
Permanent link to this record
 

 
Author Javad Zolfaghari Bengar; Joost Van de Weijer; Laura Lopez-Fuentes; Bogdan Raducanu edit   pdf
url  doi
openurl 
  Title Class-Balanced Active Learning for Image Classification Type Conference Article
  Year 2022 Publication Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Active learning aims to reduce the labeling effort that is required to train algorithms by learning an acquisition function selecting the most relevant data for which a label should be requested from a large unlabeled data pool. Active learning is generally studied on balanced datasets where an equal amount of images per class is available. However, real-world datasets suffer from severe imbalanced classes, the so called long-tail distribution. We argue that this further complicates the active learning process, since the imbalanced data pool can result in suboptimal classifiers. To address this problem in the context of active learning, we proposed a general optimization framework that explicitly takes class-balancing into account. Results on three datasets showed that the method is general (it can be combined with most existing active learning algorithms) and can be effectively applied to boost the performance of both informative and representative-based active learning methods. In addition, we showed that also on balanced datasets
our method 1 generally results in a performance gain.
 
  Address Virtual; Waikoloa; Hawai; USA; January 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes LAMP; 602.200; 600.147; 600.120 Approved no  
  Call Number Admin @ si @ ZWL2022 Serial 3703  
Permanent link to this record
 

 
Author Hannes Mueller; Andre Groger; Jonathan Hersh; Andrea Matranga; Joan Serrat edit   pdf
url  openurl
  Title Monitoring War Destruction from Space: A Machine Learning Approach Type Miscellaneous
  Year 2020 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Existing data on building destruction in conflict zones rely on eyewitness reports or manual detection, which makes it generally scarce, incomplete and potentially biased. This lack of reliable data imposes severe limitations for media reporting, humanitarian relief efforts, human rights monitoring, reconstruction initiatives, and academic studies of violent conflict. This article introduces an automated method of measuring destruction in high-resolution satellite images using deep learning techniques combined with data augmentation to expand training samples. We apply this method to the Syrian civil war and reconstruct the evolution of damage in major cities across the country. The approach allows generating destruction data with unprecedented scope, resolution, and frequency – only limited by the available satellite imagery – which can alleviate data limitations decisively.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; 600.118 Approved no  
  Call Number Admin @ si @ MGH2020 Serial 3489  
Permanent link to this record
 

 
Author Minesh Mathew; Ruben Tito; Dimosthenis Karatzas; R.Manmatha; C.V. Jawahar edit   pdf
url  openurl
  Title Document Visual Question Answering Challenge 2020 Type Conference Article
  Year 2020 Publication 33rd IEEE Conference on Computer Vision and Pattern Recognition – Short paper Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract This paper presents results of Document Visual Question Answering Challenge organized as part of “Text and Documents in the Deep Learning Era” workshop, in CVPR 2020. The challenge introduces a new problem – Visual Question Answering on document images. The challenge comprised two tasks. The first task concerns with asking questions on a single document image. On the other hand, the second task is set as a retrieval task where the question is posed over a collection of images. For the task 1 a new dataset is introduced comprising 50,000 questions-answer(s) pairs defined over 12,767 document images. For task 2 another dataset has been created comprising 20 questions over 14,362 document images which share the same document template.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPR  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ MTK2020 Serial 3558  
Permanent link to this record
 

 
Author Pau Riba; Andreas Fischer; Josep Llados; Alicia Fornes edit   pdf
url  openurl
  Title Learning Graph Edit Distance by Graph NeuralNetworks Type Miscellaneous
  Year 2020 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The emergence of geometric deep learning as a novel framework to deal with graph-based representations has faded away traditional approaches in favor of completely new methodologies. In this paper, we propose a new framework able to combine the advances on deep metric learning with traditional approximations of the graph edit distance. Hence, we propose an efficient graph distance based on the novel field of geometric deep learning. Our method employs a message passing neural network to capture the graph structure, and thus, leveraging this information for its use on a distance computation. The performance of the proposed graph distance is validated on two different scenarios. On the one hand, in a graph retrieval of handwritten words~\ie~keyword spotting, showing its superior performance when compared with (approximate) graph edit distance benchmarks. On the other hand, demonstrating competitive results for graph similarity learning when compared with the current state-of-the-art on a recent benchmark dataset.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.121; 600.140; 601.302 Approved no  
  Call Number Admin @ si @ RFL2020 Serial 3555  
Permanent link to this record
 

 
Author Debora Gil; Katerine Diaz; Carles Sanchez; Aura Hernandez-Sabate edit   pdf
url  openurl
  Title Early Screening of SARS-CoV-2 by Intelligent Analysis of X-Ray Images Type Miscellaneous
  Year 2020 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Future SARS-CoV-2 virus outbreak COVID-XX might possibly occur during the next years. However the pathology in humans is so recent that many clinical aspects, like early detection of complications, side effects after recovery or early screening, are currently unknown. In spite of the number of cases of COVID-19, its rapid spread putting many sanitary systems in the edge of collapse has hindered proper collection and analysis of the data related to COVID-19 clinical aspects. We describe an interdisciplinary initiative that integrates clinical research, with image diagnostics and the use of new technologies such as artificial intelligence and radiomics with the aim of clarifying some of SARS-CoV-2 open questions. The whole initiative addresses 3 main points: 1) collection of standardize data including images, clinical data and analytics; 2) COVID-19 screening for its early diagnosis at primary care centers; 3) define radiomic signatures of COVID-19 evolution and associated pathologies for the early treatment of complications. In particular, in this paper we present a general overview of the project, the experimental design and first results of X-ray COVID-19 detection using a classic approach based on HoG and feature selection. Our experiments include a comparison to some recent methods for COVID-19 screening in X-Ray and an exploratory analysis of the feasibility of X-Ray COVID-19 screening. Results show that classic approaches can outperform deep-learning methods in this experimental setting, indicate the feasibility of early COVID-19 screening and that non-COVID infiltration is the group of patients most similar to COVID-19 in terms of radiological description of X-ray. Therefore, an efficient COVID-19 screening should be complemented with other clinical data to better discriminate these cases.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes IAM; 600.139; 600.145; 601.337 Approved no  
  Call Number Admin @ si @ GDS2020 Serial 3474  
Permanent link to this record
 

 
Author Lei Kang; Pau Riba; Marçal Rusiñol; Alicia Fornes; Mauricio Villegas edit   file
url  doi
openurl 
  Title Pay Attention to What You Read: Non-recurrent Handwritten Text-Line Recognition Type Journal Article
  Year 2022 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 129 Issue Pages 108766  
  Keywords  
  Abstract The advent of recurrent neural networks for handwriting recognition marked an important milestone reaching impressive recognition accuracies despite the great variability that we observe across different writing styles. Sequential architectures are a perfect fit to model text lines, not only because of the inherent temporal aspect of text, but also to learn probability distributions over sequences of characters and words. However, using such recurrent paradigms comes at a cost at training stage, since their sequential pipelines prevent parallelization. In this work, we introduce a non-recurrent approach to recognize handwritten text by the use of transformer models. We propose a novel method that bypasses any recurrence. By using multi-head self-attention layers both at the visual and textual stages, we are able to tackle character recognition as well as to learn language-related dependencies of the character sequences to be decoded. Our model is unconstrained to any predefined vocabulary, being able to recognize out-of-vocabulary words, i.e. words that do not appear in the training vocabulary. We significantly advance over prior art and demonstrate that satisfactory recognition accuracies are yielded even in few-shot learning scenarios.  
  Address Sept. 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.121; 600.162 Approved no  
  Call Number Admin @ si @ KRR2022 Serial 3556  
Permanent link to this record
 

 
Author Oriol Ramos Terrades; Albert Berenguel; Debora Gil edit   pdf
url  openurl
  Title A flexible outlier detector based on a topology given by graph communities Type Miscellaneous
  Year 2020 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Outlier, or anomaly, detection is essential for optimal performance of machine learning methods and statistical predictive models. It is not just a technical step in a data cleaning process but a key topic in many fields such as fraudulent document detection, in medical applications and assisted diagnosis systems or detecting security threats. In contrast to population-based methods, neighborhood based local approaches are simple flexible methods that have the potential to perform well in small sample size unbalanced problems. However, a main concern of local approaches is the impact that the computation of each sample neighborhood has on the method performance. Most approaches use a distance in the feature space to define a single neighborhood that requires careful selection of several parameters. This work presents a local approach based on a local measure of the heterogeneity of sample labels in the feature space considered as a topological manifold. Topology is computed using the communities of a weighted graph codifying mutual nearest neighbors in the feature space. This way, we provide with a set of multiple neighborhoods able to describe the structure of complex spaces without parameter fine tuning. The extensive experiments on real-world data sets show that our approach overall outperforms, both, local and global strategies in multi and single view settings.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes IAM; DAG; 600.139; 600.145; 600.140; 600.121 Approved no  
  Call Number Admin @ si @ RBG2020 Serial 3475  
Permanent link to this record
 

 
Author Vacit Oguz Yazici; Abel Gonzalez-Garcia; Arnau Ramisa; Bartlomiej Twardowski; Joost Van de Weijer edit   pdf
url  openurl
  Title Orderless Recurrent Models for Multi-label Classification Type Conference Article
  Year 2020 Publication 33rd IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Recurrent neural networks (RNN) are popular for many computer vision tasks, including multi-label classification. Since RNNs produce sequential outputs, labels need to be ordered for the multi-label classification task. Current approaches sort labels according to their frequency, typically ordering them in either rare-first or frequent-first. These imposed orderings do not take into account that the natural order to generate the labels can change for each image, e.g.\ first the dominant object before summing up the smaller objects in the image. Therefore, in this paper, we propose ways to dynamically order the ground truth labels with the predicted label sequence. This allows for the faster training of more optimal LSTM models for multi-label classification. Analysis evidences that our method does not suffer from duplicate generation, something which is common for other models. Furthermore, it outperforms other CNN-RNN models, and we show that a standard architecture of an image encoder and language decoder trained with our proposed loss obtains the state-of-the-art results on the challenging MS-COCO, WIDER Attribute and PA-100K and competitive results on NUS-WIDE.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPR  
  Notes LAMP; 600.109; 601.309; 600.141; 600.120 Approved no  
  Call Number Admin @ si @ YGR2020 Serial 3408  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: