Records
Author: Mohamed Ali Souibgui; Y. Kessentini; Alicia Fornes
Title: A conditional GAN based approach for distorted camera captured documents recovery
Type: Conference Article
Year: 2020
Publication: 4th Mediterranean Conference on Pattern Recognition and Artificial Intelligence
Address: Virtual; December 2020
Conference: MedPRAI
Notes: DAG; 600.121
Approved: no
Call Number: Admin @ si @ SKF2020
Serial: 3450
Author: Riccardo Del Chiaro; Bartlomiej Twardowski; Andrew Bagdanov; Joost Van de Weijer
Title: Recurrent attention to transient tasks for continual image captioning
Type: Conference Article
Year: 2020
Publication: 34th Conference on Neural Information Processing Systems
Abstract: Research on continual learning has led to a variety of approaches to mitigating catastrophic forgetting in feed-forward classification networks. Until now, surprisingly little attention has been paid to continual learning of recurrent models applied to problems like image captioning. In this paper we take a systematic look at continual learning of LSTM-based models for image captioning. We propose an attention-based approach that explicitly accommodates the transient nature of vocabularies in continual image captioning tasks, i.e. that task vocabularies are not disjoint. We call our method Recurrent Attention to Transient Tasks (RATT), and also show how to adapt continual learning approaches based on weight regularization and knowledge distillation to recurrent continual learning problems. We apply our approaches to the incremental image captioning problem on two new continual learning benchmarks that we define using the MS-COCO and Flickr30k datasets. Our results demonstrate that RATT is able to sequentially learn five captioning tasks while incurring no forgetting of previously learned ones.
Address: Virtual; December 2020
Conference: NEURIPS
Notes: LAMP; 600.120
Approved: no
Call Number: Admin @ si @ CTB2020
Serial: 3484
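The RATT abstract above hinges on restricting a recurrent captioning decoder to the (possibly overlapping) vocabulary of the task currently being learned. Below is a minimal PyTorch sketch of that general idea only; the CaptionDecoder class and the per-task vocabulary mask are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class CaptionDecoder(nn.Module):
    """Toy LSTM decoder whose word logits are masked to the active task vocabulary."""
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, task_vocab_mask):
        # tokens: (B, T) word indices; task_vocab_mask: (vocab_size,) bool
        h, _ = self.lstm(self.embed(tokens))
        logits = self.head(h)                       # (B, T, vocab_size)
        # Words outside the current task's vocabulary get -inf, so they receive no
        # gradient and cannot be predicted; task vocabularies may still overlap.
        return logits.masked_fill(~task_vocab_mask, float("-inf"))

vocab_size = 1000
decoder = CaptionDecoder(vocab_size)
mask = torch.zeros(vocab_size, dtype=torch.bool)
mask[:400] = True                                   # hypothetical task-1 vocabulary
out = decoder(torch.randint(0, 400, (2, 7)), mask)  # (2, 7, 1000)
```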
Author: Yaxing Wang; Lu Yu; Joost Van de Weijer
Title: DeepI2I: Enabling Deep Hierarchical Image-to-Image Translation by Transferring from GANs
Type: Conference Article
Year: 2020
Publication: 34th Conference on Neural Information Processing Systems
Abstract: Image-to-image translation has recently achieved remarkable results. But despite current success, it suffers from inferior performance when translations between classes require large shape changes. We attribute this to the high-resolution bottlenecks which are used by current state-of-the-art image-to-image methods. Therefore, in this work, we propose a novel deep hierarchical Image-to-Image Translation method, called DeepI2I. We learn a model by leveraging hierarchical features: (a) structural information contained in the shallow layers and (b) semantic information extracted from the deep layers. To enable the training of deep I2I models on small datasets, we propose a novel transfer learning method that transfers knowledge from pre-trained GANs. Specifically, we leverage the discriminator of a pre-trained GAN (i.e. BigGAN or StyleGAN) to initialize both the encoder and the discriminator, and the pre-trained generator to initialize the generator of our model. Applying knowledge transfer leads to an alignment problem between the encoder and generator. We introduce an adaptor network to address this. On many-class image-to-image translation on three datasets (Animal faces, Birds, and Foods), we decrease mFID by at least 35% when compared to the state of the art. Furthermore, we qualitatively and quantitatively demonstrate that transfer learning significantly improves the performance of I2I systems, especially for small datasets. Finally, we are the first to perform I2I translations for domains with over 100 classes.
Address: Virtual; December 2020
Conference: NEURIPS
Notes: LAMP; 600.120
Approved: no
Call Number: Admin @ si @ WYW2020
Serial: 3485
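The DeepI2I abstract above initializes an image-to-image model from a pre-trained GAN: the pre-trained discriminator seeds both the encoder and the discriminator, the pre-trained generator seeds the generator, and an adaptor network aligns encoder features with what the generator expects. A toy PyTorch sketch of that wiring follows; the tiny stand-in networks and the 1x1-conv adaptor are assumptions for illustration, not the paper's architecture.

```python
import copy
import torch
import torch.nn as nn

# Stand-in modules; in the paper these come from a pre-trained GAN (e.g. BigGAN).
pretrained_discriminator = nn.Sequential(nn.Conv2d(3, 64, 3, 2, 1), nn.ReLU(),
                                         nn.Conv2d(64, 128, 3, 2, 1), nn.ReLU())
pretrained_generator = nn.Sequential(nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),
                                     nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh())

# Encoder and discriminator both start from copies of the pre-trained discriminator,
# and the generator starts from the pre-trained generator.
encoder = copy.deepcopy(pretrained_discriminator)
discriminator = copy.deepcopy(pretrained_discriminator)
generator = copy.deepcopy(pretrained_generator)

# The adaptor bridges the misalignment between encoder features and the inputs
# the generator expects (a 1x1 conv here is purely illustrative).
adaptor = nn.Conv2d(128, 128, kernel_size=1)

x = torch.randn(2, 3, 64, 64)            # source-domain images
features = encoder(x)                     # deepest feature map of the hierarchy
translated = generator(adaptor(features))
print(translated.shape)                   # torch.Size([2, 3, 64, 64])
```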
Author: Asma Bensalah; Jialuo Chen; Alicia Fornes; Cristina Carmona-Duarte; Josep Llados; Miguel A. Ferrer
Title: Towards Stroke Patients' Upper-limb Automatic Motor Assessment Using Smartwatches
Type: Conference Article
Year: 2020
Publication: International Workshop on Artificial Intelligence for Healthcare Applications
Volume: 12661
Pages: 476-489
Abstract: Assessing the physical condition in rehabilitation scenarios is a challenging problem, since it involves Human Activity Recognition (HAR) and kinematic analysis methods. In addition, the difficulties increase in unconstrained rehabilitation scenarios, which are much closer to the real use cases. In particular, our aim is to design an upper-limb assessment pipeline for stroke patients using smartwatches. We focus on the HAR task, as it is the first part of the assessment pipeline. Our main target is to automatically detect and recognize four key movements inspired by the Fugl-Meyer assessment scale, which are performed in both constrained and unconstrained scenarios. In addition to the application protocol and dataset, we propose two detection and classification baseline methods. We believe that the proposed framework, dataset and baseline results will serve to foster this research field.
Address: Virtual; January 2021
Conference: ICPRW
Notes: DAG; 600.121; 600.140
Approved: no
Call Number: Admin @ si @ BCF2020
Serial: 3508
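The abstract above proposes detection and classification baselines for four key movements recorded with smartwatch inertial sensors. A common way to set up such a HAR baseline is sliding-window feature extraction plus a standard classifier; the sketch below follows that pattern on synthetic data, and the window length, features and RandomForest choice are assumptions rather than the paper's exact baselines.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def windows(signal, labels, win=128, step=64):
    """Slice a (T, 3) accelerometer stream into fixed-length windows with majority labels."""
    X, y = [], []
    for start in range(0, len(signal) - win, step):
        seg = signal[start:start + win]
        feats = np.concatenate([seg.mean(0), seg.std(0), seg.min(0), seg.max(0)])
        X.append(feats)
        y.append(np.bincount(labels[start:start + win]).argmax())
    return np.array(X), np.array(y)

# Synthetic stand-in for a wrist-worn accelerometer recording (5 classes: 4 movements + rest).
rng = np.random.default_rng(0)
stream = rng.normal(size=(5000, 3))
stream_labels = rng.integers(0, 5, size=5000)

X, y = windows(stream, stream_labels)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.score(X, y))
```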
Author: Manuel Carbonell; Pau Riba; Mauricio Villegas; Alicia Fornes; Josep Llados
Title: Named Entity Recognition and Relation Extraction with Graph Neural Networks in Semi Structured Documents
Type: Conference Article
Year: 2020
Publication: 25th International Conference on Pattern Recognition
Abstract: The use of administrative documents to communicate and leave record of business information requires methods able to automatically extract and understand the content from such documents in a robust and efficient way. In addition, the semi-structured nature of these reports is especially suited for the use of graph-based representations, which are flexible enough to adapt to the deformations from the different document templates. Moreover, Graph Neural Networks provide the proper methodology to learn relations among the data elements in these documents. In this work we study the use of Graph Neural Network architectures to tackle the problem of entity recognition and relation extraction in semi-structured documents. Our approach achieves state-of-the-art results in the three tasks involved in the process. Additionally, the experimentation with two datasets of different nature demonstrates the good generalization ability of our approach.
Address: Virtual; January 2021
Conference: ICPR
Notes: DAG; 600.121
Approved: no
Call Number: Admin @ si @ CRV2020
Serial: 3509
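The abstract above uses Graph Neural Networks to jointly perform entity recognition (a label per node) and relation extraction (a score per candidate node pair) over a document graph. The sketch below is a generic message-passing formulation of that setup; the layer design, node features and candidate-pair selection are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class GraphLayer(nn.Module):
    """One round of mean-aggregation message passing over a document graph."""
    def __init__(self, dim):
        super().__init__()
        self.update = nn.Linear(2 * dim, dim)

    def forward(self, h, adj):
        # h: (N, dim) node features; adj: (N, N) row-normalised adjacency
        messages = adj @ h
        return torch.relu(self.update(torch.cat([h, messages], dim=-1)))

class DocGNN(nn.Module):
    """Node head tags entities; edge head scores whether two nodes are related."""
    def __init__(self, dim=64, n_entity_types=4):
        super().__init__()
        self.layers = nn.ModuleList([GraphLayer(dim) for _ in range(2)])
        self.node_head = nn.Linear(dim, n_entity_types)
        self.edge_head = nn.Linear(2 * dim, 1)

    def forward(self, h, adj, pairs):
        for layer in self.layers:
            h = layer(h, adj)
        entity_logits = self.node_head(h)                          # (N, n_entity_types)
        pair_feats = torch.cat([h[pairs[:, 0]], h[pairs[:, 1]]], dim=-1)
        relation_scores = self.edge_head(pair_feats).squeeze(-1)   # (P,)
        return entity_logits, relation_scores

N = 6
h = torch.randn(N, 64)                   # e.g. text-line embeddings
adj = torch.ones(N, N) / N               # fully connected, row-normalised
pairs = torch.tensor([[0, 1], [2, 3]])   # candidate relations
ent, rel = DocGNN()(h, adj, pairs)
```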
Author: M. Li; Xialei Liu; Joost Van de Weijer; Bogdan Raducanu
Title: Learning to Rank for Active Learning: A Listwise Approach
Type: Conference Article
Year: 2020
Publication: 25th International Conference on Pattern Recognition
Pages: 5587-5594
Abstract: Active learning emerged as an alternative to alleviate the effort of labeling huge amounts of data for data-hungry applications (such as image/video indexing and retrieval, autonomous driving, etc.). The goal of active learning is to automatically select a number of unlabeled samples for annotation (according to a budget), based on an acquisition function, which indicates how valuable a sample is for training the model. The learning loss method is a task-agnostic approach which attaches a module that learns to predict the target loss of unlabeled data and selects data with the highest loss for labeling. In this work, we follow this strategy, but we define the acquisition function as a learning-to-rank problem and rethink the structure of the loss prediction module, using a simple but effective listwise approach. Experimental results on four datasets demonstrate that our method outperforms recent state-of-the-art active learning approaches for both image classification and regression tasks.
Address: Virtual; January 2021
Conference: ICPR
Notes: LAMP; 600.120
Approved: no
Call Number: Admin @ si @ LLW2020a
Serial: 3511
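The abstract above recasts loss prediction as a listwise ranking problem: the module only needs to order unlabeled samples by how informative (high-loss) they are, and the acquisition step then labels the top-ranked ones. The sketch below uses a ListMLE-style listwise loss as one concrete instance of this idea; the paper's exact listwise formulation may differ.

```python
import torch

def listmle_loss(pred_scores, true_losses):
    """Listwise ranking loss: maximise the likelihood of the ground-truth ordering
    (samples sorted by their true training loss) under the predicted scores."""
    order = torch.argsort(true_losses, descending=True)
    s = pred_scores[order]
    # -log P(ordering) = sum_i [ logsumexp(s_i .. s_n) - s_i ]
    return (torch.logcumsumexp(s.flip(0), dim=0).flip(0) - s).sum()

def acquire(pred_scores, budget):
    """Acquisition: send the samples predicted to have the highest loss for labeling."""
    return torch.topk(pred_scores, budget).indices

pred = torch.randn(10, requires_grad=True)   # output of the loss-prediction module
true = torch.rand(10)                        # target losses of the task model
listmle_loss(pred, true).backward()
print(acquire(pred.detach(), budget=3))
```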
Author: Idoia Ruiz; Joan Serrat
Title: Rank-based ordinal classification
Type: Conference Article
Year: 2020
Publication: 25th International Conference on Pattern Recognition
Pages: 8069-8076
Abstract: Differently from the regular classification task, in ordinal classification there is an order in the classes. As a consequence, not all classification errors matter the same: a predicted class close to the groundtruth one is better than predicting a farther away class. To account for this, most previous works employ loss functions based on the absolute difference between the predicted and groundtruth class labels. We argue that there are many cases in ordinal classification where label values are arbitrary (for instance 1…C, where C is the number of classes) and thus such loss functions may not be the best choice. We instead propose a network architecture that produces not a single class prediction but an ordered vector, or ranking, of all the possible classes from most to least likely. This is thanks to a loss function that compares groundtruth and predicted rankings of these class labels, not the labels themselves. Another advantage of this new formulation is that we can enforce consistency in the predictions, namely, predicted rankings come from some unimodal vector of scores with mode at the groundtruth class. We compare with state-of-the-art ordinal classification methods, showing that ours attains equal or better performance, as measured by common ordinal classification metrics, on three benchmark datasets. Furthermore, it is also suitable for a new task on image aesthetics assessment, i.e. most voted score prediction. Finally, we also apply it to building damage assessment from satellite images, providing an analysis of its performance depending on the degree of imbalance of the dataset.
Address: Virtual; January 2021
Conference: ICPR
Notes: ADAS; 600.118; 600.124
Approved: no
Call Number: Admin @ si @ RuS2020
Serial: 3549
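The abstract above predicts a full ranking of the classes and trains it against the ranking induced by the ground-truth label (classes ordered by their distance to it), so that scores are unimodal with the mode at the true class. The sketch below illustrates that target ranking together with a simple pairwise hinge that pushes scores to decrease along it; this particular hinge is an assumption, the paper's ranking-comparison loss is not reproduced here.

```python
import torch

def target_ranking(gt_class, n_classes):
    """Classes ordered from most to least likely: closest to the ground truth first."""
    return torch.argsort(torch.abs(torch.arange(n_classes) - gt_class))

def ranking_loss(scores, gt_class, margin=0.1):
    """Encourage scores to decrease along the ground-truth ranking (pairwise hinge)."""
    order = target_ranking(gt_class, scores.numel())
    s = scores[order]                              # should be monotonically decreasing
    return torch.clamp(margin - (s[:-1] - s[1:]), min=0).sum()

scores = torch.randn(5, requires_grad=True)        # network output for C = 5 classes
ranking_loss(scores, gt_class=2).backward()
predicted_ranking = torch.argsort(scores.detach(), descending=True)
print(predicted_ranking)                           # a full ordering, not a single class
```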
Author: Klara Janouskova; Jiri Matas; Lluis Gomez; Dimosthenis Karatzas
Title: Text Recognition – Real World Data and Where to Find Them
Type: Conference Article
Year: 2020
Publication: 25th International Conference on Pattern Recognition
Pages: 4489-4496
Abstract: We present a method for exploiting weakly annotated images to improve text extraction pipelines. The approach uses an arbitrary end-to-end text recognition system to obtain text region proposals and their, possibly erroneous, transcriptions. The method includes matching of imprecise transcriptions to weak annotations and an edit distance guided neighbourhood search. It produces nearly error-free, localised instances of scene text, which we treat as “pseudo ground truth” (PGT). The method is applied to two weakly-annotated datasets. Training with the extracted PGT consistently improves the accuracy of a state-of-the-art recognition model, by 3.7% on average across different benchmark datasets (image domains) and by 24.5% on one of the weakly annotated datasets. Acknowledgements: the authors were supported by Czech Technical University student grant SGS20/171/0HK3/3TJ13, the MEYS VVV project CZ.02.1.01/0.010.0J16 019/0000765 Research Center for Informatics, the Spanish Research project TIN2017-89779-P and the CERCA Programme / Generalitat de Catalunya.
Address: Virtual; January 2021
Conference: ICPR
Notes: DAG; 600.121; 600.129
Approved: no
Call Number: Admin @ si @ JMG2020
Serial: 3557
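The abstract above builds pseudo ground truth by matching possibly erroneous end-to-end transcriptions to the image's weak annotations under an edit-distance criterion. The sketch below shows that matching step in isolation; difflib's similarity ratio stands in for a normalised edit distance, and the threshold and the neighbourhood search are assumptions or omitted.

```python
from difflib import SequenceMatcher

def edit_similarity(a, b):
    """Cheap stand-in for a normalised edit-distance similarity."""
    return SequenceMatcher(None, a, b).ratio()

def match_to_weak_annotations(proposals, weak_words, min_sim=0.8):
    """Keep OCR proposals whose transcription closely matches a weakly annotated word;
    the matched annotation becomes the pseudo ground truth (PGT) label."""
    pgt = []
    for box, text in proposals:
        best = max(weak_words, key=lambda w: edit_similarity(text, w))
        if edit_similarity(text, best) >= min_sim:
            pgt.append((box, best))
    return pgt

proposals = [((10, 10, 60, 30), "STARBUCK5"), ((80, 10, 140, 30), "xyz")]
weak_words = ["STARBUCKS", "COFFEE"]
print(match_to_weak_annotations(proposals, weak_words))
```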
Author: Zhengying Liu; Adrien Pavao; Zhen Xu; Sergio Escalera; Isabelle Guyon; Julio C. S. Jacques Junior; Meysam Madadi; Sebastien Treguer
Title: How far are we from true AutoML: reflection from winning solutions and results of AutoDL challenge
Type: Conference Article
Year: 2020
Publication: 7th ICML Workshop on Automated Machine Learning
Abstract: Following the completion of the AutoDL challenge (the final challenge in the ChaLearn AutoDL challenge series 2019), we investigate winning solutions and challenge results to answer an important motivational question: how far are we from achieving true AutoML? On one hand, the winning solutions achieve good (accurate and fast) classification performance on unseen datasets. On the other hand, all winning solutions still contain a considerable amount of hard-coded knowledge on the domain (or modality) such as image, video, text, speech and tabular data. This form of ad-hoc meta-learning could be replaced by more automated forms of meta-learning in the future. Organizing a meta-learning challenge could help forge AutoML solutions that generalize to new unseen domains (e.g. new types of sensor data), as well as gain insights into the AutoML problem from a more fundamental point of view. The datasets of the AutoDL challenge are a resource that can be used for further benchmarks, and the code of the winners has been open-sourced, which is a big step towards “democratizing” Deep Learning.
Address: Virtual; July 2020
Conference: ICML
Notes: HUPBA
Approved: no
Call Number: Admin @ si @ LPX2020
Serial: 3502
Author: Marc Masana; Bartlomiej Twardowski; Joost Van de Weijer
Title: On Class Orderings for Incremental Learning
Type: Conference Article
Year: 2020
Publication: ICML Workshop on Continual Learning
Abstract: The influence of class orderings in the evaluation of incremental learning has received very little attention. In this paper, we investigate the impact of class orderings for incrementally learned classifiers. We propose a method to compute various orderings for a dataset. The orderings are derived by simulated annealing optimization from the confusion matrix and reflect different incremental learning scenarios, including maximally and minimally confusing tasks. We evaluate a wide range of state-of-the-art incremental learning methods on the proposed orderings. Results show that orderings can have a significant impact on performance and the ranking of the methods.
Address: Virtual; July 2020
Conference: ICMLW
Notes: LAMP; 600.120
Approved: no
Call Number: Admin @ si @ MTW2020
Serial: 3505
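The abstract above derives class orderings by simulated annealing on the confusion matrix, so that the resulting task sequences are maximally or minimally confusing. The sketch below anneals a permutation of class indices under an illustrative objective (summed confusion between consecutive classes); the paper's exact objective and annealing schedule are assumptions here.

```python
import math
import random

def ordering_score(order, confusion):
    """Summed confusion between consecutive classes: a proxy for how confusing the ordering is."""
    return sum(confusion[a][b] + confusion[b][a] for a, b in zip(order, order[1:]))

def anneal_ordering(confusion, steps=5000, t0=1.0, maximize=True):
    """Simulated annealing over permutations of the class indices."""
    sign = 1.0 if maximize else -1.0
    order = list(range(len(confusion)))
    random.shuffle(order)
    best = order[:]
    for step in range(steps):
        t = max(t0 * (1 - step / steps), 1e-9)
        i, j = random.sample(range(len(order)), 2)      # propose a swap of two classes
        cand = order[:]
        cand[i], cand[j] = cand[j], cand[i]
        delta = sign * (ordering_score(cand, confusion) - ordering_score(order, confusion))
        if delta > 0 or random.random() < math.exp(delta / t):
            order = cand
        if sign * ordering_score(order, confusion) > sign * ordering_score(best, confusion):
            best = order[:]
    return best, ordering_score(best, confusion)

# Toy 4-class confusion matrix (rows: true class, columns: predicted class).
C = [[0, 5, 1, 0], [4, 0, 0, 1], [1, 0, 0, 6], [0, 2, 5, 0]]
print(anneal_ordering(C, maximize=True))    # maximally confusing ordering
print(anneal_ordering(C, maximize=False))   # minimally confusing ordering
```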
Author: David Berga; Marc Masana; Joost Van de Weijer
Title: Disentanglement of Color and Shape Representations for Continual Learning
Type: Conference Article
Year: 2020
Publication: ICML Workshop on Continual Learning
Abstract: We hypothesize that disentangled feature representations suffer less from catastrophic forgetting. As a case study we perform explicit disentanglement of color and shape, by adjusting the network architecture. We tested classification accuracy and forgetting in a task-incremental setting with the Oxford-102 Flowers dataset. We combine our method with Elastic Weight Consolidation, Learning without Forgetting, Synaptic Intelligence and Memory Aware Synapses, and show that feature disentanglement positively impacts continual learning performance.
Address: Virtual; July 2020
Conference: ICMLW
Notes: LAMP; 600.120
Approved: no
Call Number: Admin @ si @ BMW2020
Serial: 3506
Author: Jialuo Chen; M.A. Souibgui; Alicia Fornes; Beata Megyesi
Title: A Web-based Interactive Transcription Tool for Encrypted Manuscripts
Type: Conference Article
Year: 2020
Publication: 3rd International Conference on Historical Cryptology
Pages: 52-59
Abstract: Manual transcription of handwritten text is a time-consuming task. In the case of encrypted manuscripts, the recognition is even more complex due to the huge variety of alphabets and symbol sets. To speed up and ease this process, we present a web-based tool aimed at (semi-)automatically transcribing the encrypted sources. The user uploads one or several images of the desired encrypted document(s) as input, and the system returns the transcription(s). This process is carried out in an interactive fashion with the user to obtain more accurate results. The developed web tool is freely available for discovery and testing.
Address: Virtual; June 2020
Conference: HistoCrypt
Notes: DAG; 600.140; 602.230; 600.121
Approved: no
Call Number: Admin @ si @ CSF2020
Serial: 3447
Author: Lorenzo Porzi; Markus Hofinger; Idoia Ruiz; Joan Serrat; Samuel Rota Bulo; Peter Kontschieder
Title: Learning Multi-Object Tracking and Segmentation from Automatic Annotations
Type: Conference Article
Year: 2020
Publication: 33rd IEEE Conference on Computer Vision and Pattern Recognition
Pages: 6845-6854
Abstract: In this work we contribute a novel pipeline to automatically generate training data, and to improve over state-of-the-art multi-object tracking and segmentation (MOTS) methods. Our proposed track mining algorithm turns raw street-level videos into high-fidelity MOTS training data, is scalable, and overcomes the need for expensive and time-consuming manual annotation approaches. We leverage state-of-the-art instance segmentation results in combination with optical flow predictions, also trained on automatically harvested training data. Our second major contribution is MOTSNet – a deep learning, tracking-by-detection architecture for MOTS – deploying a novel mask-pooling layer for improved object association over time. Training MOTSNet with our automatically extracted data leads to significantly improved sMOTSA scores on the novel KITTI MOTS dataset (+1.9%/+7.5% on cars/pedestrians), and MOTSNet improves by +4.1% over previously best methods on the MOTSChallenge dataset. Our most impressive finding is that we can improve over previous best-performing works, even in complete absence of manually annotated MOTS training data.
Address: Virtual; June 2020
Conference: CVPR
Notes: ADAS; 600.124; 600.118
Approved: no
Call Number: Admin @ si @ PHR2020
Serial: 3402
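The MOTS abstract above introduces a mask-pooling layer that aggregates backbone features inside each predicted instance mask, yielding one embedding per object for association across frames. The standalone function below only illustrates that pooling idea; in MOTSNet the layer sits inside the detection architecture, and the cosine-similarity association shown at the end is an assumption.

```python
import torch
import torch.nn.functional as F

def mask_pool(features, masks):
    """Average feature vectors inside each instance mask.
    features: (C, H, W) backbone feature map; masks: (N, H, W) binary instance masks.
    Returns one embedding per instance, usable for frame-to-frame association."""
    C = features.shape[0]
    flat_feats = features.reshape(C, -1)                     # (C, H*W)
    flat_masks = masks.reshape(masks.shape[0], -1).float()   # (N, H*W)
    area = flat_masks.sum(dim=1, keepdim=True).clamp(min=1.0)
    return (flat_masks @ flat_feats.T) / area                # (N, C)

feats = torch.randn(64, 32, 32)
masks = torch.zeros(2, 32, 32)
masks[0, :16, :16] = 1
masks[1, 16:, 16:] = 1
emb = mask_pool(feats, masks)                                # (2, 64)

# Associate instances across frames by embedding similarity, e.g. cosine:
prev_emb = torch.randn(3, 64)                                # embeddings from the previous frame
sim = F.cosine_similarity(emb[:, None], prev_emb[None], dim=-1)
print(sim.shape)                                             # torch.Size([2, 3])
```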
Author: Debora Gil; Guillermo Torres
Title: A multi-shape loss function with adaptive class balancing for the segmentation of lung structures
Type: Conference Article
Year: 2020
Publication: 34th International Congress and Exhibition on Computer Assisted Radiology & Surgery
Address: Virtual; June 2020
Conference: CARS
Notes: IAM; 600.139; 600.145
Approved: no
Call Number: Admin @ si @ GiT2020
Serial: 3472
Author: Yaxing Wang; Salman Khan; Abel Gonzalez-Garcia; Joost Van de Weijer; Fahad Shahbaz Khan
Title: Semi-supervised Learning for Few-shot Image-to-Image Translation
Type: Conference Article
Year: 2020
Publication: 33rd IEEE Conference on Computer Vision and Pattern Recognition
Abstract: In the last few years, unpaired image-to-image translation has witnessed remarkable progress. Although the latest methods are able to generate realistic images, they crucially rely on a large number of labeled images. Recently, some methods have tackled the challenging setting of few-shot image-to-image translation, reducing the labeled data requirements for the target domain during inference. In this work, we go one step further and reduce the amount of required labeled data also from the source domain during training. To do so, we propose applying semi-supervised learning via a noise-tolerant pseudo-labeling procedure. We also apply a cycle consistency constraint to further exploit the information from unlabeled images, either from the same dataset or external. Additionally, we propose several structural modifications to facilitate the image translation task under these circumstances. Our semi-supervised method for few-shot image translation, called SEMIT, achieves excellent results on four different datasets using as little as 10% of the source labels, and matches the performance of the main fully-supervised competitor using only 20% labeled data. Our code and models are made publicly available.
Address: Virtual; June 2020
Conference: CVPR
Notes: LAMP; 600.120
Approved: no
Call Number: Admin @ si @ WKG2020
Serial: 3486
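The SEMIT abstract above reduces source-domain label requirements through semi-supervised learning with a noise-tolerant pseudo-labeling procedure plus a cycle-consistency constraint. The sketch below shows only the simplest form of confidence-thresholded pseudo-labeling on unlabeled source images; the paper's noise-tolerant variant and the cycle constraint are more involved, and the toy classifier here is an assumption.

```python
import torch
import torch.nn.functional as F

def pseudo_label(classifier, unlabeled_images, threshold=0.9):
    """Keep only confident predictions on unlabeled images so they can be
    reused as if they were labeled (basic pseudo-labeling)."""
    with torch.no_grad():
        probs = F.softmax(classifier(unlabeled_images), dim=1)
    conf, labels = probs.max(dim=1)
    keep = conf >= threshold
    return unlabeled_images[keep], labels[keep]

# Toy stand-in for the source-domain class predictor.
classifier = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
unlabeled = torch.randn(8, 3, 32, 32)
imgs, labels = pseudo_label(classifier, unlabeled, threshold=0.5)
print(imgs.shape, labels.shape)   # pseudo-labeled subset used as extra supervision
```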