toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author Marc Masana; Bartlomiej Twardowski; Joost Van de Weijer edit   pdf
openurl 
  Title On Class Orderings for Incremental Learning Type Conference Article
  Year 2020 Publication ICML Workshop on Continual Learning Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (up) The influence of class orderings in the evaluation of incremental learning has received very little attention. In this paper, we investigate the impact of class orderings for incrementally learned classifiers. We propose a method to compute various orderings for a dataset. The orderings are derived by simulated annealing optimization from the confusion matrix and reflect different incremental learning scenarios, including maximally and minimally confusing tasks. We evaluate a wide range of state-of-the-art incremental learning methods on the proposed orderings. Results show that orderings can have a significant impact on performance and the ranking of the methods.  
  Address Virtual; July 2020  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICMLW  
  Notes LAMP; 600.120 Approved no  
  Call Number Admin @ si @ MTW2020 Serial 3505  
Permanent link to this record
 

 
Author Manuel Carbonell; Pau Riba; Mauricio Villegas; Alicia Fornes; Josep Llados edit   pdf
openurl 
  Title Named Entity Recognition and Relation Extraction with Graph Neural Networks in Semi Structured Documents Type Conference Article
  Year 2020 Publication 25th International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (up) The use of administrative documents to communicate and leave record of business information requires of methods
able to automatically extract and understand the content from
such documents in a robust and efficient way. In addition,
the semi-structured nature of these reports is specially suited
for the use of graph-based representations which are flexible
enough to adapt to the deformations from the different document
templates. Moreover, Graph Neural Networks provide the proper
methodology to learn relations among the data elements in
these documents. In this work we study the use of Graph
Neural Network architectures to tackle the problem of entity
recognition and relation extraction in semi-structured documents.
Our approach achieves state of the art results in the three
tasks involved in the process. Additionally, the experimentation
with two datasets of different nature demonstrates the good
generalization ability of our approach.
 
  Address Virtual; January 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ CRV2020 Serial 3509  
Permanent link to this record
 

 
Author Yaxing Wang; Luis Herranz; Joost Van de Weijer edit   pdf
url  doi
openurl 
  Title Mix and match networks: multi-domain alignment for unpaired image-to-image translation Type Journal Article
  Year 2020 Publication International Journal of Computer Vision Abbreviated Journal IJCV  
  Volume 128 Issue Pages 2849–2872  
  Keywords  
  Abstract (up) This paper addresses the problem of inferring unseen cross-modal image-to-image translations between multiple modalities. We assume that only some of the pairwise translations have been seen (i.e. trained) and infer the remaining unseen translations (where training pairs are not available). We propose mix and match networks, an approach where multiple encoders and decoders are aligned in such a way that the desired translation can be obtained by simply cascading the source encoder and the target decoder, even when they have not interacted during the training stage (i.e. unseen). The main challenge lies in the alignment of the latent representations at the bottlenecks of encoder-decoder pairs. We propose an architecture with several tools to encourage alignment, including autoencoders and robust side information and latent consistency losses. We show the benefits of our approach in terms of effectiveness and scalability compared with other pairwise image-to-image translation approaches. We also propose zero-pair cross-modal image translation, a challenging setting where the objective is inferring semantic segmentation from depth (and vice-versa) without explicit segmentation-depth pairs, and only from two (disjoint) segmentation-RGB and depth-RGB training sets. We observe that a certain part of the shared information between unseen modalities might not be reachable, so we further propose a variant that leverages pseudo-pairs which allows us to exploit this shared information between the unseen modalities  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.109; 600.106; 600.141; 600.120 Approved no  
  Call Number Admin @ si @ WHW2020 Serial 3424  
Permanent link to this record
 

 
Author Henry Velesaca; Raul Mira; Patricia Suarez; Christian X. Larrea; Angel Sappa edit   pdf
url  openurl
  Title Deep Learning Based Corn Kernel Classification Type Conference Article
  Year 2020 Publication 1st International Workshop and Prize Challenge on Agriculture-Vision: Challenges & Opportunities for Computer Vision in Agriculture Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (up) This paper presents a full pipeline to classify sample sets of corn kernels. The proposed approach follows a segmentation-classification scheme. The image segmentation is performed through a well known deep learningbased approach, the Mask R-CNN architecture, while the classification is performed hrough a novel-lightweight network specially designed for this task—good corn kernel, defective corn kernel and impurity categories are considered. As a second contribution, a carefully annotated multitouching corn kernel dataset has been generated. This dataset has been used for training the segmentation and the classification modules. Quantitative evaluations have been
performed and comparisons with other approaches are provided showing improvements with the proposed pipeline.
 
  Address Virtual CVPR  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPRW  
  Notes MSIAU; 600.130; 600.122 Approved no  
  Call Number Admin @ si @ VMS2020 Serial 3430  
Permanent link to this record
 

 
Author Jorge Charco; Angel Sappa; Boris X. Vintimilla; Henry Velesaca edit   pdf
doi  openurl
  Title Transfer Learning from Synthetic Data in the Camera Pose Estimation Problem Type Conference Article
  Year 2020 Publication 15th International Conference on Computer Vision Theory and Applications Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (up) This paper presents a novel Siamese network architecture, as a variant of Resnet-50, to estimate the relative camera pose on multi-view environments. In order to improve the performance of the proposed model a transfer learning strategy, based on synthetic images obtained from a virtual-world, is considered. The transfer learning consists of first training the network using pairs of images from the virtual-world scenario
considering different conditions (i.e., weather, illumination, objects, buildings, etc.); then, the learned weight
of the network are transferred to the real case, where images from real-world scenarios are considered. Experimental results and comparisons with the state of the art show both, improvements on the relative pose estimation accuracy using the proposed model, as well as further improvements when the transfer learning strategy (synthetic-world data transfer learning real-world data) is considered to tackle the limitation on the
training due to the reduced number of pairs of real-images on most of the public data sets.
 
  Address Valletta; Malta; February 2020  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference VISAPP  
  Notes MSIAU; 600.130; 601.349; 600.122 Approved no  
  Call Number Admin @ si @ CSV2020 Serial 3433  
Permanent link to this record
 

 
Author Minesh Mathew; Ruben Tito; Dimosthenis Karatzas; R.Manmatha; C.V. Jawahar edit   pdf
url  openurl
  Title Document Visual Question Answering Challenge 2020 Type Conference Article
  Year 2020 Publication 33rd IEEE Conference on Computer Vision and Pattern Recognition – Short paper Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (up) This paper presents results of Document Visual Question Answering Challenge organized as part of “Text and Documents in the Deep Learning Era” workshop, in CVPR 2020. The challenge introduces a new problem – Visual Question Answering on document images. The challenge comprised two tasks. The first task concerns with asking questions on a single document image. On the other hand, the second task is set as a retrieval task where the question is posed over a collection of images. For the task 1 a new dataset is introduced comprising 50,000 questions-answer(s) pairs defined over 12,767 document images. For task 2 another dataset has been created comprising 20 questions over 14,362 document images which share the same document template.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPR  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ MTK2020 Serial 3558  
Permanent link to this record
 

 
Author Henry Velesaca; Steven Araujo; Patricia Suarez; Angel Sanchez; Angel Sappa edit   pdf
openurl 
  Title Off-the-Shelf Based System for Urban Environment Video Analytics Type Conference Article
  Year 2020 Publication 27th International Conference on Systems, Signals and Image Processing Abbreviated Journal  
  Volume Issue Pages  
  Keywords greenhouse gases; carbon footprint; object detection; object tracking; website framework; off-the-shelf video analytics  
  Abstract (up) This paper presents the design and implementation details of a system build-up by using off-the-shelf algorithms for urban video analytics. The system allows the connection to
public video surveillance camera networks to obtain the necessary information to generate statistics from urban scenarios (e.g., amount of vehicles, type of cars, direction, numbers of persons, etc.). The obtained information could be used not only for traffic management but also to estimate the carbon footprint of urban scenarios. As a case study, a university campus is selected to evaluate the performance of the proposed system. The system is implemented in a modular way so that it is being used as a testbed to evaluate different algorithms. Implementation results are provided showing the validity and utility of the proposed approach.
 
  Address Virtual IWSSIP  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference IWSSIP  
  Notes MSIAU; 600.130; 601.349; 600.122 Approved no  
  Call Number Admin @ si @ VAS2020 Serial 3429  
Permanent link to this record
 

 
Author Xavier Soria; Edgar Riba; Angel Sappa edit   pdf
url  doi
openurl 
  Title Dense Extreme Inception Network: Towards a Robust CNN Model for Edge Detection Type Conference Article
  Year 2020 Publication IEEE Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (up) This paper proposes a Deep Learning based edge detector, which is inspired on both HED (Holistically-Nested Edge Detection) and Xception networks. The proposed approach generates thin edge-maps that are plausible for human eyes; it can be used in any edge detection task without previous training or fine tuning process. As a second contribution, a large dataset with carefully annotated edges has been generated. This dataset has been used for training the proposed approach as well the state-of-the-art algorithms for comparisons. Quantitative and qualitative evaluations have been performed on different benchmarks showing improvements with the proposed method when F-measure of ODS and OIS are considered.  
  Address Aspen; USA; March 2020  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes MSIAU; 600.130; 601.349; 600.122 Approved no  
  Call Number Admin @ si @ SRS2020 Serial 3434  
Permanent link to this record
 

 
Author Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla edit   pdf
openurl 
  Title Thermal Image Super-resolution: A Novel Architecture and Dataset Type Conference Article
  Year 2020 Publication 15th International Conference on Computer Vision Theory and Applications Abbreviated Journal  
  Volume Issue Pages 111-119  
  Keywords  
  Abstract (up) This paper proposes a novel CycleGAN architecture for thermal image super-resolution, together with a large dataset consisting of thermal images at different resolutions. The dataset has been acquired using three thermal cameras at different resolutions, which acquire images from the same scenario at the same time. The thermal cameras are mounted in rig trying to minimize the baseline distance to make easier the registration problem.
The proposed architecture is based on ResNet6 as a Generator and PatchGAN as Discriminator. The novelty on the proposed unsupervised super-resolution training (CycleGAN) is possible due to the existence of aforementioned thermal images—images of the same scenario with different resolutions. The proposed approach is evaluated in the dataset and compared with classical bicubic interpolation. The dataset and the network are available.
 
  Address Valletta; Malta; February 2020  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference VISAPP  
  Notes MSIAU; 600.130; 600.122 Approved no  
  Call Number Admin @ si @ RSV2020 Serial 3432  
Permanent link to this record
 

 
Author Albert Clapes; Julio C. S. Jacques Junior; Carla Morral; Sergio Escalera edit  url
openurl 
  Title ChaLearn LAP 2020 Challenge on Identity-preserved Human Detection: Dataset and Results Type Conference Article
  Year 2020 Publication 15th IEEE International Conference on Automatic Face and Gesture Recognition Abbreviated Journal  
  Volume Issue Pages 801-808  
  Keywords  
  Abstract (up) This paper summarizes the ChaLearn Looking at People 2020 Challenge on Identity-preserved Human Detection (IPHD). For the purpose, we released a large novel dataset containing more than 112K pairs of spatiotemporally aligned depth and thermal frames (and 175K instances of humans) sampled from 780 sequences. The sequences contain hundreds of non-identifiable people appearing in a mix of in-the-wild and scripted scenarios recorded in public and private places. The competition was divided into three tracks depending on the modalities exploited for the detection: (1) depth, (2) thermal, and (3) depth-thermal fusion. Color was also captured but only used to facilitate the groundtruth annotation. Still the temporal synchronization of three sensory devices is challenging, so bad temporal matches across modalities can occur. Hence, the labels provided should considered “weak”, although test frames were carefully selected to minimize this effect and ensure the fairest comparison of the participants’ results. Despite this added difficulty, the results got by the participants demonstrate current fully-supervised methods can deal with that and achieve outstanding detection performance when measured in terms of AP@0.50.  
  Address Virtual; November 2020  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference FG  
  Notes HUPBA Approved no  
  Call Number Admin @ si @ CJM2020 Serial 3501  
Permanent link to this record
 

 
Author Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla; Lin Guo; Jiankun Hou; Armin Mehri; Parichehr Behjati Ardakani; Heena Patel; Vishal Chudasama; Kalpesh Prajapati; Kishor P. Upla; Raghavendra Ramachandra; Kiran Raja; Christoph Busch; Feras Almasri; Olivier Debeir; Sabari Nathan; Priya Kansal; Nolan Gutierrez; Bardia Mojra; William J. Beksi edit   pdf
url  openurl
  Title Thermal Image Super-Resolution Challenge – PBVS 2020 Type Conference Article
  Year 2020 Publication 16h IEEE Workshop on Perception Beyond the Visible Spectrum Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (up) This paper summarizes the top contributions to the first challenge on thermal image super-resolution (TISR), which was organized as part of the Perception Beyond the Visible Spectrum (PBVS) 2020 workshop. In this challenge, a novel thermal image dataset is considered together with state-of-the-art approaches evaluated under a common framework. The dataset used in the challenge consists of 1021 thermal images, obtained from three distinct thermal cameras at different resolutions (low-resolution, mid-resolution, and high-resolution), resulting in a total of 3063 thermal images. From each resolution, 951 images are used for training and 50 for testing while the 20 remaining images are used for two proposed evaluations. The first evaluation consists of downsampling the low-resolution, mid-resolution, and high-resolution thermal images by x2, x3 and x4 respectively, and comparing their super-resolution results with the corresponding ground truth images. The second evaluation is comprised of obtaining the x2 super-resolution from a given mid-resolution thermal image and comparing it with the corresponding semi-registered high-resolution thermal image. Out of 51 registered participants, 6 teams reached the final validation phase.  
  Address Virtual CVPR  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPRW  
  Notes MSIAU; ISE; 600.119; 600.122 Approved no  
  Call Number Admin @ si @ RSV2020 Serial 3431  
Permanent link to this record
 

 
Author Sounak Dey edit  isbn
openurl 
  Title Mapping between Images and Conceptual Spaces: Sketch-based Image Retrieval Type Book Whole
  Year 2020 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (up) This thesis presents several contributions to the literature of sketch based image retrieval (SBIR). In SBIR the first challenge we face is how to map two different domains to common space for effective retrieval of images, while tackling the different levels of abstraction people use to express their notion of objects around while sketching. To this extent we first propose a cross-modal learning framework that maps both sketches and text into a joint embedding space invariant to depictive style, while preserving semantics. Then we have also investigated different query types possible to encompass people's dilema in sketching certain world objects. For this we propose an approach for multi-modal image retrieval in multi-labelled images. A multi-modal deep network architecture is formulated to jointly model sketches and text as input query modalities into a common embedding space, which is then further aligned with the image feature space. This permits encoding the object-based features and its alignment with the query irrespective of the availability of the co-occurrence of different objects in the training set.

Finally, we explore the problem of zero-shot sketch-based image retrieval (ZS-SBIR), where human sketches are used as queries to conduct retrieval of photos from unseen categories. We importantly advance prior arts by proposing a novel ZS-SBIR scenario that represents a firm step forward in its practical application. The new setting uniquely recognises two important yet often neglected challenges of practical ZS-SBIR, (i) the large domain gap between amateur sketch and photo, and (ii) the necessity for moving towards large-scale retrieval. We first contribute to the community a novel ZS-SBIR dataset, QuickDraw-Extended. We also in this dissertation pave the path to the future direction of research in this domain.
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Josep Llados;Umapada Pal  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-121011-8-8 Medium  
  Area Expedition Conference  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ Dey20 Serial 3480  
Permanent link to this record
 

 
Author Sergio Escalera; Ralf Herbrich edit  url
doi  isbn
openurl 
  Title The NeurIPS’18 Competition: From Machine Learning to Intelligent Conversations Type Book Whole
  Year 2020 Publication The Springer Series on Challenges in Machine Learning Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (up) This volume presents the results of the Neural Information Processing Systems Competition track at the 2018 NeurIPS conference. The competition follows the same format as the 2017 competition track for NIPS. Out of 21 submitted proposals, eight competition proposals were selected, spanning the area of Robotics, Health, Computer Vision, Natural Language Processing, Systems and Physics. Competitions have become an integral part of advancing state-of-the-art in artificial intelligence (AI). They exhibit one important difference to benchmarks: Competitions test a system end-to-end rather than evaluating only a single component; they assess the practicability of an algorithmic solution in addition to assessing feasibility.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor Sergio Escalera; Ralf Hebrick  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 2520-1328 ISBN 978-3-030-29134-1 Medium  
  Area Expedition Conference  
  Notes HuPBA; no menciona Approved no  
  Call Number Admin @ si @ HeE2020 Serial 3328  
Permanent link to this record
 

 
Author Angel Morera; Angel Sanchez; A. Belen Moreno; Angel Sappa; Jose F. Velez edit   pdf
url  openurl
  Title SSD vs. YOLO for Detection of Outdoor Urban Advertising Panels under Multiple Variabilities Type Journal Article
  Year 2020 Publication Sensors Abbreviated Journal SENS  
  Volume 20 Issue 16 Pages 4587  
  Keywords  
  Abstract (up) This work compares Single Shot MultiBox Detector (SSD) and You Only Look Once (YOLO) deep neural networks for the outdoor advertisement panel detection problem by handling multiple and combined variabilities in the scenes. Publicity panel detection in images offers important advantages both in the real world as well as in the virtual one. For example, applications like Google Street View can be used for Internet publicity and when detecting these ads panels in images, it could be possible to replace the publicity appearing inside the panels by another from a funding company. In our experiments, both SSD and YOLO detectors have produced acceptable results under variable sizes of panels, illumination conditions, viewing perspectives, partial occlusion of panels, complex background and multiple panels in scenes. Due to the difficulty of finding annotated images for the considered problem, we created our own dataset for conducting the experiments. The major strength of the SSD model was the almost elimination of False Positive (FP) cases, situation that is preferable when the publicity contained inside the panel is analyzed after detecting them. On the other side, YOLO produced better panel localization results detecting a higher number of True Positive (TP) panels with a higher accuracy. Finally, a comparison of the two analyzed object detection models with different types of semantic segmentation networks and using the same evaluation metrics is also included.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MSIAU; 600.130; 601.349; 600.122 Approved no  
  Call Number Admin @ si @ MSM2020 Serial 3452  
Permanent link to this record
 

 
Author Hugo Bertiche; Meysam Madadi; Sergio Escalera edit   pdf
openurl 
  Title CLOTH3D: Clothed 3D Humans Type Conference Article
  Year 2020 Publication 16th European Conference on Computer Vision Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (up) This work presents CLOTH3D, the first big scale synthetic dataset of 3D clothed human sequences. CLOTH3D contains a large variability on garment type, topology, shape, size, tightness and fabric. Clothes are simulated on top of thousands of different pose sequences and body shapes, generating realistic cloth dynamics. We provide the dataset with a generative model for cloth generation. We propose a Conditional Variational Auto-Encoder (CVAE) based on graph convolutions (GCVAE) to learn garment latent spaces. This allows for realistic generation of 3D garments on top of SMPL model for any pose and shape.  
  Address Virtual; August 2020  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECCV  
  Notes HUPBA Approved no  
  Call Number Admin @ si @ BME2020 Serial 3519  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: