Records
Author Javier Marin; David Vazquez; Antonio Lopez; Jaume Amores; Bastian Leibe
Title Random Forests of Local Experts for Pedestrian Detection Type Conference Article
Year 2013 Publication 15th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 2592-2599
Keywords ADAS; Random Forest; Pedestrian Detection
Abstract Pedestrian detection is one of the most challenging tasks in computer vision, and has received a lot of attention in recent years. Recently, some authors have shown the advantages of using combinations of part/patch-based detectors in order to cope with the large variability of poses and the existence of partial occlusions. In this paper, we propose a pedestrian detection method that efficiently combines multiple local experts by means of a Random Forest ensemble. The proposed method works with rich block-based representations such as HOG and LBP, in such a way that the same features are reused by the multiple local experts, so that no extra computational cost is needed with respect to a holistic method. Furthermore, we demonstrate how to integrate the proposed approach with a cascaded architecture in order to achieve not only high accuracy but also an acceptable efficiency. In particular, the resulting detector operates at five frames per second using a laptop machine. We tested the proposed method with well-known challenging datasets such as Caltech, ETH, Daimler, and INRIA. The method proposed in this work consistently ranks among the top performers in all the datasets, being either the best method or having a small difference with the best one.
Address Sydney; Australia; December 2013
Corporate Author Thesis
Publisher IEEE Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1550-5499 ISBN Medium
Area Expedition Conference ICCV
Notes ADAS; 600.057; 600.054 Approved no
Call Number ADAS @ adas @ MVL2013 Serial 2333
 

 
Author Victor Vaquero; German Ros; Francesc Moreno-Noguer; Antonio Lopez; Alberto Sanfeliu
Title Joint coarse-and-fine reasoning for deep optical flow Type Conference Article
Year 2017 Publication 24th International Conference on Image Processing Abbreviated Journal
Volume Issue Pages 2558-2562
Keywords
Abstract We propose a novel representation for dense pixel-wise estimation tasks using CNNs that boosts accuracy and reduces training time, by explicitly exploiting joint coarse-and-fine reasoning. The coarse reasoning is performed over a discrete classification space to obtain a general rough solution, while the fine details of the solution are obtained over a continuous regression space. In our approach both components are jointly estimated, which proved to be beneficial for improving estimation accuracy. Additionally, we propose a new network architecture, which combines coarse and fine components by treating the fine estimation as a refinement built on top of the coarse solution, and therefore adding details to the general prediction. We apply our approach to the challenging problem of optical flow estimation and empirically validate it against state-of-the-art CNN-based solutions trained from scratch and tested on large optical flow datasets.
Address Beijing; China; September 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICIP
Notes ADAS; 600.118 Approved no
Call Number Admin @ si @ VRM2017 Serial 2898
 

 
Author Chenshen Wu; Joost Van de Weijer
Title Density Map Distillation for Incremental Object Counting Type Conference Article
Year 2023 Publication Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops Abbreviated Journal
Volume Issue Pages 2505-2514
Keywords
Abstract We investigate the problem of incremental learning for object counting, where a method must learn to count a variety of object classes from a sequence of datasets. A naïve approach to incremental object counting would suffer from catastrophic forgetting, that is, a dramatic performance drop on previous tasks. In this paper, we propose a new exemplar-free functional regularization method, called Density Map Distillation (DMD). During training, we introduce a new counter head for each task and introduce a distillation loss to prevent forgetting of previous tasks. Additionally, we introduce a cross-task adaptor that projects the features of the current backbone to the previous backbone. This projector allows for the learning of new features while the backbone retains the relevant features for previous tasks. Finally, we set up experiments of incremental learning for counting new objects. Results confirm that our method greatly reduces catastrophic forgetting and outperforms existing methods.
Address Vancouver; Canada; June 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPRW
Notes LAMP Approved no
Call Number Admin @ si @ WuW2023 Serial 3916
 

 
Author Miguel Oliveira; L. Seabra Lopes; G. Hyun Lim; S. Hamidreza Kasaei; Angel Sappa; A. Tom
Title Concurrent Learning of Visual Codebooks and Object Categories in Open-ended Domains Type Conference Article
Year 2015 Publication International Conference on Intelligent Robots and Systems Abbreviated Journal
Volume Issue Pages 2488-2495
Keywords Visual Learning; Computer Vision; Autonomous Agents
Abstract In open-ended domains, robots must continuously learn new object categories. When the training sets are created offline, it is not possible to ensure their representativeness with respect to the object categories and features the system will find when operating online. In the Bag of Words model, visual codebooks are constructed from training sets created offline. This might lead to non-discriminative visual words and, as a consequence, to poor recognition performance. This paper proposes a visual object recognition system which concurrently learns in an incremental and online fashion both the visual object category representations and the codebook words used to encode them. The codebook is defined using Gaussian Mixture Models which are updated using new object views. The approach contains similarities with the human visual object recognition system: evidence suggests that the development of recognition capabilities occurs on multiple levels and is sustained over large periods of time. Results show that the proposed system with concurrent learning of object categories and codebooks is capable of learning more categories, requiring fewer examples, and with similar accuracies, when compared to the classical Bag of Words approach using offline constructed codebooks.
Address Hamburg; Germany; October 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference IROS
Notes ADAS; 600.076 Approved no
Call Number Admin @ si @ OSL2015 Serial 2664
 

 
Author Adela Barbulescu; Wenjuan Gong; Jordi Gonzalez; Thomas B. Moeslund; Xavier Roca
Title 3D Human Pose Estimation Using 2D Body Part Detectors Type Conference Article
Year 2012 Publication 21st International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 2484-2487
Keywords
Abstract Automatic 3D reconstruction of human poses from monocular images is a challenging and popular topic in the computer vision community, which provides a wide range of applications in multiple areas. Solutions for 3D pose estimation involve various learning approaches, such as support vector machines and Gaussian processes, but many encounter difficulties in cluttered scenarios and require additional input data, such as silhouettes, or controlled camera settings. We present a framework that is capable of estimating the 3D pose of a person from single images or monocular image sequences without requiring background information and which is robust to camera variations. The framework models the non-linearity present in human pose estimation as it benefits from flexible learning approaches, including a highly customizable 2D detector. Results on the HumanEva benchmark show how they perform and influence the quality of the 3D pose estimates.
Address Tsukuba, Japan
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1051-4651 ISBN 978-1-4673-2216-4 Medium
Area Expedition Conference ICPR
Notes ISE Approved no
Call Number Admin @ si @ BGG2012 Serial 2172
 

 
Author Naila Murray; Luca Marchesotti; Florent Perronnin
Title AVA: A Large-Scale Database for Aesthetic Visual Analysis Type Conference Article
Year 2012 Publication 25th IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal
Volume Issue Pages 2408-2415
Keywords
Abstract With the ever-expanding volume of visual content available, the ability to organize and navigate such content by aesthetic preference is becoming increasingly important. While still in its nascent stage, research into computational models of aesthetic preference already shows great potential. However, to advance research, realistic, diverse and challenging databases are needed. To this end, we introduce a new large-scale database for conducting Aesthetic Visual Analysis: AVA. It contains over 250,000 images along with a rich variety of meta-data including a large number of aesthetic scores for each image, semantic labels for over 60 categories as well as labels related to photographic style. We show the advantages of AVA with respect to existing databases in terms of scale, diversity, and heterogeneity of annotations. We then describe several key insights into aesthetic preference afforded by AVA. Finally, we demonstrate, through three applications, how the large scale of AVA can be leveraged to improve performance on existing preference tasks.
Address Providence, Rhode Island
Corporate Author Thesis
Publisher IEEE Xplore Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1063-6919 ISBN 978-1-4673-1226-4 Medium
Area Expedition Conference CVPR
Notes CIC Approved no
Call Number Admin @ si @ MMP2012a Serial 2025
 

 
Author Jiaolong Xu; Peng Wang; Heng Yang; Antonio Lopez
Title Training a Binary Weight Object Detector by Knowledge Transfer for Autonomous Driving Type Conference Article
Year 2019 Publication IEEE International Conference on Robotics and Automation Abbreviated Journal
Volume Issue Pages 2379-2384
Keywords
Abstract Autonomous driving has harsh requirements of small model size and energy efficiency, in order to enable the embedded system to achieve real-time on-board object detection. Recent deep convolutional neural network based object detectors have achieved state-of-the-art accuracy. However, such models are trained with numerous parameters, and their high computational costs and large storage requirements prohibit deployment on systems with limited memory and computation resources. Low-precision neural networks are popular techniques for reducing the computation requirements and memory footprint. Among them, the binary weight neural network (BWN) is the extreme case, which quantizes the floating-point weights into just 1 bit. BWNs are difficult to train and suffer from accuracy degradation due to the extreme low-bit representation. To address this problem, we propose a knowledge transfer (KT) method to aid the training of BWN using a full-precision teacher network. We built DarkNet- and MobileNet-based binary weight YOLO-v2 detectors and conducted experiments on the KITTI benchmark for car, pedestrian and cyclist detection. The experimental results show that the proposed method maintains high detection accuracy while reducing the model size of DarkNet-YOLO from 257 MB to 8.8 MB and MobileNet-YOLO from 193 MB to 7.9 MB.
Address Montreal; Canada; May 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICRA
Notes ADAS; 600.124; 600.116; 600.118 Approved no
Call Number Admin @ si @ XWY2018 Serial 3182
 

 
Author Victor Campmany; Sergio Silva; Antonio Espinosa; Juan Carlos Moure; David Vazquez; Antonio Lopez
Title GPU-based pedestrian detection for autonomous driving Type Conference Article
Year 2016 Publication 16th International Conference on Computational Science Abbreviated Journal
Volume 80 Issue Pages 2377-2381
Keywords Pedestrian detection; Autonomous Driving; CUDA
Abstract We propose a real-time pedestrian detection system for the embedded Nvidia Tegra X1 GPU-CPU hybrid platform. The pipeline is composed of the following state-of-the-art algorithms: Histogram of Local Binary Patterns (LBP) and Histograms of Oriented Gradients (HOG) features extracted from the input image; Pyramidal Sliding Window technique for foreground segmentation; and Support Vector Machine (SVM) for classification. Results show an 8x speedup on the target Tegra X1 platform and a better performance/watt ratio than the desktop CUDA platforms under study.
Address San Diego; CA; USA; June 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCS
Notes ADAS; 600.085; 600.082; 600.076 Approved no
Call Number ADAS @ adas @ CSE2016 Serial 2741
 

 
Author Albert Clapes; Ozan Bilici; Dariia Temirova; Egils Avots; Gholamreza Anbarjafari; Sergio Escalera
Title From apparent to real age: gender, age, ethnic, makeup, and expression bias analysis in real age estimation Type Conference Article
Year 2018 Publication IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops Abbreviated Journal
Volume Issue Pages 2373-2382
Keywords
Abstract
Address Salt Lake City; USA; June 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPRW
Notes HUPBA Approved no
Call Number Admin @ si @ Serial 3116
 

 
Author German Barquero; Sergio Escalera; Cristina Palmero
Title BeLFusion: Latent Diffusion for Behavior-Driven Human Motion Prediction Type Conference Article
Year 2023 Publication IEEE/CVF International Conference on Computer Vision (ICCV) Workshops Abbreviated Journal
Volume Issue Pages 2317-2327
Keywords
Abstract Stochastic human motion prediction (HMP) has generally been tackled with generative adversarial networks and variational autoencoders. Most prior works aim at predicting highly diverse movements in terms of the skeleton joints’ dispersion. This has led to methods predicting fast and motion-divergent movements, which are often unrealistic and incoherent with past motion. Such methods also neglect contexts that need to anticipate diverse low-range behaviors, or actions, with subtle joint displacements. To address these issues, we present BeLFusion, a model that, for the first time, leverages latent diffusion models in HMP to sample from a latent space where behavior is disentangled from pose and motion. As a result, diversity is encouraged from a behavioral perspective. Thanks to our behavior coupler’s ability to transfer sampled behavior to ongoing motion, BeLFusion’s predictions display a variety of behaviors that are significantly more realistic than the state of the art. To support it, we introduce two metrics, the Area of the Cumulative Motion Distribution, and the Average Pairwise Distance Error, which are correlated to our definition of realism according to a qualitative study with 126 participants. Finally, we prove BeLFusion’s generalization power in a new cross-dataset scenario for stochastic HMP.
Address Paris; France; October 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCV
Notes HUPBA; not mentioned Approved no
Call Number Admin @ si @ BEP2023 Serial 3829
 

 
Author Gemma Roig; Xavier Boix; R. de Nijs; Sebastian Ramos; K. Kühnlenz; Luc Van Gool
Title Active MAP Inference in CRFs for Efficient Semantic Segmentation Type Conference Article
Year 2013 Publication 15th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 2312-2319
Keywords Semantic Segmentation
Abstract Most MAP inference algorithms for CRFs optimize an energy function knowing all the potentials. In this paper, we focus on CRFs where the computational cost of instantiating the potentials is orders of magnitude higher than MAP inference. This is often the case in semantic image segmentation, where most potentials are instantiated by slow classifiers fed with costly features. We introduce Active MAP inference 1) to on-the-fly select a subset of potentials to be instantiated in the energy function, leaving the rest of the parameters of the potentials unknown, and 2) to estimate the MAP labeling from such incomplete energy function. Results for semantic segmentation benchmarks, namely PASCAL VOC 2010 [5] and MSRC-21 [19], show that Active MAP inference achieves similar levels of accuracy but with major efficiency gains.
Address Sydney; Australia; December 2013
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1550-5499 ISBN Medium
Area Expedition Conference ICCV
Notes ADAS; 600.057 Approved no
Call Number ADAS @ adas @ RBN2013 Serial 2377
 

 
Author Xialei Liu; Marc Masana; Luis Herranz; Joost Van de Weijer; Antonio Lopez; Andrew Bagdanov
Title Rotate your Networks: Better Weight Consolidation and Less Catastrophic Forgetting Type Conference Article
Year 2018 Publication 24th International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 2262-2268
Keywords
Abstract In this paper we propose an approach to avoiding catastrophic forgetting in sequential task learning scenarios. Our technique is based on a network reparameterization that approximately diagonalizes the Fisher Information Matrix of the network parameters. This reparameterization takes the form of a factorized rotation of parameter space which, when used in conjunction with Elastic Weight Consolidation (which assumes a diagonal Fisher Information Matrix), leads to significantly better performance on lifelong learning of sequential tasks. Experimental results on the MNIST, CIFAR-100, CUB-200 and Stanford-40 datasets demonstrate that we significantly improve the results of standard elastic weight consolidation, and that we obtain competitive results when compared to the state-of-the-art in lifelong learning without forgetting.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPR
Notes LAMP; ADAS; 601.305; 601.109; 600.124; 600.106; 602.200; 600.120; 600.118 Approved no
Call Number Admin @ si @ LMH2018 Serial 3160
 

 
Author Lichao Zhang; Martin Danelljan; Abel Gonzalez-Garcia; Joost Van de Weijer; Fahad Shahbaz Khan
Title Multi-Modal Fusion for End-to-End RGB-T Tracking Type Conference Article
Year 2019 Publication IEEE International Conference on Computer Vision Workshops Abbreviated Journal
Volume Issue Pages 2252-2261
Keywords
Abstract We propose an end-to-end tracking framework for fusing the RGB and TIR modalities in RGB-T tracking. Our baseline tracker is DiMP (Discriminative Model Prediction), which employs a carefully designed target prediction network trained end-to-end using a discriminative loss. We analyze the effectiveness of modality fusion in each of the main components in DiMP, i.e. feature extractor, target estimation network, and classifier. We consider several fusion mechanisms acting at different levels of the framework, including pixel-level, feature-level and response-level. Our tracker is trained in an end-to-end manner, enabling the components to learn how to fuse the information from both modalities. As data to train our model, we generate a large-scale RGB-T dataset by considering an annotated RGB tracking dataset (GOT-10k) and synthesizing paired TIR images using an image-to-image translation approach. We perform extensive experiments on VOT-RGBT2019 dataset and RGBT210 dataset, evaluating each type of modality fusing on each model component. The results show that the proposed fusion mechanisms improve the performance of the single modality counterparts. We obtain our best results when fusing at the feature-level on both the IoU-Net and the model predictor, obtaining an EAO score of 0.391 on VOT-RGBT2019 dataset. With this fusion mechanism we achieve the state-of-the-art performance on RGBT210 dataset.
Address Seoul; Korea; October 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCVW
Notes LAMP; 600.109; 600.141; 600.120 Approved no
Call Number Admin @ si @ ZDG2019 Serial 3279
 

 
Author Pau Riba; Andreas Fischer; Josep Llados; Alicia Fornes
Title Learning Graph Distances with Message Passing Neural Networks Type Conference Article
Year 2018 Publication 24th International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 2239-2244
Keywords ★Best Paper Award★
Abstract Graph representations have been widely used in pattern recognition thanks to their powerful representation formalism and rich theoretical background. A number of error-tolerant graph matching algorithms such as graph edit distance have been proposed for computing a distance between two labelled graphs. However, they typically suffer from a high computational complexity, which makes it difficult to apply these matching algorithms in a real scenario. In this paper, we propose an efficient graph distance based on the emerging field of geometric deep learning. Our method employs a message passing neural network to capture the graph structure and learns a metric with a siamese network approach. The performance of the proposed graph distance is validated in two application cases, graph classification and graph retrieval of handwritten words, and shows a promising performance when compared with (approximate) graph edit distance benchmarks.
Address Beijing; China; August 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPR
Notes DAG; 600.097; 603.057; 601.302; 600.121 Approved no
Call Number Admin @ si @ RFL2018 Serial 3168
 

 
Author Patricia Suarez; Angel Sappa; Boris X. Vintimilla; Riad I. Hammoud
Title Near InfraRed Imagery Colorization Type Conference Article
Year 2018 Publication 25th International Conference on Image Processing Abbreviated Journal
Volume Issue Pages 2237-2241
Keywords Convolutional Neural Networks (CNN), Generative Adversarial Network (GAN), Infrared Imagery colorization
Abstract This paper proposes a stacked conditional Generative Adversarial Network-based method for Near InfraRed (NIR) imagery colorization. We propose a variant architecture of Generative Adversarial Network (GAN) that uses multiple loss functions over a conditional probabilistic generative model. We show that this new architecture/loss-function yields better generalization and representation of the generated colored IR images. The proposed approach is evaluated on a large test dataset and compared to recent state of the art methods using standard metrics.
Address Athens; Greece; October 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICIP
Notes MSIAU; 600.086; 600.130; 600.122 Approved no
Call Number Admin @ si @ SSV2018b Serial 3195