Author Xialei Liu; Marc Masana; Luis Herranz; Joost Van de Weijer; Antonio Lopez; Andrew Bagdanov
  Title Rotate your Networks: Better Weight Consolidation and Less Catastrophic Forgetting Type Conference Article
  Year 2018 Publication 24th International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 2262-2268  
  Keywords  
  Abstract In this paper we propose an approach to avoiding catastrophic forgetting in sequential task learning scenarios. Our technique is based on a network reparameterization that approximately diagonalizes the Fisher Information Matrix of the network parameters. This reparameterization takes the form of a factorized rotation of parameter space which, when used in conjunction with Elastic Weight Consolidation (which assumes a diagonal Fisher Information Matrix), leads to significantly better performance on lifelong learning of sequential tasks. Experimental results on the MNIST, CIFAR-100, CUB-200 and Stanford-40 datasets demonstrate that we significantly improve the results of standard elastic weight consolidation, and that we obtain competitive results when compared to the state-of-the-art in lifelong learning without forgetting.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes LAMP; ADAS; 601.305; 601.109; 600.124; 600.106; 602.200; 600.120; 600.118 Approved no  
  Call Number Admin @ si @ LMH2018 Serial 3160  
Permanent link to this record
 

 
Author Pau Riba; Andreas Fischer; Josep Llados; Alicia Fornes
  Title Learning Graph Distances with Message Passing Neural Networks Type Conference Article
  Year 2018 Publication 24th International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 2239-2244  
  Keywords ★Best Paper Award★  
  Abstract Graph representations have been widely used in pattern recognition thanks to their powerful representation formalism and rich theoretical background. A number of error-tolerant graph matching algorithms such as graph edit distance have been proposed for computing a distance between two labelled graphs. However, they typically suffer from a high computational complexity, which makes it difficult to apply these matching algorithms in a real scenario. In this paper, we propose an efficient graph distance based on the emerging field of geometric deep learning. Our method employs a message passing neural network to capture the graph structure and learns a metric with a siamese network approach. The performance of the proposed graph distance is validated in two application cases, graph classification and graph retrieval of handwritten words, and shows a promising performance when compared with (approximate) graph edit distance benchmarks.
 
  Address Beijing; China; August 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes DAG; 600.097; 603.057; 601.302; 600.121 Approved no  
  Call Number Admin @ si @ RFL2018 Serial 3168  
Permanent link to this record
 

 
Author Gemma Rotger; Felipe Lumbreras; Francesc Moreno-Noguer; Antonio Agudo
  Title 2D-to-3D Facial Expression Transfer Type Conference Article
  Year 2018 Publication 24th International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 2008-2013  
  Keywords  
  Abstract Automatically changing the expression and physical features of a face from an input image is a topic that has been traditionally tackled in a 2D domain. In this paper, we bring this problem to 3D and propose a framework that, given an input RGB video of a human face under a neutral expression, initially computes his/her 3D shape and then performs a transfer to a new and potentially non-observed expression. For this purpose, we parameterize the rest shape (obtained from standard factorization approaches over the input video) using a triangular mesh which is further clustered into larger macro-segments. The expression transfer problem is then posed as a direct mapping between this shape and a source shape, such as the blend shapes of an off-the-shelf 3D dataset of human facial expressions. The mapping is resolved to be geometrically consistent between 3D models by requiring points in specific regions to map onto semantically equivalent regions. We validate the approach on several synthetic and real examples of input faces that largely differ from the source shapes, yielding very realistic expression transfers even in cases with topology changes, such as a synthetic video sequence of a single-eyed cyclops.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes MSIAU; 600.086; 600.130; 600.118 Approved no  
  Call Number Admin @ si @ RLM2018 Serial 3232  
Permanent link to this record
 

 
Author Lu Yu; Yongmei Cheng; Joost Van de Weijer
  Title Weakly Supervised Domain-Specific Color Naming Based on Attention Type Conference Article
  Year 2018 Publication 24th International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 3019-3024  
  Keywords  
  Abstract The majority of existing color naming methods focuses on the eleven basic color terms of the English language. However, in many applications, different sets of color names are used for the accurate description of objects. Labeling data to learn these domain-specific color names is an expensive and laborious task. Therefore, in this article we aim to learn color names from weakly labeled data. For this purpose, we add an attention branch to the color naming network. The attention branch is used to modulate the pixel-wise color naming predictions of the network. In experiments, we illustrate that the attention branch correctly identifies the relevant regions. Furthermore, we show that our method obtains state-of-the-art results for pixel-wise and image-wise classification on the EBAY dataset and is able to learn color names for various domains.  
  Address Beijing; August 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes LAMP; 600.109; 602.200; 600.120 Approved no  
  Call Number Admin @ si @ YCW2018 Serial 3243  
Permanent link to this record
 

 
Author Victor Vaquero; German Ros; Francesc Moreno-Noguer; Antonio Lopez; Alberto Sanfeliu
  Title Joint coarse-and-fine reasoning for deep optical flow Type Conference Article
  Year 2017 Publication 24th International Conference on Image Processing Abbreviated Journal  
  Volume Issue Pages 2558-2562  
  Keywords  
  Abstract We propose a novel representation for dense pixel-wise estimation tasks using CNNs that boosts accuracy and reduces training time, by explicitly exploiting joint coarse-and-fine reasoning. The coarse reasoning is performed over a discrete classification space to obtain a general rough solution, while the fine details of the solution are obtained over a continuous regression space. In our approach both components are jointly estimated, which proved to be beneficial for improving estimation accuracy. Additionally, we propose a new network architecture, which combines coarse and fine components by treating the fine estimation as a refinement built on top of the coarse solution, and therefore adding details to the general prediction. We apply our approach to the challenging problem of optical flow estimation and empirically validate it against state-of-the-art CNN-based solutions trained from scratch and tested on large optical flow datasets.  
  Address Beijing; China; September 2017  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICIP  
  Notes ADAS; 600.118 Approved no  
  Call Number Admin @ si @ VRM2017 Serial 2898  
Permanent link to this record
 

 
Author Andrei Polzounov; Artsiom Ablavatski; Sergio Escalera; Shijian Lu; Jianfei Cai
  Title WordFences: Text Localization and Recognition Type Conference Article
  Year 2017 Publication 24th International Conference on Image Processing Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address Beijing; China; September 2017  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICIP  
  Notes HUPBA; not mentioned Approved no  
  Call Number Admin @ si @ PAE2017 Serial 3007  
Permanent link to this record
 

 
Author Maedeh Aghaei; Mariella Dimiccoli; Petia Radeva
  Title All the people around me: face clustering in egocentric photo streams Type Conference Article
  Year 2017 Publication 24th International Conference on Image Processing Abbreviated Journal  
  Volume Issue Pages  
  Keywords face discovery; face clustering; deepmatching; bag-of-tracklets; egocentric photo-streams  
  Abstract arXiv:1703.01790. Given an unconstrained stream of images captured by a wearable photo-camera (2fpm), we propose an unsupervised bottom-up approach for automatically clustering the appearing faces into the individual identities present in these data. The problem is challenging since images are acquired under real world conditions; hence the visible appearance of the people in the images undergoes intensive variations. Our proposed pipeline consists of first arranging the photo-stream into events, then localizing the appearance of multiple people in them, and finally grouping various appearances of the same person across different events. Experimental results on a dataset acquired by wearing a photo-camera during one month demonstrate the effectiveness of the proposed approach for the considered purpose.
 
  Address Beijing; China; September 2017  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICIP  
  Notes MILAB; not mentioned Approved no  
  Call Number Admin @ si @ EDR2017 Serial 3025  
Permanent link to this record
 

 
Author Ivet Rafegas; Maria Vanrell
  Title Color spaces emerging from deep convolutional networks Type Conference Article
  Year 2016 Publication 24th Color and Imaging Conference Abbreviated Journal  
  Volume Issue Pages 225-230  
  Keywords  
  Abstract Award for the best interactive session. Defining color spaces that provide a good encoding of spatio-chromatic properties of color surfaces is an open problem in color science [8, 22]. Related to this, in computer vision the fusion of color with local image features has been studied and evaluated [16]. In human vision research, the cells which are selective to specific color hues along the visual pathway are also a focus of attention [7, 14]. In line with these research aims, in this paper we study how color is encoded in a deep Convolutional Neural Network (CNN) that has been trained on more than one million natural images for object recognition. These convolutional nets achieve impressive performance in computer vision, and rival the representations in the human brain. In this paper we explore how color is represented in a CNN architecture that can give some intuition about efficient spatio-chromatic representations. In convolutional layers the activation of a neuron is related to a spatial filter that combines spatio-chromatic representations. We use an inverted version of it to explore the properties. Using a series of unsupervised methods we classify different types of neurons depending on the color axes they define and we propose an index of color-selectivity of a neuron. We estimate the main color axes that emerge from this trained net and we prove that color-selectivity of neurons decreases from early to deeper layers.
 
  Address San Diego; USA; November 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CIC  
  Notes CIC Approved no  
  Call Number Admin @ si @ RaV2016a Serial 2894  
Permanent link to this record
 

 
Author German Ros; J. Guerrero; Angel Sappa; Daniel Ponsa; Antonio Lopez
  Title Fast and Robust l1-averaging-based Pose Estimation for Driving Scenarios Type Conference Article
  Year 2013 Publication 24th British Machine Vision Conference Abbreviated Journal  
  Volume Issue Pages  
  Keywords SLAM  
  Abstract Robust visual pose estimation is at the core of many computer vision applications, being fundamental for Visual SLAM and Visual Odometry problems. During the last decades, many approaches have been proposed to solve these problems, RANSAC being one of the most accepted and used. However, with the arrival of new challenges, such as large driving scenarios for autonomous vehicles, along with the improvements in the data gathering frameworks, new issues must be considered. One of these issues is the capability of a technique to deal with very large amounts of data while meeting the real-time constraint. With this purpose in mind, we present a novel technique for the problem of robust camera-pose estimation that is more suitable for dealing with large amounts of data and that additionally helps improve the results. The method is based on a combination of a very fast coarse-evaluation function and a robust ℓ1-averaging procedure. Such a scheme leads to high-quality results while taking considerably less time than RANSAC. Experimental results on the challenging KITTI Vision Benchmark Suite are provided, showing the validity of the proposed approach.
 
  Address Bristol; UK; September 2013  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference BMVC  
  Notes ADAS Approved no  
  Call Number Admin @ si @ RGS2013b; ADAS @ adas @ Serial 2274  
Permanent link to this record
 

 
Author Dena Bazazian; Raul Gomez; Anguelos Nicolaou; Lluis Gomez; Dimosthenis Karatzas; Andrew Bagdanov
  Title Improving Text Proposals for Scene Images with Fully Convolutional Networks Type Conference Article
  Year 2016 Publication 23rd International Conference on Pattern Recognition Workshops Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Text Proposals have emerged as a class-dependent version of object proposals – efficient approaches to reduce the search space of possible text object locations in an image. Combined with strong word classifiers, text proposals currently yield top state of the art results in end-to-end scene text recognition. In this paper we propose an improvement over the original Text Proposals algorithm of [1], combining it with Fully Convolutional Networks to improve the ranking of proposals. Results on the ICDAR RRC and the COCO-text datasets show superior performance over current state-of-the-art.
 
  Address Cancun; Mexico; December 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPRW  
  Notes DAG; LAMP; 600.084 Approved no  
  Call Number Admin @ si @ BGN2016 Serial 2823  
Permanent link to this record
 

 
Author Fatemeh Noroozi; Marina Marjanovic; Angelina Njegus; Sergio Escalera; Gholamreza Anbarjafari
  Title Fusion of Classifier Predictions for Audio-Visual Emotion Recognition Type Conference Article
  Year 2016 Publication 23rd International Conference on Pattern Recognition Workshops Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract This paper presents a novel multimodal emotion recognition system based on the analysis of audio and visual cues. MFCC-based features are extracted from the audio channel and facial landmark geometric relations are computed from visual data. Both sets of features are learnt separately using state-of-the-art classifiers. In addition, we summarise each emotion video into a reduced set of key-frames, which are learnt in order to visually discriminate emotions by means of a Convolutional Neural Network. Finally, confidence outputs of all classifiers from all modalities are used to define a new feature space to be learnt for final emotion prediction, in a late fusion/stacking fashion. The conducted experiments on the eNTERFACE’05 database show significant performance improvements of our proposed system in comparison to state-of-the-art approaches.
 
  Address Cancun; Mexico; December 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPRW  
  Notes HuPBA;MILAB; Approved no  
  Call Number Admin @ si @ NMN2016 Serial 2839  
Permanent link to this record
 

 
Author Iiris Lusi; Sergio Escalera; Gholamreza Anbarjafari
  Title Human Head Pose Estimation on SASE database using Random Hough Regression Forests Type Conference Article
  Year 2016 Publication 23rd International Conference on Pattern Recognition Workshops Abbreviated Journal  
  Volume 10165 Issue Pages  
  Keywords  
  Abstract In recent years head pose estimation has become an important task in face analysis scenarios. Given the availability of high resolution 3D sensors, the design of a high resolution head pose database would be beneficial for the community. In this paper, Random Hough Forests are used to estimate 3D head pose and location on a new 3D head database, SASE, which represents the baseline performance on the new data for an upcoming international head pose estimation competition. The data in SASE is acquired with a Microsoft Kinect 2 camera, including the RGB and depth information of 50 subjects with a large sample of head poses, allowing us to test methods for real-life scenarios. We briefly review the database while showing baseline head pose estimation results based on Random Hough Forests.  
  Address Cancun; Mexico; December 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPRW  
  Notes HuPBA; Approved no  
  Call Number Admin @ si @ LEA2016b Serial 2910  
Permanent link to this record
 

 
Author Maedeh Aghaei; Mariella Dimiccoli; Petia Radeva
  Title With whom do I interact with? Social interaction detection in egocentric photo-streams Type Conference Article
  Year 2016 Publication 23rd International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Given a user wearing a low frame rate wearable camera during a day, this work aims to automatically detect the moments when the user engages in a social interaction, solely by reviewing the photos automatically captured by the worn camera. The proposed method, inspired by the sociological concept of F-formation, exploits the distance and orientation of the appearing individuals, with respect to the user, in the scene from a bird-view perspective. As a result, the interaction pattern over the sequence can be understood as a two-dimensional time series that corresponds to the temporal evolution of the distance and orientation features. A Long Short-Term Memory-based Recurrent Neural Network is then trained to classify each time series. Experimental evaluation over a dataset of 30,000 images has shown promising results for the proposed method for social interaction detection in egocentric photo-streams.  
  Address Cancun; Mexico; December 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes MILAB Approved no  
  Call Number Admin @ si @ ADR2016a Serial 2791  
Permanent link to this record
 

 
Author Hugo Jair Escalante; Victor Ponce; Jun Wan; Michael A. Riegler; Baiyu Chen; Albert Clapes; Sergio Escalera; Isabelle Guyon; Xavier Baro; Pal Halvorsen; Henning Muller; Martha Larson
  Title ChaLearn Joint Contest on Multimedia Challenges Beyond Visual Analysis: An Overview Type Conference Article
  Year 2016 Publication 23rd International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract This paper provides an overview of the Joint Contest on Multimedia Challenges Beyond Visual Analysis. We organized an academic competition that focused on four problems that require effective processing of multimodal information in order to be solved. Two tracks were devoted to gesture spotting and recognition from RGB-D video, two fundamental problems for human computer interaction. Another track was devoted to a second round of the first impressions challenge of which the goal was to develop methods to recognize personality traits from short video clips. For this second round we adopted a novel collaborative-competitive (i.e., coopetition) setting. The fourth track was dedicated to the problem of video recommendation for improving user experience. The challenge was open for about 45 days, and received outstanding participation: almost 200 participants registered to the contest, and 20 teams sent predictions in the final stage. The main goals of the challenge were fulfilled: the state of the art was advanced considerably in the four tracks, with novel solutions to the proposed problems (mostly relying on deep learning). However, further research is still required. The data of the four tracks will be available to allow researchers to keep making progress in the four tracks.
 
  Address Cancun; Mexico; December 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes HuPBA; 602.143;MV Approved no  
  Call Number Admin @ si @ EPW2016 Serial 2827  
Permanent link to this record
 

 
Author Marc Bolaños; Petia Radeva
  Title Simultaneous Food Localization and Recognition Type Conference Article
  Year 2016 Publication 23rd International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract CoRR abs/1604.07953. The development of automatic nutrition diaries, which would allow us to keep objective track of everything we eat, could enable a whole new world of possibilities for people concerned about their nutrition patterns. With this purpose, in this paper we propose the first method for simultaneous food localization and recognition. Our method is based on two main steps, which consist of, first, producing a food activation map on the input image (i.e. a heat map of probabilities) for generating bounding box proposals and, second, recognizing each of the food types or food-related objects present in each bounding box. We demonstrate that our proposal, compared to the most similar existing problem, object localization, is able to obtain high precision and reasonable recall levels with only a few bounding boxes. Furthermore, we show that it is applicable to both conventional and egocentric images.
 
  Address Cancun; Mexico; December 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes MILAB; no proj Approved no  
  Call Number Admin @ si @ BoR2016 Serial 2834  
Permanent link to this record