|   | 
Details
   web
Records
Author Kai Wang; Joost Van de Weijer; Luis Herranz
Title ACAE-REMIND for online continual learning with compressed feature replay Type Journal Article
Year 2021 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 150 Issue Pages 122-129
Keywords online continual learning; autoencoders; vector quantization
Abstract Online continual learning aims to learn from a non-IID stream of data from a number of different tasks, where the learner is only allowed to consider data once. Methods are typically allowed to use a limited buffer to store some of the images in the stream. Recently, it was found that feature replay, where an intermediate layer representation of the image is stored (or generated) leads to superior results than image replay, while requiring less memory. Quantized exemplars can further reduce the memory usage. However, a drawback of these methods is that they use a fixed (or very intransigent) backbone network. This significantly limits the learning of representations that can discriminate between all tasks. To address this problem, we propose an auxiliary classifier auto-encoder (ACAE) module for feature replay at intermediate layers with high compression rates. The reduced memory footprint per image allows us to save more exemplars for replay. In our experiments, we conduct task-agnostic evaluation under online continual learning setting and get state-of-the-art performance on ImageNet-Subset, CIFAR100 and CIFAR10 dataset.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes (down) LAMP; 600.147; 601.379; 600.120; 600.141 Approved no
Call Number Admin @ si @ WWH2021 Serial 3575
Permanent link to this record
 

 
Author Carola Figueroa Flores; David Berga; Joost Van de Weijer; Bogdan Raducanu
Title Saliency for free: Saliency prediction as a side-effect of object recognition Type Journal Article
Year 2021 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 150 Issue Pages 1-7
Keywords Saliency maps; Unsupervised learning; Object recognition
Abstract Saliency is the perceptual capacity of our visual system to focus our attention (i.e. gaze) on relevant objects instead of the background. So far, computational methods for saliency estimation required the explicit generation of a saliency map, process which is usually achieved via eyetracking experiments on still images. This is a tedious process that needs to be repeated for each new dataset. In the current paper, we demonstrate that is possible to automatically generate saliency maps without ground-truth. In our approach, saliency maps are learned as a side effect of object recognition. Extensive experiments carried out on both real and synthetic datasets demonstrated that our approach is able to generate accurate saliency maps, achieving competitive results when compared with supervised methods.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes (down) LAMP; 600.147; 600.120 Approved no
Call Number Admin @ si @ FBW2021 Serial 3559
Permanent link to this record
 

 
Author Fahad Shahbaz Khan; Muhammad Anwer Rao; Joost Van de Weijer; Michael Felsberg; J.Laaksonen
Title Compact color texture description for texture classification Type Journal Article
Year 2015 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 51 Issue Pages 16-22
Keywords
Abstract Describing textures is a challenging problem in computer vision and pattern recognition. The classification problem involves assigning a category label to the texture class it belongs to. Several factors such as variations in scale, illumination and viewpoint make the problem of texture description extremely challenging. A variety of histogram based texture representations exists in literature.
However, combining multiple texture descriptors and assessing their complementarity is still an open research problem. In this paper, we first show that combining multiple local texture descriptors significantly improves the recognition performance compared to using a single best method alone. This
gain in performance is achieved at the cost of high-dimensional final image representation. To counter this problem, we propose to use an information-theoretic compression technique to obtain a compact texture description without any significant loss in accuracy. In addition, we perform a comprehensive
evaluation of pure color descriptors, popular in object recognition, for the problem of texture classification. Experiments are performed on four challenging texture datasets namely, KTH-TIPS-2a, KTH-TIPS-2b, FMD and Texture-10. The experiments clearly demonstrate that our proposed compact multi-texture approach outperforms the single best texture method alone. In all cases, discriminative color names outperforms other color features for texture classification. Finally, we show that combining discriminative color names with compact texture representation outperforms state-of-the-art methods by 7:8%, 4:3% and 5:0% on KTH-TIPS-2a, KTH-TIPS-2b and Texture-10 datasets respectively.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes (down) LAMP; 600.068; 600.079;ADAS Approved no
Call Number Admin @ si @ KRW2015a Serial 2587
Permanent link to this record
 

 
Author Marco Pedersoli; Jordi Gonzalez; Andrew Bagdanov; Xavier Roca
Title Efficient Discriminative Multiresolution Cascade for Real-Time Human Detection Applications Type Journal Article
Year 2011 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 32 Issue 13 Pages 1581-1587
Keywords
Abstract Human detection is fundamental in many machine vision applications, like video surveillance, driving assistance, action recognition and scene understanding. However in most of these applications real-time performance is necessary and this is not achieved yet by current detection methods.

This paper presents a new method for human detection based on a multiresolution cascade of Histograms of Oriented Gradients (HOG) that can highly reduce the computational cost of detection search without affecting accuracy. The method consists of a cascade of sliding window detectors. Each detector is a linear Support Vector Machine (SVM) composed of HOG features at different resolutions, from coarse at the first level to fine at the last one.

In contrast to previous methods, our approach uses a non-uniform stride of the sliding window that is defined by the feature resolution and allows the detection to be incrementally refined as going from coarse-to-fine resolution. In this way, the speed-up of the cascade is not only due to the fewer number of features computed at the first levels of the cascade, but also to the reduced number of windows that need to be evaluated at the coarse resolution. Experimental results show that our method reaches a detection rate comparable with the state-of-the-art of detectors based on HOG features, while at the same time the detection search is up to 23 times faster.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes (down) ISE Approved no
Call Number Admin @ si @ PGB2011a Serial 1707
Permanent link to this record
 

 
Author Carles Fernandez; Pau Baiget; Xavier Roca; Jordi Gonzalez
Title Augmenting Video Surveillance Footage with Virtual Agents for Incremental Event Evaluation Type Journal Article
Year 2011 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 32 Issue 6 Pages 878–889
Keywords
Abstract The fields of segmentation, tracking and behavior analysis demand for challenging video resources to test, in a scalable manner, complex scenarios like crowded environments or scenes with high semantics. Nevertheless, existing public databases cannot scale the presence of appearing agents, which would be useful to study long-term occlusions and crowds. Moreover, creating these resources is expensive and often too particularized to specific needs. We propose an augmented reality framework to increase the complexity of image sequences in terms of occlusions and crowds, in a scalable and controllable manner. Existing datasets can be increased with augmented sequences containing virtual agents. Such sequences are automatically annotated, thus facilitating evaluation in terms of segmentation, tracking, and behavior recognition. In order to easily specify the desired contents, we propose a natural language interface to convert input sentences into virtual agent behaviors. Experimental tests and validation in indoor, street, and soccer environments are provided to show the feasibility of the proposed approach in terms of robustness, scalability, and semantics.
Address
Corporate Author Thesis
Publisher Elsevier Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes (down) ISE Approved no
Call Number Admin @ si @ FBR2011b Serial 1723
Permanent link to this record
 

 
Author Debora Gil; Petia Radeva
Title Inhibition of false landmarks Type Journal Article
Year 2006 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 27 Issue 9 Pages 1022-1030
Keywords
Abstract Corners and junctions are landmarks characterized by the lack of differentiability in the unit tangent to the image level curve. Detectors based on differential operators are not, by their own definition, the best posed as they require a higher degree of differentiability to yield a reliable response. We argue that a corner detector should be based on the degree of continuity of the tangent vector to the image level sets, work on the image domain and need no assumptions on neither the image local structure nor the particular geometry of the corner/junction. An operator measuring the degree of differentiability of the projection matrix on the image gradient fulfills the above requirements. Because using smoothing kernels leads to corner misplacement, we suggest an alternative fake response remover based on the receptive field inhibition of spurious details. The combination of both orientation discontinuity detection and noise inhibition produce our inhibition orientation energy (IOE) landmark locator.
Address
Corporate Author Thesis
Publisher Elsevier Science Inc. Place of Publication New York, NY, USA Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0167-8655 ISBN Medium
Area Expedition Conference
Notes (down) IAM;MILAB Approved no
Call Number IAM @ iam @ GiR2006 Serial 1529
Permanent link to this record
 

 
Author Antonio Hernandez; Miguel Angel Bautista; Xavier Perez Sala; Victor Ponce; Sergio Escalera; Xavier Baro; Oriol Pujol; Cecilio Angulo
Title Probability-based Dynamic Time Warping and Bag-of-Visual-and-Depth-Words for Human Gesture Recognition in RGB-D Type Journal Article
Year 2014 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 50 Issue 1 Pages 112-121
Keywords RGB-D; Bag-of-Words; Dynamic Time Warping; Human Gesture Recognition
Abstract PATREC5825
We present a methodology to address the problem of human gesture segmentation and recognition in video and depth image sequences. A Bag-of-Visual-and-Depth-Words (BoVDW) model is introduced as an extension of the Bag-of-Visual-Words (BoVW) model. State-of-the-art RGB and depth features, including a newly proposed depth descriptor, are analysed and combined in a late fusion form. The method is integrated in a Human Gesture Recognition pipeline, together with a novel probability-based Dynamic Time Warping (PDTW) algorithm which is used to perform prior segmentation of idle gestures. The proposed DTW variant uses samples of the same gesture category to build a Gaussian Mixture Model driven probabilistic model of that gesture class. Results of the whole Human Gesture Recognition pipeline in a public data set show better performance in comparison to both standard BoVW model and DTW approach.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes (down) HuPBA;MV; 605.203 Approved no
Call Number Admin @ si @ HBP2014 Serial 2353
Permanent link to this record
 

 
Author Victor Ponce; Sergio Escalera; Marc Perez; Oriol Janes; Xavier Baro
Title Non-Verbal Communication Analysis in Victim-Offender Mediations Type Journal Article
Year 2015 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 67 Issue 1 Pages 19-27
Keywords Victim–Offender Mediation; Multi-modal human behavior analysis; Face and gesture recognition; Social signal processing; Computer vision; Machine learning
Abstract We present a non-invasive ambient intelligence framework for the semi-automatic analysis of non-verbal communication applied to the restorative justice field. We propose the use of computer vision and social signal processing technologies in real scenarios of Victim–Offender Mediations, applying feature extraction techniques to multi-modal audio-RGB-depth data. We compute a set of behavioral indicators that define communicative cues from the fields of psychology and observational methodology. We test our methodology on data captured in real Victim–Offender Mediation sessions in Catalonia. We define the ground truth based on expert opinions when annotating the observed social responses. Using different state of the art binary classification approaches, our system achieves recognition accuracies of 86% when predicting satisfaction, and 79% when predicting both agreement and receptivity. Applying a regression strategy, we obtain a mean deviation for the predictions between 0.5 and 0.7 in the range [1–5] for the computed social signals.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes (down) HuPBA;MV Approved no
Call Number Admin @ si @ PEP2015 Serial 2583
Permanent link to this record
 

 
Author Frederic Sampedro; Sergio Escalera; Anna Puig
Title Iterative Multiclass Multiscale Stacked Sequential Learning: definition and application to medical volume segmentation Type Journal Article
Year 2014 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 46 Issue Pages 1-10
Keywords Machine learning; Sequential learning; Multi-class problems; Contextual learning; Medical volume segmentation
Abstract In this work we present the iterative multi-class multi-scale stacked sequential learning framework (IMMSSL), a novel learning scheme that is particularly suited for medical volume segmentation applications. This model exploits the inherent voxel contextual information of the structures of interest in order to improve its segmentation performance results. Without any feature set or learning algorithm prior assumption, the proposed scheme directly seeks to learn the contextual properties of a region from the predicted classifications of previous classifiers within an iterative scheme. Performance results regarding segmentation accuracy in three two-class and multi-class medical volume datasets show a significant improvement with respect to state of the art alternatives. Due to its easiness of implementation and its independence of feature space and learning algorithm, the presented machine learning framework could be taken into consideration as a first choice in complex volume segmentation scenarios.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes (down) HuPBA;MILAB Approved no
Call Number Admin @ si @ SEP2014 Serial 2550
Permanent link to this record
 

 
Author Zhengying Liu; Zhen Xu; Sergio Escalera; Isabelle Guyon; Julio C. S. Jacques Junior; Meysam Madadi; Adrien Pavao; Sebastien Treguer; Wei-Wei Tu
Title Towards automated computer vision: analysis of the AutoCV challenges 2019 Type Journal Article
Year 2020 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 135 Issue Pages 196-203
Keywords Computer vision; AutoML; Deep learning
Abstract We present the results of recent challenges in Automated Computer Vision (AutoCV, renamed here for clarity AutoCV1 and AutoCV2, 2019), which are part of a series of challenge on Automated Deep Learning (AutoDL). These two competitions aim at searching for fully automated solutions for classification tasks in computer vision, with an emphasis on any-time performance. The first competition was limited to image classification while the second one included both images and videos. Our design imposed to the participants to submit their code on a challenge platform for blind testing on five datasets, both for training and testing, without any human intervention whatsoever. Winning solutions adopted deep learning techniques based on already published architectures, such as AutoAugment, MobileNet and ResNet, to reach state-of-the-art performance in the time budget of the challenge (only 20 minutes of GPU time). The novel contributions include strategies to deliver good preliminary results at any time during the learning process, such that a method can be stopped early and still deliver good performance. This feature is key for the adoption of such techniques by data analysts desiring to obtain rapidly preliminary results on large datasets and to speed up the development process. The soundness of our design was verified in several aspects: (1) Little overfitting of the on-line leaderboard providing feedback on 5 development datasets was observed, compared to the final blind testing on the 5 (separate) final test datasets, suggesting that winning solutions might generalize to other computer vision classification tasks; (2) Error bars on the winners’ performance allow us to say with confident that they performed significantly better than the baseline solutions we provided; (3) The ranking of participants according to the any-time metric we designed, namely the Area under the Learning Curve, was different from that of the fixed-time metric, i.e. AUC at the end of the fixed time budget. We released all winning solutions under open-source licenses. At the end of the AutoDL challenge series, all data of the challenge will be made publicly available, thus providing a collection of uniformly formatted datasets, which can serve to conduct further research, particularly on meta-learning.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes (down) HuPBA; no proj Approved no
Call Number Admin @ si @ LXE2020 Serial 3427
Permanent link to this record
 

 
Author Mikkel Thogersen; Sergio Escalera; Jordi Gonzalez; Thomas B. Moeslund
Title Segmentation of RGB-D Indoor scenes by Stacking Random Forests and Conditional Random Fields Type Journal Article
Year 2016 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 80 Issue Pages 208–215
Keywords
Abstract This paper proposes a technique for RGB-D scene segmentation using Multi-class
Multi-scale Stacked Sequential Learning (MMSSL) paradigm. Following recent trends in state-of-the-art, a base classifier uses an initial SLIC segmentation to obtain superpixels which provide a diminution of data while retaining object boundaries. A series of color and depth features are extracted from the superpixels, and are used in a Conditional Random Field (CRF) to predict superpixel labels. Furthermore, a Random Forest (RF) classifier using random offset features is also used as an input to the CRF, acting as an initial prediction. As a stacked classifier, another Random Forest is used acting on a spatial multi-scale decomposition of the CRF confidence map to correct the erroneous labels assigned by the previous classifier. The model is tested on the popular NYU-v2 dataset.
The approach shows that simple multi-modal features with the power of the MMSSL
paradigm can achieve better performance than state of the art results on the same dataset.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes (down) HuPBA; ISE;MILAB; 600.098; 600.119 Approved no
Call Number Admin @ si @ TEG2016 Serial 2843
Permanent link to this record
 

 
Author Meysam Madadi; Sergio Escalera; Jordi Gonzalez; Xavier Roca; Felipe Lumbreras
Title Multi-part body segmentation based on depth maps for soft biometry analysis Type Journal Article
Year 2015 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 56 Issue Pages 14-21
Keywords 3D shape context; 3D point cloud alignment; Depth maps; Human body segmentation; Soft biometry analysis
Abstract This paper presents a novel method extracting biometric measures using depth sensors. Given a multi-part labeled training data, a new subject is aligned to the best model of the dataset, and soft biometrics such as lengths or circumference sizes of limbs and body are computed. The process is performed by training relevant pose clusters, defining a representative model, and fitting a 3D shape context descriptor within an iterative matching procedure. We show robust measures by applying orthogonal plates to body hull. We test our approach in a novel full-body RGB-Depth data set, showing accurate estimation of soft biometrics and better segmentation accuracy in comparison with random forest approach without requiring large training data.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes (down) HuPBA; ISE; ADAS; 600.076;600.049; 600.063; 600.054; 302.018;MILAB Approved no
Call Number Admin @ si @ MEG2015 Serial 2588
Permanent link to this record
 

 
Author Sergio Escalera; Alicia Fornes; O. Pujol; Petia Radeva; Gemma Sanchez; Josep Llados
Title Blurred Shape Model for Binary and Grey-level Symbol Recognition Type Journal Article
Year 2009 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 30 Issue 15 Pages 1424–1433
Keywords
Abstract Many symbol recognition problems require the use of robust descriptors in order to obtain rich information of the data. However, the research of a good descriptor is still an open issue due to the high variability of symbols appearance. Rotation, partial occlusions, elastic deformations, intra-class and inter-class variations, or high variability among symbols due to different writing styles, are just a few problems. In this paper, we introduce a symbol shape description to deal with the changes in appearance that these types of symbols suffer. The shape of the symbol is aligned based on principal components to make the recognition invariant to rotation and reflection. Then, we present the Blurred Shape Model descriptor (BSM), where new features encode the probability of appearance of each pixel that outlines the symbols shape. Moreover, we include the new descriptor in a system to deal with multi-class symbol categorization problems. Adaboost is used to train the binary classifiers, learning the BSM features that better split symbol classes. Then, the binary problems are embedded in an Error-Correcting Output Codes framework (ECOC) to deal with the multi-class case. The methodology is evaluated on different synthetic and real data sets. State-of-the-art descriptors and classifiers are compared, showing the robustness and better performance of the present scheme to classify symbols with high variability of appearance.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes (down) HuPBA; DAG; MILAB Approved no
Call Number BCNPCL @ bcnpcl @ EFP2009a Serial 1180
Permanent link to this record
 

 
Author Albert Clapes; Miguel Reyes; Sergio Escalera
Title Multi-modal User Identification and Object Recognition Surveillance System Type Journal Article
Year 2013 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 34 Issue 7 Pages 799-808
Keywords Multi-modal RGB-Depth data analysis; User identification; Object recognition; Intelligent surveillance; Visual features; Statistical learning
Abstract We propose an automatic surveillance system for user identification and object recognition based on multi-modal RGB-Depth data analysis. We model a RGBD environment learning a pixel-based background Gaussian distribution. Then, user and object candidate regions are detected and recognized using robust statistical approaches. The system robustly recognizes users and updates the system in an online way, identifying and detecting new actors in the scene. Moreover, segmented objects are described, matched, recognized, and updated online using view-point 3D descriptions, being robust to partial occlusions and local 3D viewpoint rotations. Finally, the system saves the historic of user–object assignments, being specially useful for surveillance scenarios. The system has been evaluated on a novel data set containing different indoor/outdoor scenarios, objects, and users, showing accurate recognition and better performance than standard state-of-the-art approaches.
Address
Corporate Author Thesis
Publisher Elsevier Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes (down) HUPBA; 600.046; 605.203;MILAB Approved no
Call Number Admin @ si @ CRE2013 Serial 2248
Permanent link to this record
 

 
Author Josep Llados; Horst Bunke; Enric Marti
Title Finding rotational symmetries by cyclic string matching Type Journal Article
Year 1997 Publication Pattern recognition letters Abbreviated Journal PRL
Volume 18 Issue 14 Pages 1435-1442
Keywords Rotational symmetry; Reflectional symmetry; String matching
Abstract Symmetry is an important shape feature. In this paper, a simple and fast method to detect perfect and distorted rotational symmetries of 2D objects is described. The boundary of a shape is polygonally approximated and represented as a string. Rotational symmetries are found by cyclic string matching between two identical copies of the shape string. The set of minimum cost edit sequences that transform the shape string to a cyclically shifted version of itself define the rotational symmetry and its order. Finally, a modification of the algorithm is proposed to detect reflectional symmetries. Some experimental results are presented to show the reliability of the proposed algorithm
Address
Corporate Author Thesis
Publisher Elsevier Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes (down) DAG;IAM; Approved no
Call Number IAM @ iam @ LBM1997a Serial 1562
Permanent link to this record