|   | 
Details
   web
Records
Author (down) Marco Cotogni; Fei Yang; Claudio Cusano; Andrew Bagdanov; Joost Van de Weijer
Title Exemplar-free Continual Learning of Vision Transformers via Gated Class-Attention and Cascaded Feature Drift Compensation Type Miscellaneous
Year 2023 Publication ARXIV Abbreviated Journal
Volume Issue Pages
Keywords
Abstract We propose a new method for exemplar-free class incremental training of ViTs. The main challenge of exemplar-free continual learning is maintaining plasticity of the learner without causing catastrophic forgetting of previously learned tasks. This is often achieved via exemplar replay which can help recalibrate previous task classifiers to the feature drift which occurs when learning new tasks. Exemplar replay, however, comes at the cost of retaining samples from previous tasks which for many applications may not be possible. To address the problem of continual ViT training, we first propose gated class-attention to minimize the drift in the final ViT transformer block. This mask-based gating is applied to class-attention mechanism of the last transformer block and strongly regulates the weights crucial for previous tasks. Importantly, gated class-attention does not require the task-ID during inference, which distinguishes it from other parameter isolation methods. Secondly, we propose a new method of feature drift compensation that accommodates feature drift in the backbone when learning new tasks. The combination of gated class-attention and cascaded feature drift compensation allows for plasticity towards new tasks while limiting forgetting of previous ones. Extensive experiments performed on CIFAR-100, Tiny-ImageNet and ImageNet100 demonstrate that our exemplar-free method obtains competitive results when compared to rehearsal based ViT methods.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP Approved no
Call Number Admin @ si @ CYC2023 Serial 3981
Permanent link to this record
 

 
Author (down) Marco Buzzelli; Joost Van de Weijer; Raimondo Schettini
Title Learning Illuminant Estimation from Object Recognition Type Conference Article
Year 2018 Publication 25th International Conference on Image Processing Abbreviated Journal
Volume Issue Pages 3234 - 3238
Keywords Illuminant estimation; computational color constancy; semi-supervised learning; deep learning; convolutional neural networks
Abstract In this paper we present a deep learning method to estimate the illuminant of an image. Our model is not trained with illuminant annotations, but with the objective of improving performance on an auxiliary task such as object recognition. To the best of our knowledge, this is the first example of a deep
learning architecture for illuminant estimation that is trained without ground truth illuminants. We evaluate our solution on standard datasets for color constancy, and compare it with state of the art methods. Our proposal is shown to outperform most deep learning methods in a cross-dataset evaluation
setup, and to present competitive results in a comparison with parametric solutions.
Address Athens; Greece; October 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICIP
Notes LAMP; 600.109; 600.120 Approved no
Call Number Admin @ si @ BWS2018 Serial 3157
Permanent link to this record
 

 
Author (down) Marco Bellantonio; Mohammad A. Haque; Pau Rodriguez; Kamal Nasrollahi; Taisi Telve; Sergio Escalera; Jordi Gonzalez; Thomas B. Moeslund; Pejman Rasti; Golamreza Anbarjafari
Title Spatio-Temporal Pain Recognition in CNN-based Super-Resolved Facial Images Type Conference Article
Year 2016 Publication 23rd International Conference on Pattern Recognition Abbreviated Journal
Volume 10165 Issue Pages
Keywords
Abstract Automatic pain detection is a long expected solution to a prevalent medical problem of pain management. This is more relevant when the subject of pain is young children or patients with limited ability to communicate about their pain experience. Computer vision-based analysis of facial pain expression provides a way of efficient pain detection. When deep machine learning methods came into the scene, automatic pain detection exhibited even better performance. In this paper, we figured out three important factors to exploit in automatic pain detection: spatial information available regarding to pain in each of the facial video frames, temporal axis information regarding to pain expression pattern in a subject video sequence, and variation of face resolution. We employed a combination of convolutional neural network and recurrent neural network to setup a deep hybrid pain detection framework that is able to exploit both spatial and temporal pain information from facial video. In order to analyze the effect of different facial resolutions, we introduce a super-resolution algorithm to generate facial video frames with different resolution setups. We investigated the performance on the publicly available UNBC-McMaster Shoulder Pain database. As a contribution, the paper provides novel and important information regarding to the performance of a hybrid deep learning framework for pain detection in facial images of different resolution.
Address Cancun; Mexico; December 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPR
Notes HuPBA; ISE; 600.098; 600.119 Approved no
Call Number Admin @ si @ BHR2016 Serial 2902
Permanent link to this record
 

 
Author (down) Marcin Przewiezlikowski; Mateusz Pyla; Bartosz Zielinski; Bartłomiej Twardowski; Jacek Tabor; Marek Smieja
Title Augmentation-aware Self-supervised Learning with Guided Projector Type Miscellaneous
Year 2023 Publication arxiv Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Self-supervised learning (SSL) is a powerful technique for learning robust representations from unlabeled data. By learning to remain invariant to applied data augmentations, methods such as SimCLR and MoCo are able to reach quality on par with supervised approaches. However, this invariance may be harmful to solving some downstream tasks which depend on traits affected by augmentations used during pretraining, such as color. In this paper, we propose to foster sensitivity to such characteristics in the representation space by modifying the projector network, a common component of self-supervised architectures. Specifically, we supplement the projector with information about augmentations applied to images. In order for the projector to take advantage of this auxiliary conditioning when solving the SSL task, the feature extractor learns to preserve the augmentation information in its representations. Our approach, coined Conditional Augmentation-aware Self-supervised Learning (CASSLE), is directly applicable to typical joint-embedding SSL methods regardless of their objective functions. Moreover, it does not require major changes in the network architecture or prior knowledge of downstream tasks. In addition to an analysis of sensitivity towards different data augmentations, we conduct a series of experiments, which show that CASSLE improves over various SSL methods, reaching state-of-the-art performance in multiple downstream tasks.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP Approved no
Call Number Admin @ si @ PPZ2023 Serial 3971
Permanent link to this record
 

 
Author (down) Marcelo D. Pistarelli; Angel Sappa; Ricardo Toledo
Title Multispectral Stereo Image Correspondence Type Conference Article
Year 2013 Publication 15th International Conference on Computer Analysis of Images and Patterns Abbreviated Journal
Volume 8048 Issue Pages 217-224
Keywords
Abstract This paper presents a novel multispectral stereo image correspondence approach. It is evaluated using a stereo rig constructed with a visible spectrum camera and a long wave infrared spectrum camera. The novelty of the proposed approach lies on the usage of Hough space as a correspondence search domain. In this way it avoids searching for correspondence in the original multispectral image domains, where information is low correlated, and a common domain is used. The proposed approach is intended to be used in outdoor urban scenarios, where images contain large amount of edges. These edges are used as distinctive characteristics for the matching in the Hough space. Experimental results are provided showing the validity of the proposed approach.
Address York; uk; August 2013
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-40245-6 Medium
Area Expedition Conference CAIP
Notes ADAS; 600.055 Approved no
Call Number Admin @ si @ PST2013 Serial 2561
Permanent link to this record
 

 
Author (down) Marcel P. Lucassen; Theo Gevers; Arjan Gijsenij
Title Texture Affects Color Emotion Type Journal Article
Year 2011 Publication Color Research & Applications Abbreviated Journal CRA
Volume 36 Issue 6 Pages 426–436
Keywords color;texture;color emotion;observer variability;ranking
Abstract Several studies have recorded color emotions in subjects viewing uniform color (UC) samples. We conduct an experiment to measure and model how these color emotions change when texture is added to the color samples. Using a computer monitor, our subjects arrange samples along four scales: warm–cool, masculine–feminine, hard–soft, and heavy–light. Three sample types of increasing visual complexity are used: UC, grayscale textures, and color textures (CTs). To assess the intraobserver variability, the experiment is repeated after 1 week. Our results show that texture fully determines the responses on the Hard-Soft scale, and plays a role of decreasing weight for the masculine–feminine, heavy–light, and warm–cool scales. Using some 25,000 observer responses, we derive color emotion functions that predict the group-averaged scale responses from the samples' color and texture parameters. For UC samples, the accuracy of our functions is significantly higher (average R2 = 0.88) than that of previously reported functions applied to our data. The functions derived for CT samples have an accuracy of R2 = 0.80. We conclude that when textured samples are used in color emotion studies, the psychological responses may be strongly affected by texture. © 2010 Wiley Periodicals, Inc. Col Res Appl, 2010
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ALTRES;ISE Approved no
Call Number Admin @ si @ LGG2011 Serial 1844
Permanent link to this record
 

 
Author (down) Marc Sunset Perez; Marc Comino Trinidad; Dimosthenis Karatzas; Antonio Chica Calaf; Pere Pau Vazquez Alcocer
Title Development of general‐purpose projection‐based augmented reality systems Type Journal
Year 2016 Publication IADIs international journal on computer science and information systems Abbreviated Journal IADIs
Volume 11 Issue 2 Pages 1-18
Keywords
Abstract Despite the large amount of methods and applications of augmented reality, there is little homogenizatio n on the software platforms that support them. An exception may be the low level control software that is provided by some high profile vendors such as Qualcomm and Metaio. However, these provide fine grain modules for e.g. element tracking. We are more co ncerned on the application framework, that includes the control of the devices working together for the development of the AR experience. In this paper we describe the development of a software framework for AR setups. We concentrate on the modular design of the framework, but also on some hard problems such as the calibration stage, crucial for projection – based AR. The developed framework is suitable and has been tested in AR applications using camera – projector pairs, for both fixed and nomadic setups
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.084 Approved no
Call Number Admin @ si @ SCK2016 Serial 2890
Permanent link to this record
 

 
Author (down) Marc Serra; Olivier Penacchio; Robert Benavente; Maria Vanrell; Dimitris Samaras
Title The Photometry of Intrinsic Images Type Conference Article
Year 2014 Publication 27th IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal
Volume Issue Pages 1494-1501
Keywords
Abstract Intrinsic characterization of scenes is often the best way to overcome the illumination variability artifacts that complicate most computer vision problems, from 3D reconstruction to object or material recognition. This paper examines the deficiency of existing intrinsic image models to accurately account for the effects of illuminant color and sensor characteristics in the estimation of intrinsic images and presents a generic framework which incorporates insights from color constancy research to the intrinsic image decomposition problem. The proposed mathematical formulation includes information about the color of the illuminant and the effects of the camera sensors, both of which modify the observed color of the reflectance of the objects in the scene during the acquisition process. By modeling these effects, we get a “truly intrinsic” reflectance image, which we call absolute reflectance, which is invariant to changes of illuminant or camera sensors. This model allows us to represent a wide range of intrinsic image decompositions depending on the specific assumptions on the geometric properties of the scene configuration and the spectral properties of the light source and the acquisition system, thus unifying previous models in a single general framework. We demonstrate that even partial information about sensors improves significantly the estimated reflectance images, thus making our method applicable for a wide range of sensors. We validate our general intrinsic image framework experimentally with both synthetic data and natural images.
Address Columbus; Ohio; USA; June 2014
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPR
Notes CIC; 600.052; 600.051; 600.074 Approved no
Call Number Admin @ si @ SPB2014 Serial 2506
Permanent link to this record
 

 
Author (down) Marc Serra; Olivier Penacchio; Robert Benavente; Maria Vanrell
Title Names and Shades of Color for Intrinsic Image Estimation Type Conference Article
Year 2012 Publication 25th IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal
Volume Issue Pages 278-285
Keywords
Abstract In the last years, intrinsic image decomposition has gained attention. Most of the state-of-the-art methods are based on the assumption that reflectance changes come along with strong image edges. Recently, user intervention in the recovery problem has proved to be a remarkable source of improvement. In this paper, we propose a novel approach that aims to overcome the shortcomings of pure edge-based methods by introducing strong surface descriptors, such as the color-name descriptor which introduces high-level considerations resembling top-down intervention. We also use a second surface descriptor, termed color-shade, which allows us to include physical considerations derived from the image formation model capturing gradual color surface variations. Both color cues are combined by means of a Markov Random Field. The method is quantitatively tested on the MIT ground truth dataset using different error metrics, achieving state-of-the-art performance.
Address Providence, Rhode Island
Corporate Author Thesis
Publisher IEEE Xplore Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1063-6919 ISBN 978-1-4673-1226-4 Medium
Area Expedition Conference CVPR
Notes CIC Approved no
Call Number Admin @ si @ SPB2012 Serial 2026
Permanent link to this record
 

 
Author (down) Marc Serra
Title Estimating Intrinsic Images from Physical and Categorical Color Cues Type Report
Year 2010 Publication CVC Technical Report Abbreviated Journal
Volume 151 Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis Master's thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes CIC Approved no
Call Number Admin @ si @ Ser2010 Serial 1345
Permanent link to this record
 

 
Author (down) Marc Serra
Title Modeling, estimation and evaluation of intrinsic images considering color information Type Book Whole
Year 2015 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Image values are the result of a combination of visual information coming from multiple sources. Recovering information from the multiple factors thatproduced an image seems a hard and ill-posed problem. However, it is important to observe that humans develop the ability to interpret images and recognize and isolate specific physical properties of the scene.

Images describing a single physical characteristic of an scene are called intrinsic images. These images would benefit most computer vision tasks which are often affected by the multiple complex effects that are usually found in natural images (e.g. cast shadows, specularities, interreflections...).

In this thesis we analyze the problem of intrinsic image estimation from different perspectives, including the theoretical formulation of the problem, the visual cues that can be used to estimate the intrinsic components and the evaluation mechanisms of the problem.
Address September 2015
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Robert Benavente;Olivier Penacchio
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-943427-4-5 Medium
Area Expedition Conference
Notes CIC; 600.074 Approved no
Call Number Admin @ si @ Ser2015 Serial 2688
Permanent link to this record
 

 
Author (down) Marc Oliu; Sarah Adel Bargal; Stan Sclaroff; Xavier Baro; Sergio Escalera
Title Multi-varied Cumulative Alignment for Domain Adaptation Type Conference Article
Year 2022 Publication 6th International Conference on Image Analysis and Processing Abbreviated Journal
Volume 13232 Issue Pages 324–334
Keywords Domain Adaptation; Computer vision; Neural networks
Abstract Domain Adaptation methods can be classified into two basic families of approaches: non-parametric and parametric. Non-parametric approaches depend on statistical indicators such as feature covariances to minimize the domain shift. Non-parametric approaches tend to be fast to compute and require no additional parameters, but they are unable to leverage probability density functions with complex internal structures. Parametric approaches, on the other hand, use models of the probability distributions as surrogates in minimizing the domain shift, but they require additional trainable parameters to model these distributions. In this work, we propose a new statistical approach to minimizing the domain shift based on stochastically projecting and evaluating the cumulative density function in both domains. As with non-parametric approaches, there are no additional trainable parameters. As with parametric approaches, the internal structure of both domains’ probability distributions is considered, thus leveraging a higher amount of information when reducing the domain shift. Evaluation on standard datasets used for Domain Adaptation shows better performance of the proposed model compared to non-parametric approaches while being competitive with parametric ones. (Code available at: https://github.com/moliusimon/mca).
Address Indonesia; October 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICIAP
Notes HuPBA; no menciona Approved no
Call Number Admin @ si @ OAS2022 Serial 3777
Permanent link to this record
 

 
Author (down) Marc Oliu; Javier Selva; Sergio Escalera
Title Folded Recurrent Neural Networks for Future Video Prediction Type Conference Article
Year 2018 Publication 15th European Conference on Computer Vision Abbreviated Journal
Volume 11218 Issue Pages 745-761
Keywords
Abstract Future video prediction is an ill-posed Computer Vision problem that recently received much attention. Its main challenges are the high variability in video content, the propagation of errors through time, and the non-specificity of the future frames: given a sequence of past frames there is a continuous distribution of possible futures. This work introduces bijective Gated Recurrent Units, a double mapping between the input and output of a GRU layer. This allows for recurrent auto-encoders with state sharing between encoder and decoder, stratifying the sequence representation and helping to prevent capacity problems. We show how with this topology only the encoder or decoder needs to be applied for input encoding and prediction, respectively. This reduces the computational cost and avoids re-encoding the predictions when generating a sequence of frames, mitigating the propagation of errors. Furthermore, it is possible to remove layers from an already trained model, giving an insight to the role performed by each layer and making the model more explainable. We evaluate our approach on three video datasets, outperforming state of the art prediction results on MMNIST and UCF101, and obtaining competitive results on KTH with 2 and 3 times less memory usage and computational cost than the best scored approach.
Address Munich; September 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCV
Notes HUPBA; no menciona Approved no
Call Number Admin @ si @ OSE2018 Serial 3204
Permanent link to this record
 

 
Author (down) Marc Oliu; Ciprian Corneanu; Laszlo A. Jeni; Jeffrey F. Cohn; Takeo Kanade; Sergio Escalera
Title Continuous Supervised Descent Method for Facial Landmark Localisation Type Conference Article
Year 2016 Publication 13th Asian Conference on Computer Vision Abbreviated Journal
Volume 10112 Issue Pages 121-135
Keywords
Abstract Recent methods for facial landmark location perform well on close-to-frontal faces but have problems in generalising to large head rotations. In order to address this issue we propose a second order linear regression method that is both compact and robust against strong rotations. We provide a closed form solution, making the method fast to train. We test the method’s performance on two challenging datasets. The first has been intensely used by the community. The second has been specially generated from a well known 3D face dataset. It is considerably more challenging, including a high diversity of rotations and more samples than any other existing public dataset. The proposed method is compared against state-of-the-art approaches, including RCPR, CGPRT, LBF, CFSS, and GSDM. Results upon both datasets show that the proposed method offers state-of-the-art performance on near frontal view data, improves state-of-the-art methods on more challenging head rotation problems and keeps a compact model size.
Address Taipei; Taiwan; November 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ACCV
Notes HuPBA;MILAB; Approved no
Call Number Admin @ si @ OCJ2016 Serial 2838
Permanent link to this record
 

 
Author (down) Marc Oliu; Ciprian Corneanu; Kamal Nasrollahi; Olegs Nikisins; Sergio Escalera; Yunlian Sun; Haiqing Li; Zhenan Sun; Thomas B. Moeslund; Modris Greitans
Title Improved RGB-D-T based Face Recognition Type Journal Article
Year 2016 Publication IET Biometrics Abbreviated Journal BIO
Volume 5 Issue 4 Pages 297 - 303
Keywords
Abstract Reliable facial recognition systems are of crucial importance in various applications from entertainment to security. Thanks to the deep-learning concepts introduced in the field, a significant improvement in the performance of the unimodal facial recognition systems has been observed in the recent years. At the same time a multimodal facial recognition is a promising approach. This study combines the latest successes in both directions by applying deep learning convolutional neural networks (CNN) to the multimodal RGB, depth, and thermal (RGB-D-T) based facial recognition problem outperforming previously published results. Furthermore, a late fusion of the CNN-based recognition block with various hand-crafted features (local binary patterns, histograms of oriented gradients, Haar-like rectangular features, histograms of Gabor ordinal measures) is introduced, demonstrating even better recognition performance on a benchmark RGB-D-T database. The obtained results in this study show that the classical engineered features and CNN-based features can complement each other for recognition purposes.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA;MILAB; Approved no
Call Number Admin @ si @ OCN2016 Serial 2854
Permanent link to this record