|
Adria Ruiz, Joost Van de Weijer, & Xavier Binefa. (2015). From emotions to action units with hidden and semi-hidden-task learning. In 16th IEEE International Conference on Computer Vision (pp. 3703–3711).
Abstract: Limited annotated training data is a challenging problem in Action Unit recognition. In this paper, we investigate how the use of large databases labelled according to the 6 universal facial expressions can increase the generalization ability of Action Unit classifiers. For this purpose, we propose a novel learning framework: Hidden-Task Learning. HTL aims to learn a set of Hidden-Tasks (Action Units) for which samples are not available but, in contrast, training data is easier to obtain from a set of related Visible-Tasks (Facial Expressions). To that end, HTL is able to exploit prior knowledge about the relation between Hidden and Visible-Tasks. In our case, we base this prior knowledge on empirical psychological studies providing statistical correlations between Action Units and universal facial expressions. Additionally, we extend HTL to Semi-Hidden Task Learning (SHTL) assuming that Action Unit training samples are also provided. Performing exhaustive experiments over four different datasets, we show that HTL and SHTL improve the generalization ability of AU classifiers by training them with additional facial expression data. Additionally, we show that SHTL achieves competitive performance compared with state-of-the-art Transductive Learning approaches which face the problem of limited training data by using unlabelled test samples during training.
|
|
|
Mikhail Mozerov, & Joost Van de Weijer. (2017). Improved Recursive Geodesic Distance Computation for Edge Preserving Filter. TIP - IEEE Transactions on Image Processing, 26(8), 3696–3706.
Abstract: All known recursive filters based on the geodesic distance affinity are realized by two 1D recursions applied in two orthogonal directions of the image plane. The 2D extension of the filter is not valid and has theoretical drawbacks, which lead to known artifacts. In this paper, a maximum influence propagation method is proposed to approximate the 2D extension for the geodesic distance-based recursive filter. The method partially overcomes the drawbacks of the 1D recursion approach. We show that our improved recursion better approximates the true geodesic distance filter, and the application of this improved filter for image denoising outperforms the existing recursive implementation of the geodesic distance. As an application, we consider a geodesic distance-based filter for image denoising. Experimental evaluation of our denoising method demonstrates comparable and, for several test images, better results than state-of-the-art approaches, while our algorithm is considerably faster, with computational complexity O(8P).
Keywords: Geodesic distance filter; color image filtering; image enhancement
|
|
|
P. Ricaurte, C. Chilan, Cristhian A. Aguilera-Carrasco, Boris X. Vintimilla, & Angel Sappa. (2014). Feature Point Descriptors: Infrared and Visible Spectra. SENS - Sensors, 14(2), 3690–3701.
Abstract: This manuscript evaluates the behavior of classical feature point descriptors when they are used in images from the long-wave infrared spectral band and compares them with the results obtained in the visible spectrum. Robustness to changes in rotation, scaling, blur, and additive noise is analyzed using a state-of-the-art framework. Experimental results using a cross-spectral outdoor image data set are presented and conclusions from these experiments are given.
|
|
|
Hamed H. Aghdam, Abel Gonzalez-Garcia, Joost Van de Weijer, & Antonio Lopez. (2019). Active Learning for Deep Detection Neural Networks. In 18th IEEE International Conference on Computer Vision (pp. 3672–3680).
Abstract: The cost of drawing object bounding boxes (i.e., labeling) for millions of images is prohibitively high. For instance, labeling pedestrians in a regular urban image could take 35 seconds on average. Active learning aims to reduce the cost of labeling by selecting only those images that are informative to improve the detection network accuracy. In this paper, we propose a method to perform active learning of object detectors based on convolutional neural networks. We propose a new image-level scoring process to rank unlabeled images for their automatic selection, which clearly outperforms classical scores. The proposed method can be applied to videos and sets of still images. In the former case, temporal selection rules can complement our scoring process. As a relevant use case, we extensively study the performance of our method on the task of pedestrian detection. Overall, the experiments show that the proposed method performs better than random selection.
|
|
|
Fahad Shahbaz Khan, Joost Van de Weijer, Muhammad Anwer Rao, Michael Felsberg, & Carlo Gatta. (2014). Semantic Pyramids for Gender and Action Recognition. TIP - IEEE Transactions on Image Processing, 23(8), 3633–3645.
Abstract: Person description is a challenging problem in computer vision. We investigated two major aspects of person description: 1) gender and 2) action recognition in still images. Most state-of-the-art approaches for gender and action recognition rely on the description of a single body part, such as face or full-body. However, relying on a single body part is suboptimal due to significant variations in scale, viewpoint, and pose in real-world images. This paper proposes a semantic pyramid approach for pose normalization. Our approach is fully automatic and based on combining information from full-body, upper-body, and face regions for gender and action recognition in still images. The proposed approach does not require any annotations for upper-body and face of a person. Instead, we rely on pretrained state-of-the-art upper-body and face detectors to automatically extract semantic information of a person. Given multiple bounding boxes from each body part detector, we then propose a simple method to select the best candidate bounding box, which is used for feature extraction. Finally, the extracted features from the full-body, upper-body, and face regions are combined into a single representation for classification. To validate the proposed approach for gender recognition, experiments are performed on three large data sets namely: 1) human attribute; 2) head-shoulder; and 3) proxemics. For action recognition, we perform experiments on four data sets most used for benchmarking action recognition in still images: 1) Sports; 2) Willow; 3) PASCAL VOC 2010; and 4) Stanford-40. Our experiments clearly demonstrate that the proposed approach, despite its simplicity, outperforms state-of-the-art methods for gender and action recognition.
|
|
|
Kai Wang, Luis Herranz, & Joost Van de Weijer. (2021). Continual learning in cross-modal retrieval. In 2nd CLVISION workshop (pp. 3628–3638).
Abstract: Multimodal representations and continual learning are two areas closely related to human intelligence. The former considers the learning of shared representation spaces where information from different modalities can be compared and integrated (we focus on cross-modal retrieval between language and visual representations). The latter studies how to prevent forgetting a previously learned task when learning a new one. While humans excel in these two aspects, deep neural networks are still quite limited. In this paper, we propose a combination of both problems into a continual cross-modal retrieval setting, where we study how the catastrophic interference caused by new tasks impacts the embedding spaces and their cross-modal alignment required for effective retrieval. We propose a general framework that decouples the training, indexing and querying stages. We also identify and study different factors that may lead to forgetting, and propose tools to alleviate it. We found that the indexing stage plays an important role and that simply avoiding reindexing the database with updated embedding networks can lead to significant gains. We evaluated our methods on two image-text retrieval datasets, obtaining significant gains with respect to the fine-tuning baseline.
|
|
|
Vincenzo Lomonaco, Lorenzo Pellegrini, Andrea Cossu, Antonio Carta, Gabriele Graffieti, Tyler L. Hayes, et al. (2021). Avalanche: an End-to-End Library for Continual Learning. In 34th IEEE Conference on Computer Vision and Pattern Recognition Workshops (pp. 3595–3605).
Abstract: Learning continually from non-stationary data streams is a long-standing goal and a challenging problem in machine learning. Recently, we have witnessed a renewed and fast-growing interest in continual learning, especially within the deep learning community. However, algorithmic solutions are often difficult to re-implement, evaluate and port across different settings, where even results on standard benchmarks are hard to reproduce. In this work, we propose Avalanche, an open-source end-to-end library for continual learning research based on PyTorch. Avalanche is designed to provide a shared and collaborative codebase for fast prototyping, training, and reproducible evaluation of continual learning algorithms.
|
|
|
Michal Drozdzal, Jordi Vitria, Santiago Segui, Carolina Malagelada, Fernando Azpiroz, & Petia Radeva. (2014). Intestinal event segmentation for endoluminal video analysis. In 21st IEEE International Conference on Image Processing (pp. 3592–3596).
|
|
|
German Ros, Jesus Martinez del Rincon, & Gines Garcia-Mateos. (2012). Articulated Particle Filter for Hand Tracking. In 21st International Conference on Pattern Recognition (pp. 3581–3585).
Abstract: This paper proposes a new version of the Particle Filter, called Articulated Particle Filter (ArPF), which has been specifically designed for efficient sampling of hierarchical spaces generated by articulated objects. Our approach decomposes the articulated motion into layers for efficiency purposes, making use of a careful modeling of the diffusion noise along with its propagation through the articulations. This increases accuracy and prevents divergence. The algorithm is tested on hand tracking due to its complex hierarchical articulated nature. For this purpose, a new dataset generation tool for quantitative evaluation is also presented in this paper.
|
|
|
Marc Masana, Tinne Tuytelaars, & Joost Van de Weijer. (2021). Ternary Feature Masks: zero-forgetting for task-incremental learning. In 34th IEEE Conference on Computer Vision and Pattern Recognition Workshops (pp. 3565–3574).
Abstract: We propose an approach without any forgetting to continual learning for the task-aware regime, where at inference the task-label is known. By using ternary masks we can upgrade a model to new tasks, reusing knowledge from previous tasks while not forgetting anything about them. Using masks prevents both catastrophic forgetting and backward transfer. We argue -- and show experimentally -- that avoiding the former largely compensates for the lack of the latter, which is rarely observed in practice. In contrast to earlier works, our masks are applied to the features (activations) of each layer instead of the weights. This considerably reduces the number of mask parameters for each new task, by more than three orders of magnitude for most networks. The encoding of the ternary masks into two bits per feature creates very little overhead to the network, avoiding scalability issues. To allow already learned features to adapt to the current task without changing the behavior of these features for previous tasks, we introduce task-specific feature normalization. Extensive experiments on several fine-grained datasets and ImageNet show that our method outperforms current state-of-the-art while reducing memory overhead in comparison to weight-based approaches.
|
|
|
Simone Balocco, O. Basset, G. Courbebaisse, E. Boni, Alejandro F. Frangi, P. Tortoli, et al. (2010). Estimation Of Viscoelastic Properties Of Vessel Walls Using a Computational Model and Doppler Ultrasound. PMB - Physics in Medicine and Biology, 55(12), 3557–3575.
Abstract: Human arteries affected by atherosclerosis are characterized by altered wall viscoelastic properties. The possibility of noninvasively assessing arterial viscoelasticity in vivo would significantly contribute to the early diagnosis and prevention of this disease. This paper presents a noniterative technique to estimate the viscoelastic parameters of a vascular wall Zener model. The approach requires the simultaneous measurement of flow variations and wall displacements, which can be provided by suitable ultrasound Doppler instruments. Viscoelastic parameters are estimated by fitting the theoretical constitutive equations to the experimental measurements using an ARMA parameter approach. The accuracy and sensitivity of the proposed method are tested using reference data generated by numerical simulations of arterial pulsation in which the physiological conditions and the viscoelastic parameters of the model can be suitably varied. The estimated values quantitatively agree with the reference values, showing that the only parameter affected by changing the physiological conditions is viscosity, whose relative error was about 27% even when a poor signal-to-noise ratio is simulated. Finally, the feasibility of the method is illustrated through three measurements made at different flow regimes on a cylindrical vessel phantom, yielding a parameter mean estimation error of 25%.
|
|
|
Victor M. Campello, Polyxeni Gkontra, Cristian Izquierdo, Carlos Martin-Isla, Alireza Sojoudi, Peter M. Full, et al. (2021). Multi-Centre, Multi-Vendor and Multi-Disease Cardiac Segmentation: The M&Ms Challenge. TMI - IEEE Transactions on Medical Imaging, 40(12), 3543–3554.
Abstract: The emergence of deep learning has considerably advanced the state-of-the-art in cardiac magnetic resonance (CMR) segmentation. Many techniques have been proposed over the last few years, bringing the accuracy of automated segmentation close to human performance. However, these models have been all too often trained and validated using cardiac imaging samples from single clinical centres or homogeneous imaging protocols. This has prevented the development and validation of models that are generalizable across different clinical centres, imaging conditions or scanner vendors. To promote further research and scientific benchmarking in the field of generalizable deep learning for cardiac segmentation, this paper presents the results of the Multi-Centre, Multi-Vendor and Multi-Disease Cardiac Segmentation (M&Ms) Challenge, which was recently organized as part of the MICCAI 2020 Conference. A total of 14 teams submitted different solutions to the problem, combining various baseline models, data augmentation strategies, and domain adaptation techniques. The obtained results indicate the importance of intensity-driven data augmentation, as well as the need for further research to improve generalizability towards unseen scanner vendors or new imaging protocols. Furthermore, we present a new resource of 375 heterogeneous CMR datasets acquired by using four different scanner vendors in six hospitals and three different countries (Spain, Canada and Germany), which we provide as open-access for the community to enable future research in the field.
|
|
|
Federico Bartoli, Giuseppe Lisanti, Svebor Karaman, Andrew Bagdanov, & Alberto del Bimbo. (2014). Unsupervised scene adaptation for faster multi-scale pedestrian detection. In 22nd International Conference on Pattern Recognition (pp. 3534–3539).
|
|
|
Angel Sappa, & Mohammad Rouhani. (2009). Efficient Distance Estimation for Fitting Implicit Quadric Surfaces. In 16th IEEE International Conference on Image Processing (pp. 3521–3524).
Abstract: This paper presents a novel approach for estimating the shortest Euclidean distance from a given point to the corresponding implicit quadric fitting surface. It first estimates the orthogonal orientation to the surface from the given point; then the shortest distance is directly estimated by intersecting the implicit surface with a line passing through the given point according to the estimated orthogonal orientation. The proposed orthogonal distance estimation is easily obtained without increasing computational complexity; hence it can be used in error minimization surface fitting frameworks. Comparisons of the proposed metric with previous approaches are provided to show both improvements in CPU time as well as in the accuracy of the obtained results. Surfaces fitted by using the proposed geometric distance estimation and state of the art metrics are presented to show the viability of the proposed approach.
|
|
|
Filip Szatkowski, Mateusz Pyla, Marcin Przewięzlikowski, Sebastian Cygert, Bartłomiej Twardowski, & Tomasz Trzcinski. (2023). Adapt Your Teacher: Improving Knowledge Distillation for Exemplar-Free Continual Learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops (pp. 3512–3517).
Abstract: In this work, we investigate exemplar-free class incremental learning (CIL) with knowledge distillation (KD) as a regularization strategy, aiming to prevent forgetting. KD-based methods are successfully used in CIL, but they often struggle to regularize the model without access to exemplars of the training data from previous tasks. Our analysis reveals that this issue originates from substantial representation shifts in the teacher network when dealing with out-of-distribution data. This causes large errors in the KD loss component, leading to performance degradation in CIL. Inspired by recent test-time adaptation methods, we introduce Teacher Adaptation (TA), a method that concurrently updates the teacher and the main model during incremental training. Our method seamlessly integrates with KD-based CIL approaches and allows for consistent enhancement of their performance across multiple exemplar-free CIL benchmarks.
|
|