Marco Pedersoli, Andrea Vedaldi, Jordi Gonzalez, & Xavier Roca. (2015). A coarse-to-fine approach for fast deformable object detection. PR - Pattern Recognition, 48(5), 1844–1853.
Abstract: We present a method that can dramatically accelerate object detection with part based models. The method is based on the observation that the cost of detection is likely to be dominated by the cost of matching each part to the image, and not by the cost of computing the optimal configuration of the parts as commonly assumed. Therefore accelerating detection requires minimizing the number of part-to-image comparisons. To this end we propose a multiple-resolutions hierarchical part based model and a corresponding coarse-to-fine inference procedure that recursively eliminates from the search space unpromising part placements. The method yields a ten-fold speedup over the standard dynamic programming approach and is complementary to the cascade-of-parts approach of [9]. Compared to the latter, our method does not have parameters to be determined empirically, which simplifies its use during the training of the model. Most importantly, the two techniques can be combined to obtain a very significant speedup, of two orders of magnitude in some cases. We evaluate our method extensively on the PASCAL VOC and INRIA datasets, demonstrating a very high increase in the detection speed with little degradation of the accuracy.
Pau Rodriguez, Guillem Cucurull, Josep M. Gonfaus, Xavier Roca, & Jordi Gonzalez. (2017). Age and gender recognition in the wild with deep attention. PR - Pattern Recognition, 72, 563–571.
Abstract: Face analysis in images in the wild still poses a challenge for automatic age and gender recognition tasks, mainly due to their high variability in resolution, deformation, and occlusion. Although performance has increased considerably thanks to Convolutional Neural Networks (CNNs), it is still far from optimal when compared to other image recognition tasks, mainly because of the high sensitivity of CNNs to facial variations. In this paper, inspired by biology and the recent success of attention mechanisms on visual question answering and fine-grained recognition, we propose a novel feedforward attention mechanism that is able to discover the most informative and reliable parts of a given face for improving age and gender classification. In particular, given a downsampled facial image, the proposed model is trained based on a novel end-to-end learning framework to extract the most discriminative patches from the original high-resolution image. Experimental validation on the standard Adience, Images of Groups, and MORPH II benchmarks shows that including attention mechanisms enhances the performance of CNNs in terms of robustness and accuracy.
Keywords: Age recognition; Gender recognition; Deep neural networks; Attention mechanisms
Parichehr Behjati, Pau Rodriguez, Carles Fernandez, Isabelle Hupont, Armin Mehri, & Jordi Gonzalez. (2023). Single image super-resolution based on directional variance attention network. PR - Pattern Recognition, 133, 108997.
Abstract: Recent advances in single image super-resolution (SISR) explore the power of deep convolutional neural networks (CNNs) to achieve better performance. However, most of the progress has been made by scaling CNN architectures, which usually raises computational demands and memory consumption. This makes modern architectures less applicable in practice. In addition, most CNN-based SR methods do not fully utilize the informative hierarchical features that are helpful for final image recovery. In order to address these issues, we propose a directional variance attention network (DiVANet), a computationally efficient yet accurate network for SISR. Specifically, we introduce a novel directional variance attention (DiVA) mechanism to capture long-range spatial dependencies and exploit inter-channel dependencies simultaneously for more discriminative representations. Furthermore, we propose a residual attention feature group (RAFG) for parallelizing attention and residual block computation. The output of each residual block is linearly fused at the RAFG output to provide access to the whole feature hierarchy. In parallel, DiVA extracts the most relevant features from the network for improving the final output and preventing information loss along the successive operations inside the network. Experimental results demonstrate the superiority of DiVANet over the state of the art on several datasets, while maintaining a relatively low computation and memory footprint. The code is available at https://github.com/pbehjatii/DiVANet.
Mikhail Mozerov, Ariel Amato, Xavier Roca, & Jordi Gonzalez. (2008). Trajectory Occlusion Handling with Multiple View Distance Minimisation Clustering. Optical Engineering, 47(4), 04702. DOI: 10.1117/1.2909665.
Mikhail Mozerov, & V. Kober. (2006). Impulse Noise Removal with Gradient Adaptive Neighborhoods. Optical Engineering, 45, 67003.