Records |
Author |
Y. Patel; Lluis Gomez; Marçal Rusiñol; Dimosthenis Karatzas; C.V. Jawahar |
Title |
Self-Supervised Visual Representations for Cross-Modal Retrieval |
Type |
Conference Article |
Year |
2019 |
Publication |
ACM International Conference on Multimedia Retrieval |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
182–186 |
Keywords |
|
Abstract |
Cross-modal retrieval methods have been significantly improved in last years with the use of deep neural networks and large-scale annotated datasets such as ImageNet and Places. However, collecting and annotating such datasets requires a tremendous amount of human effort and, besides, their annotations are limited to discrete sets of popular visual classes that may not be representative of the richer semantics found on large-scale cross-modal retrieval datasets. In this paper, we present a self-supervised cross-modal retrieval framework that leverages as training data the correlations between images and text on the entire set of Wikipedia articles. Our method consists in training a CNN to predict: (1) the semantic context of the article in which an image is more probable to appear as an illustration, and (2) the semantic context of its caption. Our experiments demonstrate that the proposed method is not only capable of learning discriminative visual representations for solving vision tasks like classification, but that the learned representations are better for cross-modal retrieval when compared to supervised pre-training of the network on the ImageNet dataset. |
Address |
Otawa; Canada; june 2019 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICMR |
Notes |
DAG; 600.121; 600.129 |
Approved |
no |
Call Number |
Admin @ si @ PGR2019 |
Serial |
3288 |
Permanent link to this record |
|
|
|
Author |
Ali Furkan Biten; Lluis Gomez; Marçal Rusiñol; Dimosthenis Karatzas |
Title |
Good News, Everyone! Context driven entity-aware captioning for news images |
Type |
Conference Article |
Year |
2019 |
Publication |
32nd IEEE Conference on Computer Vision and Pattern Recognition |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
12458-12467 |
Keywords |
|
Abstract |
Current image captioning systems perform at a merely descriptive level, essentially enumerating the objects in the scene and their relations. Humans, on the contrary, interpret images by integrating several sources of prior knowledge of the world. In this work, we aim to take a step closer to producing captions that offer a plausible interpretation of the scene, by integrating such contextual information into the captioning pipeline. For this we focus on the captioning of images used to illustrate news articles. We propose a novel captioning method that is able to leverage contextual information provided by the text of news articles associated with an image. Our model is able to selectively draw information from the article guided by visual cues, and to dynamically extend the output dictionary to out-of-vocabulary named entities that appear in the context source. Furthermore we introduce“ GoodNews”, the largest news image captioning dataset in the literature and demonstrate state-of-the-art results. |
Address |
Long beach; California; USA; june 2019 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
CVPR |
Notes |
DAG; 600.129; 600.135; 601.338; 600.121 |
Approved |
no |
Call Number |
Admin @ si @ BGR2019 |
Serial |
3289 |
Permanent link to this record |
|
|
|
Author |
Axel Barroso-Laguna; Edgar Riba; Daniel Ponsa; Krystian Mikolajczyk |
Title |
Key.Net: Keypoint Detection by Handcrafted and Learned CNN Filters |
Type |
Conference Article |
Year |
2019 |
Publication |
18th IEEE International Conference on Computer Vision |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
5835-5843 |
Keywords |
|
Abstract |
We introduce a novel approach for keypoint detection task that combines handcrafted and learned CNN filters within a shallow multi-scale architecture. Handcrafted filters provide anchor structures for learned filters, which localize, score and rank repeatable features. Scale-space representation is used within the network to extract keypoints at different levels. We design a loss function to detect robust features that exist across a range of scales and to maximize the repeatability score. Our Key.Net model is trained on data synthetically created from ImageNet and evaluated on HPatches benchmark. Results show that our approach outperforms state-of-the-art detectors in terms of repeatability, matching performance and complexity. |
Address |
Seul; Corea; October 2019 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICCV |
Notes |
MSIAU; 600.122 |
Approved |
no |
Call Number |
Admin @ si @ BRP2019 |
Serial |
3290 |
Permanent link to this record |
|
|
|
Author |
Edgar Riba; D. Mishkin; Daniel Ponsa; E. Rublee; G. Bradski |
Title |
Kornia: an Open Source Differentiable Computer Vision Library for PyTorch |
Type |
Conference Article |
Year |
2020 |
Publication |
IEEE Winter Conference on Applications of Computer Vision |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
|
Address |
Aspen; Colorado; USA; March 2020 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
WACV |
Notes |
MSIAU; 600.122; 600.130 |
Approved |
no |
Call Number |
Admin @ si @ RMP2020 |
Serial |
3291 |
Permanent link to this record |
|
|
|
Author |
Javad Zolfaghari Bengar; Abel Gonzalez-Garcia; Gabriel Villalonga; Bogdan Raducanu; Hamed H. Aghdam; Mikhail Mozerov; Antonio Lopez; Joost Van de Weijer |
Title |
Temporal Coherence for Active Learning in Videos |
Type |
Conference Article |
Year |
2019 |
Publication |
IEEE International Conference on Computer Vision Workshops |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
914-923 |
Keywords |
|
Abstract |
Autonomous driving systems require huge amounts of data to train. Manual annotation of this data is time-consuming and prohibitively expensive since it involves human resources. Therefore, active learning emerged as an alternative to ease this effort and to make data annotation more manageable. In this paper, we introduce a novel active learning approach for object detection in videos by exploiting temporal coherence. Our active learning criterion is based on the estimated number of errors in terms of false positives and false negatives. The detections obtained by the object detector are used to define the nodes of a graph and tracked forward and backward to temporally link the nodes. Minimizing an energy function defined on this graphical model provides estimates of both false positives and false negatives. Additionally, we introduce a synthetic video dataset, called SYNTHIA-AL, specially designed to evaluate active learning for video object detection in road scenes. Finally, we show that our approach outperforms active learning baselines tested on two datasets. |
Address |
Seul; Corea; October 2019 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICCVW |
Notes |
LAMP; ADAS; 600.124; 602.200; 600.118; 600.120; 600.141 |
Approved |
no |
Call Number |
Admin @ si @ ZGV2019 |
Serial |
3294 |
Permanent link to this record |
|
|
|
Author |
Corina Krauter; Ursula Reiter; Albrecht Schmidt; Marc Masana; Rudolf Stollberger; Michael Fuchsjager; Gert Reiter |
Title |
Objective extraction of the temporal evolution of the mitral valve vortex ring from 4D flow MRI |
Type |
Conference Article |
Year |
2019 |
Publication |
27th Annual Meeting & Exhibition of the International Society for Magnetic Resonance in Medicine |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
The mitral valve vortex ring is a promising flow structure for analysis of diastolic function, however, methods for objective extraction of its formation to dissolution are lacking. We present a novel algorithm for objective extraction of the temporal evolution of the mitral valve vortex ring from magnetic resonance 4D flow data and validated the method against visual analysis. The algorithm successfully extracted mitral valve vortex rings during both early- and late-diastolic filling and agreed substantially with visual assessment. Early-diastolic mitral valve vortex ring properties differed between healthy subjects and patients with ischemic heart disease. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ISMRM |
Notes |
LAMP; 600.120 |
Approved |
no |
Call Number |
Admin @ si @ KRS2019 |
Serial |
3300 |
Permanent link to this record |
|
|
|
Author |
Parichehr Behjati Ardakani; Pau Rodriguez; Armin Mehri; Isabelle Hupont; Carles Fernandez; Jordi Gonzalez |
Title |
OverNet: Lightweight Multi-Scale Super-Resolution with Overscaling Network |
Type |
Conference Article |
Year |
2021 |
Publication |
IEEE Winter Conference on Applications of Computer Vision |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
2693-2702 |
Keywords |
|
Abstract |
Super-resolution (SR) has achieved great success due to the development of deep convolutional neural networks (CNNs). However, as the depth and width of the networks increase, CNN-based SR methods have been faced with the challenge of computational complexity in practice. More- over, most SR methods train a dedicated model for each target resolution, losing generality and increasing memory requirements. To address these limitations we introduce OverNet, a deep but lightweight convolutional network to solve SISR at arbitrary scale factors with a single model. We make the following contributions: first, we introduce a lightweight feature extractor that enforces efficient reuse of information through a novel recursive structure of skip and dense connections. Second, to maximize the performance of the feature extractor, we propose a model agnostic reconstruction module that generates accurate high-resolution images from overscaled feature maps obtained from any SR architecture. Third, we introduce a multi-scale loss function to achieve generalization across scales. Experiments show that our proposal outperforms previous state-of-the-art approaches in standard benchmarks, while maintaining relatively low computation and memory requirements. |
Address |
Virtual; January 2021 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
WACV |
Notes |
ISE; 600.119; 600.098 |
Approved |
no |
Call Number |
Admin @ si @ BRM2021 |
Serial |
3512 |
Permanent link to this record |
|
|
|
Author |
Hamed H. Aghdam; Abel Gonzalez-Garcia; Joost Van de Weijer; Antonio Lopez |
Title |
Active Learning for Deep Detection Neural Networks |
Type |
Conference Article |
Year |
2019 |
Publication |
18th IEEE International Conference on Computer Vision |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
3672-3680 |
Keywords |
|
Abstract |
The cost of drawing object bounding boxes (ie labeling) for millions of images is prohibitively high. For instance, labeling pedestrians in a regular urban image could take 35 seconds on average. Active learning aims to reduce the cost of labeling by selecting only those images that are informative to improve the detection network accuracy. In this paper, we propose a method to perform active learning of object detectors based on convolutional neural networks. We propose a new image-level scoring process to rank unlabeled images for their automatic selection, which clearly outperforms classical scores. The proposed method can be applied to videos and sets of still images. In the former case, temporal selection rules can complement our scoring process. As a relevant use case, we extensively study the performance of our method on the task of pedestrian detection. Overall, the experiments show that the proposed method performs better than random selection. |
Address |
Seul; Korea; October 2019 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICCV |
Notes |
ADAS; LAMP; 600.124; 600.109; 600.141; 600.120; 600.118 |
Approved |
no |
Call Number |
Admin @ si @ AGW2019 |
Serial |
3321 |
Permanent link to this record |
|
|
|
Author |
Felipe Codevilla; Eder Santana; Antonio Lopez; Adrien Gaidon |
Title |
Exploring the Limitations of Behavior Cloning for Autonomous Driving |
Type |
Conference Article |
Year |
2019 |
Publication |
18th IEEE International Conference on Computer Vision |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
9328-9337 |
Keywords |
|
Abstract |
Driving requires reacting to a wide variety of complex environment conditions and agent behaviors. Explicitly modeling each possible scenario is unrealistic. In contrast, imitation learning can, in theory, leverage data from large fleets of human-driven cars. Behavior cloning in particular has been successfully used to learn simple visuomotor policies end-to-end, but scaling to the full spectrum of driving behaviors remains an unsolved problem. In this paper, we propose a new benchmark to experimentally investigate the scalability and limitations of behavior cloning. We show that behavior cloning leads to state-of-the-art results, executing complex lateral and longitudinal maneuvers, even in unseen environments, without being explicitly programmed to do so. However, we confirm some limitations of the behavior cloning approach: some well-known limitations (eg, dataset bias and overfitting), new generalization issues (eg, dynamic objects and the lack of a causal modeling), and training instabilities, all requiring further research before behavior cloning can graduate to real-world driving. The code, dataset, benchmark, and agent studied in this paper can be found at github. |
Address |
Seul; Korea; October 2019 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICCV |
Notes |
ADAS; 600.124; 600.118 |
Approved |
no |
Call Number |
Admin @ si @ CSL2019 |
Serial |
3322 |
Permanent link to this record |
|
|
|
Author |
Zhengying Liu; Isabelle Guyon; Julio C. S. Jacques Junior; Meysam Madadi; Sergio Escalera; Adrien Pavao; Hugo Jair Escalante; Wei-Wei Tu; Zhen Xu; Sebastien Treguer |
Title |
AutoCV Challenge Design and Baseline Results |
Type |
Conference Article |
Year |
2019 |
Publication |
La Conference sur l’Apprentissage Automatique |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
We present the design and beta tests of a new machine learning challenge called AutoCV (for Automated Computer Vision), which is the first event in a series of challenges we are planning on the theme of Automated Deep Learning. We target applications for which Deep Learning methods have had great success in the past few years, with the aim of pushing the state of the art in fully automated methods to design the architecture of neural networks and train them without any human intervention. The tasks are restricted to multi-label image classification problems, from domains including medical, areal, people, object, and handwriting imaging. Thus the type of images will vary a lot in scales, textures, and structure. Raw data are provided (no features extracted), but all datasets are formatted in a uniform tensor manner (although images may have fixed or variable sizes within a dataset). The participants's code will be blind tested on a challenge platform in a controlled manner, with restrictions on training and test time and memory limitations. The challenge is part of the official selection of IJCNN 2019. |
Address |
Toulouse; Francia; July 2019 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
HUPBA; no proj |
Approved |
no |
Call Number |
Admin @ si @ LGJ2019 |
Serial |
3323 |
Permanent link to this record |
|
|
|
Author |
Reza Azad; Maryam Asadi Aghbolaghi; Mahmood Fathy; Sergio Escalera |
Title |
Bi-Directional ConvLSTM U-Net with Densley Connected Convolutions |
Type |
Conference Article |
Year |
2019 |
Publication |
Visual Recognition for Medical Images workshop |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
406-415 |
Keywords |
|
Abstract |
In recent years, deep learning-based networks have achieved state-of-the-art performance in medical image segmentation. Among the existing networks, U-Net has been successfully applied on medical image segmentation. In this paper, we propose an extension of U-Net, Bi-directional ConvLSTM U-Net with Densely connected convolutions (BCDU-Net), for medical image segmentation, in which we take full advantages of U-Net, bi-directional ConvLSTM (BConvLSTM) and the mechanism of dense convolutions. Instead of a simple concatenation in the skip connection of U-Net, we employ BConvLSTM to combine the feature maps extracted from the corresponding encoding path and the previous decoding up-convolutional layer in a non-linear way. To strengthen feature propagation and encourage feature reuse, we use densely connected convolutions in the last convolutional layer of the encoding path. Finally, we can accelerate the convergence speed of the proposed network by employing batch normalization (BN). The proposed model is evaluated on three datasets of: retinal blood vessel segmentation, skin lesion segmentation, and lung nodule segmentation, achieving state-of-the-art performance. |
Address |
Seul; Korea; October 2019 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICCVW |
Notes |
HUPBA; no proj |
Approved |
no |
Call Number |
Admin @ si @ AAF2019 |
Serial |
3324 |
Permanent link to this record |
|
|
|
Author |
Maria Ines Torres; Javier Mikel Olaso; Cesar Montenegro; Riberto Santana; A.Vazquez; Raquel Justo; J.A.Lozano; Stephan Schogl; Gerard Chollet; Nazim Dugan; M.Irvine; N.Glackin; C.Pickard; Anna Esposito; Gennaro Cordasco; Alda Troncone; Dijana Petrovska Delacretaz; Aymen Mtibaa; Mohamed Amine Hmani; M.S.Korsnes; L.J.Martinussen; Sergio Escalera; C.Palmero Cantariño; Olivier Deroo; O.Gordeeva; Jofre Tenorio Laranga; E.Gonzalez Fraile; Begoña Fernandez Ruanova; A.Gonzalez Pinto |
Title |
The EMPATHIC project: mid-term achievements |
Type |
Conference Article |
Year |
2019 |
Publication |
12th ACM International Conference on PErvasive Technologies Related to Assistive Environments |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
629-638 |
Keywords |
|
Abstract |
Maria Ines Torres; Javier Mikel Olaso, César Montenegro, Riberto Santana, A. Vázquez, Raquel Justo, J. A. Lozano, Stephan Schlögl, Gérard Chollet, Nazim Dugan, M. Irvine, N. Glackin, C. Pickard, Anna Esposito, Gennaro Cordasco, Alda Troncone, Dijana Petrovska-Delacrétaz, Aymen Mtibaa, Mohamed Amine Hmani, M. S. Korsnes, L. J. Martinussen, Sergio Escalera, C. Palmero Cantariño, Olivier Deroo, O. Gordeeva, Jofre Tenorio-Laranga, E. Gonzalez-Fraile, Begoña Fernández-Ruanova, A. Gonzalez-Pinto |
Address |
Rhodes Greece; June 2019 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
PETRA |
Notes |
HUPBA; no proj |
Approved |
no |
Call Number |
Admin @ si @ TOM2019 |
Serial |
3325 |
Permanent link to this record |
|
|
|
Author |
Daniel Sanchez; Meysam Madadi; Marc Oliu; Sergio Escalera |
Title |
Multi-task human analysis in still images: 2D/3D pose, depth map, and multi-part segmentation |
Type |
Conference Article |
Year |
2019 |
Publication |
14th IEEE International Conference on Automatic Face and Gesture Recognition |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
While many individual tasks in the domain of human analysis have recently received an accuracy boost from deep learning approaches, multi-task learning has mostly been ignored due to a lack of data. New synthetic datasets are being released, filling this gap with synthetic generated data. In this work, we analyze four related human analysis tasks in still images in a multi-task scenario by leveraging such datasets. Specifically, we study the correlation of 2D/3D pose estimation, body part segmentation and full-body depth estimation. These tasks are learned via the well-known Stacked Hourglass module such that each of the task-specific streams shares information with the others. The main goal is to analyze how training together these four related tasks can benefit each individual task for a better generalization. Results on the newly released SURREAL dataset show that all four tasks benefit from the multi-task approach, but with different combinations of tasks: while combining all four tasks improves 2D pose estimation the most, 2D pose improves neither 3D pose nor full-body depth estimation. On the other hand 2D parts segmentation can benefit from 2D pose but not from 3D pose. In all cases, as expected, the maximum improvement is achieved on those human body parts that show more variability in terms of spatial distribution, appearance and shape, e.g. wrists and ankles. |
Address |
Lille; France; May 2019 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
FG |
Notes |
HUPBA; no proj |
Approved |
no |
Call Number |
Admin @ si @ SMO2019 |
Serial |
3326 |
Permanent link to this record |
|
|
|
Author |
Ajian Liu; Jun Wan; Sergio Escalera; Hugo Jair Escalante; Zichang Tan; Qi Yuan; Kai Wang; Chi Lin; Guodong Guo; Isabelle Guyon; Stan Z. Li |
Title |
Multi-Modal Face Anti-Spoofing Attack Detection Challenge at CVPR2019 |
Type |
Conference Article |
Year |
2019 |
Publication |
IEEE International Conference on Computer Vision and Pattern Recognition-Workshop |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
Anti-spoofing attack detection is critical to guarantee the security of face-based authentication and facial analysis systems. Recently, a multi-modal face anti-spoofing dataset, CASIA-SURF, has been released with the goal of boosting research in this important topic. CASIA-SURF is the largest public data set for facial anti-spoofing attack detection in terms of both, diversity and modalities: it comprises 1,000 subjects and 21,000 video samples. We organized a challenge around this novel resource to boost research in the subject. The Chalearn LAP multi-modal face anti-spoofing attack detection challenge attracted more than 300 teams for the development phase with a total of 13 teams qualifying for the final round. This paper presents an overview of the challenge, including its design, evaluation protocol and a summary of results. We analyze the top ranked solutions and draw conclusions derived from the competition. In addition we outline future work directions. |
Address |
California; June 2019 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
CVPRW |
Notes |
HuPBA; no proj |
Approved |
no |
Call Number |
Admin @ si @ LWE2019 |
Serial |
3329 |
Permanent link to this record |
|
|
|
Author |
Shifeng Zhang; Xiaobo Wang; Ajian Liu; Chenxu Zhao; Jun Wan; Sergio Escalera; Hailin Shi; Zezheng Wang; Stan Z. Li |
Title |
A Dataset and Benchmark for Large-scale Multi-modal Face Anti-spoofing |
Type |
Conference Article |
Year |
2019 |
Publication |
32nd IEEE Conference on Computer Vision and Pattern Recognition |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
919-928 |
Keywords |
|
Abstract |
Face anti-spoofing is essential to prevent face recognition systems from a security breach. Much of the progresses have been made by the availability of face anti-spoofing benchmark datasets in recent years. However, existing face anti-spoofing benchmarks have limited number of subjects (≤170) and modalities (≤2), which hinder the further development of the academic community. To facilitate face anti-spoofing research, we introduce a large-scale multi-modal dataset, namely CASIA-SURF, which is the largest publicly available dataset for face anti-spoofing in terms of both subjects and visual modalities. Specifically, it consists of 1,000 subjects with 21,000 videos and each sample has 3 modalities (i.e., RGB, Depth and IR). We also provide a measurement set, evaluation protocol and training/validation/testing subsets, developing a new benchmark for face anti-spoofing. Moreover, we present a new multi-modal fusion method as baseline, which performs feature re-weighting to select the more informative channel features while suppressing the less useful ones for each modal. Extensive experiments have been conducted on the proposed dataset to verify its significance and generalization capability. The dataset is available at https://sites.google.com/qq.com/chalearnfacespoofingattackdete/. |
Address |
California; June 2019 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
CVPR |
Notes |
HuPBA; no proj |
Approved |
no |
Call Number |
Admin @ si @ ZWL2019 |
Serial |
3331 |
Permanent link to this record |