|   | 
Details
   web
Records
Author Zhengying Liu; Zhen Xu; Shangeth Rajaa; Meysam Madadi; Julio C. S. Jacques Junior; Sergio Escalera; Adrien Pavao; Sebastien Treguer; Wei-Wei Tu; Isabelle Guyon
Title Towards Automated Deep Learning: Analysis of the AutoDL challenge series 2019 Type Conference Article
Year 2020 Publication Proceedings of Machine Learning Research Abbreviated Journal
Volume 123 Issue Pages (down) 242-252
Keywords
Abstract We present the design and results of recent competitions in Automated Deep Learning (AutoDL). In the AutoDL challenge series 2019, we organized 5 machine learning challenges: AutoCV, AutoCV2, AutoNLP, AutoSpeech and AutoDL. The first 4 challenges concern each a specific application domain, such as computer vision, natural language processing and speech recognition. At the time of March 2020, the last challenge AutoDL is still on-going and we only present its design. Some highlights of this work include: (1) a benchmark suite of baseline AutoML solutions, with emphasis on domains for which Deep Learning methods have had prior success (image, video, text, speech, etc); (2) a novel any-time learning framework, which opens doors for further theoretical consideration; (3) a repository of around 100 datasets (from all above domains) over half of which are released as public datasets to enable research on meta-learning; (4) analyses revealing that winning solutions generalize to new unseen datasets, validating progress towards universal AutoML solution; (5) open-sourcing of the challenge platform, the starting kit, the dataset formatting toolkit, and all winning solutions (All information available at {autodl.chalearn.org}).
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference NEURIPS
Notes HUPBA Approved no
Call Number Admin @ si @ LXR2020 Serial 3500
Permanent link to this record
 

 
Author Eduardo Aguilar; Petia Radeva
Title Uncertainty-aware integration of local and flat classifiers for food recognition Type Journal Article
Year 2020 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 136 Issue Pages (down) 237-243
Keywords
Abstract Food image recognition has recently attracted the attention of many researchers, due to the challenging problem it poses, the ease collection of food images, and its numerous applications to health and leisure. In real applications, it is necessary to analyze and recognize thousands of different foods. For this purpose, we propose a novel prediction scheme based on a class hierarchy that considers local classifiers, in addition to a flat classifier. In order to make a decision about which approach to use, we define different criteria that take into account both the analysis of the Epistemic Uncertainty estimated from the ‘children’ classifiers and the prediction from the ‘parent’ classifier. We evaluate our proposal using three Uncertainty estimation methods, tested on two public food datasets. The results show that the proposed method reduces parent-child error propagation in hierarchical schemes and improves classification results compared to the single flat classifier, meanwhile maintains good performance regardless the Uncertainty estimation method chosen.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB; no proj Approved no
Call Number Admin @ si @ AgR2020 Serial 3525
Permanent link to this record
 

 
Author Cristina Sanchez Montes; Jorge Bernal; Ana Garcia Rodriguez; Henry Cordova; Gloria Fernandez Esparrach
Title Revisión de métodos computacionales de detección y clasificación de pólipos en imagen de colonoscopia Type Journal Article
Year 2020 Publication Gastroenterología y Hepatología Abbreviated Journal GH
Volume 43 Issue 4 Pages (down) 222-232
Keywords
Abstract Computer-aided diagnosis (CAD) is a tool with great potential to help endoscopists in the tasks of detecting and histologically classifying colorectal polyps. In recent years, different technologies have been described and their potential utility has been increasingly evidenced, which has generated great expectations among scientific societies. However, most of these works are retrospective and use images of different quality and characteristics which are analysed off line. This review aims to familiarise gastroenterologists with computational methods and the particularities of endoscopic imaging, which have an impact on image processing analysis. Finally, the publicly available image databases, needed to compare and confirm the results obtained with different methods, are presented.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MV; Approved no
Call Number Admin @ si @ SBG2020 Serial 3404
Permanent link to this record
 

 
Author Manuel Carbonell; Alicia Fornes; Mauricio Villegas; Josep Llados
Title A Neural Model for Text Localization, Transcription and Named Entity Recognition in Full Pages Type Journal Article
Year 2020 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 136 Issue Pages (down) 219-227
Keywords
Abstract In the last years, the consolidation of deep neural network architectures for information extraction in document images has brought big improvements in the performance of each of the tasks involved in this process, consisting of text localization, transcription, and named entity recognition. However, this process is traditionally performed with separate methods for each task. In this work we propose an end-to-end model that combines a one stage object detection network with branches for the recognition of text and named entities respectively in a way that shared features can be learned simultaneously from the training error of each of the tasks. By doing so the model jointly performs handwritten text detection, transcription, and named entity recognition at page level with a single feed forward step. We exhaustively evaluate our approach on different datasets, discussing its advantages and limitations compared to sequential approaches. The results show that the model is capable of benefiting from shared features by simultaneously solving interdependent tasks.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.140; 601.311; 600.121 Approved no
Call Number Admin @ si @ CFV2020 Serial 3451
Permanent link to this record
 

 
Author Zhengying Liu; Zhen Xu; Sergio Escalera; Isabelle Guyon; Julio C. S. Jacques Junior; Meysam Madadi; Adrien Pavao; Sebastien Treguer; Wei-Wei Tu
Title Towards automated computer vision: analysis of the AutoCV challenges 2019 Type Journal Article
Year 2020 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 135 Issue Pages (down) 196-203
Keywords Computer vision; AutoML; Deep learning
Abstract We present the results of recent challenges in Automated Computer Vision (AutoCV, renamed here for clarity AutoCV1 and AutoCV2, 2019), which are part of a series of challenge on Automated Deep Learning (AutoDL). These two competitions aim at searching for fully automated solutions for classification tasks in computer vision, with an emphasis on any-time performance. The first competition was limited to image classification while the second one included both images and videos. Our design imposed to the participants to submit their code on a challenge platform for blind testing on five datasets, both for training and testing, without any human intervention whatsoever. Winning solutions adopted deep learning techniques based on already published architectures, such as AutoAugment, MobileNet and ResNet, to reach state-of-the-art performance in the time budget of the challenge (only 20 minutes of GPU time). The novel contributions include strategies to deliver good preliminary results at any time during the learning process, such that a method can be stopped early and still deliver good performance. This feature is key for the adoption of such techniques by data analysts desiring to obtain rapidly preliminary results on large datasets and to speed up the development process. The soundness of our design was verified in several aspects: (1) Little overfitting of the on-line leaderboard providing feedback on 5 development datasets was observed, compared to the final blind testing on the 5 (separate) final test datasets, suggesting that winning solutions might generalize to other computer vision classification tasks; (2) Error bars on the winners’ performance allow us to say with confident that they performed significantly better than the baseline solutions we provided; (3) The ranking of participants according to the any-time metric we designed, namely the Area under the Learning Curve, was different from that of the fixed-time metric, i.e. AUC at the end of the fixed time budget. We released all winning solutions under open-source licenses. At the end of the AutoDL challenge series, all data of the challenge will be made publicly available, thus providing a collection of uniformly formatted datasets, which can serve to conduct further research, particularly on meta-learning.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no proj Approved no
Call Number Admin @ si @ LXE2020 Serial 3427
Permanent link to this record
 

 
Author Shifeng Zhang; Ajian Liu; Jun Wan; Yanyan Liang; Guogong Guo; Sergio Escalera; Hugo Jair Escalante; Stan Z. Li
Title CASIA-SURF: A Dataset and Benchmark for Large-scale Multi-modal Face Anti-spoofing Type Journal
Year 2020 Publication IEEE Transactions on Biometrics, Behavior, and Identity Science Abbreviated Journal TTBIS
Volume 2 Issue 2 Pages (down) 182 - 193
Keywords
Abstract Face anti-spoofing is essential to prevent face recognition systems from a security breach. Much of the progresses have been made by the availability of face anti-spoofing benchmark datasets in recent years. However, existing face anti-spoofing benchmarks have limited number of subjects (≤170) and modalities (≤2), which hinder the further development of the academic community. To facilitate face anti-spoofing research, we introduce a large-scale multi-modal dataset, namely CASIA-SURF, which is the largest publicly available dataset for face anti-spoofing in terms of both subjects and modalities. Specifically, it consists of 1,000 subjects with 21,000 videos and each sample has 3 modalities ( i.e. , RGB, Depth and IR). We also provide comprehensive evaluation metrics, diverse evaluation protocols, training/validation/testing subsets and a measurement tool, developing a new benchmark for face anti-spoofing. Moreover, we present a novel multi-modal multi-scale fusion method as a strong baseline, which performs feature re-weighting to select the more informative channel features while suppressing the less useful ones for each modality across different scales. Extensive experiments have been conducted on the proposed dataset to verify its significance and generalization capability. The dataset is available at https://sites.google.com/qq.com/face-anti-spoofing/welcome/challengecvpr2019?authuser=0
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no proj Approved no
Call Number Admin @ si @ ZLW2020 Serial 3412
Permanent link to this record
 

 
Author Guillermo Torres; Debora Gil
Title A multi-shape loss function with adaptive class balancing for the segmentation of lung structures Type Journal Article
Year 2020 Publication International Journal of Computer Assisted Radiology and Surgery Abbreviated Journal IJCAR
Volume 15 Issue 1 Pages (down) S154-55
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes IAM Approved no
Call Number Admin @ si @ ToG2020 Serial 3590
Permanent link to this record
 

 
Author B. Gautam; Oriol Ramos Terrades; Joana Maria Pujadas-Mora; Miquel Valls-Figols
Title Knowledge graph based methods for record linkage Type Journal Article
Year 2020 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 136 Issue Pages (down) 127-133
Keywords
Abstract Nowadays, it is common in Historical Demography the use of individual-level data as a consequence of a predominant life-course approach for the understanding of the demographic behaviour, family transition, mobility, etc. Advanced record linkage is key since it allows increasing the data complexity and its volume to be analyzed. However, current methods are constrained to link data from the same kind of sources. Knowledge graph are flexible semantic representations, which allow to encode data variability and semantic relations in a structured manner.

In this paper we propose the use of knowledge graph methods to tackle record linkage tasks. The proposed method, named WERL, takes advantage of the main knowledge graph properties and learns embedding vectors to encode census information. These embeddings are properly weighted to maximize the record linkage performance. We have evaluated this method on benchmark data sets and we have compared it to related methods with stimulating and satisfactory results.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.140; 600.121 Approved no
Call Number Admin @ si @ GRP2020 Serial 3453
Permanent link to this record
 

 
Author Razieh Rastgoo; Kourosh Kiani; Sergio Escalera
Title Hand pose aware multimodal isolated sign language recognition Type Journal Article
Year 2020 Publication Multimedia Tools and Applications Abbreviated Journal MTAP
Volume 80 Issue Pages (down) 127–163
Keywords
Abstract Isolated hand sign language recognition from video is a challenging research area in computer vision. Some of the most important challenges in this area include dealing with hand occlusion, fast hand movement, illumination changes, or background complexity. While most of the state-of-the-art results in the field have been achieved using deep learning-based models, the previous challenges are not completely solved. In this paper, we propose a hand pose aware model for isolated hand sign language recognition using deep learning approaches from two input modalities, RGB and depth videos. Four spatial feature types: pixel-level, flow, deep hand, and hand pose features, fused from both visual modalities, are input to LSTM for temporal sign recognition. While we use Optical Flow (OF) for flow information in RGB video inputs, Scene Flow (SF) is used for depth video inputs. By including hand pose features, we show a consistent performance improvement of the sign language recognition model. To the best of our knowledge, this is the first time that this discriminant spatiotemporal features, benefiting from the hand pose estimation features and multi-modal inputs, are fused for isolated hand sign language recognition. We perform a step-by-step analysis of the impact in terms of recognition performance of the hand pose features, different combinations of the spatial features, and different recurrent models, especially LSTM and GRU. Results on four public datasets confirm that the proposed model outperforms the current state-of-the-art models on Montalbano II, MSR Daily Activity 3D, and CAD-60 datasets with a relative accuracy improvement of 1.64%, 6.5%, and 7.6%. Furthermore, our model obtains a competitive results on isoGD dataset with only 0.22% margin lower than the current state-of-the-art model.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA; no menciona Approved no
Call Number Admin @ si @ RKE2020 Serial 3524
Permanent link to this record
 

 
Author Pau Riba; Josep Llados; Alicia Fornes
Title Hierarchical graphs for coarse-to-fine error tolerant matching Type Journal Article
Year 2020 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 134 Issue Pages (down) 116-124
Keywords Hierarchical graph representation; Coarse-to-fine graph matching; Graph-based retrieval
Abstract During the last years, graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their ability to capture both structural and appearance-based information. Thus, they provide a greater representational power than classical statistical frameworks. However, graph-based representations leads to high computational complexities usually dealt by graph embeddings or approximated matching techniques. Despite their representational power, they are very sensitive to noise and small variations of the input image. With the aim to cope with the time complexity and the variability present in the generated graphs, in this paper we propose to construct a novel hierarchical graph representation. Graph clustering techniques adapted from social media analysis have been used in order to contract a graph at different abstraction levels while keeping information about the topology. Abstract nodes attributes summarise information about the contracted graph partition. For the proposed representations, a coarse-to-fine matching technique is defined. Hence, small graphs are used as a filtering before more accurate matching methods are applied. This approach has been validated in real scenarios such as classification of colour images or retrieval of handwritten words (i.e. word spotting).
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.097; 601.302; 603.057; 600.140; 600.121 Approved no
Call Number Admin @ si @ RLF2020 Serial 3349
Permanent link to this record
 

 
Author Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla
Title Thermal Image Super-resolution: A Novel Architecture and Dataset Type Conference Article
Year 2020 Publication 15th International Conference on Computer Vision Theory and Applications Abbreviated Journal
Volume Issue Pages (down) 111-119
Keywords
Abstract This paper proposes a novel CycleGAN architecture for thermal image super-resolution, together with a large dataset consisting of thermal images at different resolutions. The dataset has been acquired using three thermal cameras at different resolutions, which acquire images from the same scenario at the same time. The thermal cameras are mounted in rig trying to minimize the baseline distance to make easier the registration problem.
The proposed architecture is based on ResNet6 as a Generator and PatchGAN as Discriminator. The novelty on the proposed unsupervised super-resolution training (CycleGAN) is possible due to the existence of aforementioned thermal images—images of the same scenario with different resolutions. The proposed approach is evaluated in the dataset and compared with classical bicubic interpolation. The dataset and the network are available.
Address Valletta; Malta; February 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference VISAPP
Notes MSIAU; 600.130; 600.122 Approved no
Call Number Admin @ si @ RSV2020 Serial 3432
Permanent link to this record
 

 
Author Akhil Gurram; Onay Urfalioglu; Ibrahim Halfaoui; Fahd Bouzaraa; Antonio Lopez
Title Semantic Monocular Depth Estimation Based on Artificial Intelligence Type Journal Article
Year 2020 Publication IEEE Intelligent Transportation Systems Magazine Abbreviated Journal ITSM
Volume 13 Issue 4 Pages (down) 99-103
Keywords
Abstract Depth estimation provides essential information to perform autonomous driving and driver assistance. A promising line of work consists of introducing additional semantic information about the traffic scene when training CNNs for depth estimation. In practice, this means that the depth data used for CNN training is complemented with images having pixel-wise semantic labels where the same raw training data is associated with both types of ground truth, i.e., depth and semantic labels. The main contribution of this paper is to show that this hard constraint can be circumvented, i.e., that we can train CNNs for depth estimation by leveraging the depth and semantic information coming from heterogeneous datasets. In order to illustrate the benefits of our approach, we combine KITTI depth and Cityscapes semantic segmentation datasets, outperforming state-of-the-art results on monocular depth estimation.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.124; 600.118 Approved no
Call Number Admin @ si @ GUH2019 Serial 3306
Permanent link to this record
 

 
Author Jialuo Chen; M.A.Souibgui; Alicia Fornes; Beata Megyesi
Title A Web-based Interactive Transcription Tool for Encrypted Manuscripts Type Conference Article
Year 2020 Publication 3rd International Conference on Historical Cryptology Abbreviated Journal
Volume Issue Pages (down) 52-59
Keywords
Abstract Manual transcription of handwritten text is a time consuming task. In the case of encrypted manuscripts, the recognition is even more complex due to the huge variety of alphabets and symbol sets. To speed up and ease this process, we present a web-based tool aimed to (semi)-automatically transcribe the encrypted sources. The user uploads one or several images of the desired encrypted document(s) as input, and the system returns the transcription(s). This process is carried out in an interactive fashion with
the user to obtain more accurate results. For discovering and testing, the developed web tool is freely available.
Address Virtual; June 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference HistoCrypt
Notes DAG; 600.140; 602.230; 600.121 Approved no
Call Number Admin @ si @ CSF2020 Serial 3447
Permanent link to this record
 

 
Author Ajian Liu; Xuan Li; Jun Wan; Yanyan Liang; Sergio Escalera; Hugo Jair Escalante; Meysam Madadi; Yi Jin; Zhuoyuan Wu; Xiaogang Yu; Zichang Tan; Qi Yuan; Ruikun Yang; Benjia Zhou; Guodong Guo; Stan Z. Li
Title Cross-ethnicity Face Anti-spoofing Recognition Challenge: A Review Type Journal Article
Year 2020 Publication IET Biometrics Abbreviated Journal BIO
Volume 10 Issue 1 Pages (down) 24-43
Keywords
Abstract Face anti-spoofing is critical to prevent face recognition systems from a security breach. The biometrics community has %possessed achieved impressive progress recently due the excellent performance of deep neural networks and the availability of large datasets. Although ethnic bias has been verified to severely affect the performance of face recognition systems, it still remains an open research problem in face anti-spoofing. Recently, a multi-ethnic face anti-spoofing dataset, CASIA-SURF CeFA, has been released with the goal of measuring the ethnic bias. It is the largest up to date cross-ethnicity face anti-spoofing dataset covering 3 ethnicities, 3 modalities, 1,607 subjects, 2D plus 3D attack types, and the first dataset including explicit ethnic labels among the recently released datasets for face anti-spoofing. We organized the Chalearn Face Anti-spoofing Attack Detection Challenge which consists of single-modal (e.g., RGB) and multi-modal (e.g., RGB, Depth, Infrared (IR)) tracks around this novel resource to boost research aiming to alleviate the ethnic bias. Both tracks have attracted 340 teams in the development stage, and finally 11 and 8 teams have submitted their codes in the single-modal and multi-modal face anti-spoofing recognition challenges, respectively. All the results were verified and re-ran by the organizing team, and the results were used for the final ranking. This paper presents an overview of the challenge, including its design, evaluation protocol and a summary of results. We analyze the top ranked solutions and draw conclusions derived from the competition. In addition we outline future work directions.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA; no proj Approved no
Call Number Admin @ si @ LLW2020b Serial 3523
Permanent link to this record
 

 
Author Hassan Ahmed Sial; Ramon Baldrich; Maria Vanrell
Title Deep intrinsic decomposition trained on surreal scenes yet with realistic light effects Type Journal Article
Year 2020 Publication Journal of the Optical Society of America A Abbreviated Journal JOSA A
Volume 37 Issue 1 Pages (down) 1-15
Keywords
Abstract Estimation of intrinsic images still remains a challenging task due to weaknesses of ground-truth datasets, which either are too small or present non-realistic issues. On the other hand, end-to-end deep learning architectures start to achieve interesting results that we believe could be improved if important physical hints were not ignored. In this work, we present a twofold framework: (a) a flexible generation of images overcoming some classical dataset problems such as larger size jointly with coherent lighting appearance; and (b) a flexible architecture tying physical properties through intrinsic losses. Our proposal is versatile, presents low computation time, and achieves state-of-the-art results.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes CIC; 600.140; 600.12; 600.118 Approved no
Call Number Admin @ si @ SBV2019 Serial 3311
Permanent link to this record