|   | 
Details
   web
Records
Author Isabelle Guyon; Lisheng Sun Hosoya; Marc Boulle; Hugo Jair Escalante; Sergio Escalera; Zhengying Liu; Damir Jajetic; Bisakha Ray; Mehreen Saeed; Michele Sebag; Alexander R.Statnikov; Wei-Wei Tu; Evelyne Viegas
Title Analysis of the AutoML Challenge Series 2015-2018. Type Book Chapter
Year 2019 Publication Automated Machine Learning Abbreviated Journal
Volume Issue Pages 177-219
Keywords
Abstract The ChaLearn AutoML Challenge (The authors are in alphabetical order of last name, except the first author who did most of the writing and the second author who produced most of the numerical analyses and plots.) (NIPS 2015 – ICML 2016) consisted of six rounds of a machine learning competition of progressive difficulty, subject to limited computational resources. It was followed bya one-round AutoML challenge (PAKDD 2018). The AutoML setting differs from former model selection/hyper-parameter selection challenges, such as the one we previously organized for NIPS 2006: the participants aim to develop fully automated and computationally efficient systems, capable of being trained and tested without human intervention, with code submission. This chapter analyzes the results of these competitions and provides details about the datasets, which were not revealed to the participants. The solutions of the winners are systematically benchmarked over all datasets of all rounds and compared with canonical machine learning algorithms available in scikit-learn. All materials discussed in this chapter (data and code) have been made publicly available at http://automl.chalearn.org/.
Address (down)
Corporate Author Thesis
Publisher Springer Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title SSCML
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no proj Approved no
Call Number Admin @ si @ GHB2019 Serial 3330
Permanent link to this record
 

 
Author Dena Bazazian; Raul Gomez; Anguelos Nicolaou; Lluis Gomez; Dimosthenis Karatzas; Andrew Bagdanov
Title Fast: Facilitated and accurate scene text proposals through fcn guided pruning Type Journal Article
Year 2019 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 119 Issue Pages 112-120
Keywords
Abstract Class-specific text proposal algorithms can efficiently reduce the search space for possible text object locations in an image. In this paper we combine the Text Proposals algorithm with Fully Convolutional Networks to efficiently reduce the number of proposals while maintaining the same recall level and thus gaining a significant speed up. Our experiments demonstrate that such text proposal approaches yield significantly higher recall rates than state-of-the-art text localization techniques, while also producing better-quality localizations. Our results on the ICDAR 2015 Robust Reading Competition (Challenge 4) and the COCO-text datasets show that, when combined with strong word classifiers, this recall margin leads to state-of-the-art results in end-to-end scene text recognition.
Address (down)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.084; 600.121; 600.129 Approved no
Call Number Admin @ si @ BGN2019 Serial 3342
Permanent link to this record
 

 
Author Lei Kang; Pau Riba; Mauricio Villegas; Alicia Fornes; Marçal Rusiñol
Title Candidate Fusion: Integrating Language Modelling into a Sequence-to-Sequence Handwritten Word Recognition Architecture Type Journal Article
Year 2021 Publication Pattern Recognition Abbreviated Journal PR
Volume 112 Issue Pages 107790
Keywords
Abstract Sequence-to-sequence models have recently become very popular for tackling
handwritten word recognition problems. However, how to effectively integrate an external language model into such recognizer is still a challenging
problem. The main challenge faced when training a language model is to
deal with the language model corpus which is usually different to the one
used for training the handwritten word recognition system. Thus, the bias
between both word corpora leads to incorrectness on the transcriptions, providing similar or even worse performances on the recognition task. In this
work, we introduce Candidate Fusion, a novel way to integrate an external
language model to a sequence-to-sequence architecture. Moreover, it provides suggestions from an external language knowledge, as a new input to
the sequence-to-sequence recognizer. Hence, Candidate Fusion provides two
improvements. On the one hand, the sequence-to-sequence recognizer has
the flexibility not only to combine the information from itself and the language model, but also to choose the importance of the information provided
by the language model. On the other hand, the external language model
has the ability to adapt itself to the training corpus and even learn the
most commonly errors produced from the recognizer. Finally, by conducting
comprehensive experiments, the Candidate Fusion proves to outperform the
state-of-the-art language models for handwritten word recognition tasks.
Address (down)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.140; 601.302; 601.312; 600.121 Approved no
Call Number Admin @ si @ KRV2021 Serial 3343
Permanent link to this record
 

 
Author Fei Yang; Yongmei Cheng; Joost Van de Weijer; Mikhail Mozerov
Title Improved Discrete Optical Flow Estimation With Triple Image Matching Cost Type Journal Article
Year 2020 Publication IEEE Access Abbreviated Journal ACCESS
Volume 8 Issue Pages 17093 - 17102
Keywords
Abstract Approaches that use more than two consecutive video frames in the optical flow estimation have a long research history. However, almost all such methods utilize extra information for a pre-processing flow prediction or for a post-processing flow correction and filtering. In contrast, this paper differs from previously developed techniques. We propose a new algorithm for the likelihood function calculation (alternatively the matching cost volume) that is used in the maximum a posteriori estimation. We exploit the fact that in general, optical flow is locally constant in the sense of time and the likelihood function depends on both the previous and the future frame. Implementation of our idea increases the robustness of optical flow estimation. As a result, our method outperforms 9% over the DCFlow technique, which we use as prototype for our CNN based computation architecture, on the most challenging MPI-Sintel dataset for the non-occluded mask metric. Furthermore, our approach considerably increases the accuracy of the flow estimation for the matching cost processing, consequently outperforming the original DCFlow algorithm results up to 50% in occluded regions and up to 9% in non-occluded regions on the MPI-Sintel dataset. The experimental section shows that the proposed method achieves state-of-the-arts results especially on the MPI-Sintel dataset.
Address (down)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ YCW2020 Serial 3345
Permanent link to this record
 

 
Author Fei Yang; Luis Herranz; Joost Van de Weijer; Jose Antonio Iglesias; Antonio Lopez; Mikhail Mozerov
Title Variable Rate Deep Image Compression with Modulated Autoencoder Type Journal Article
Year 2020 Publication IEEE Signal Processing Letters Abbreviated Journal SPL
Volume 27 Issue Pages 331-335
Keywords
Abstract Variable rate is a requirement for flexible and adaptable image and video compression. However, deep image compression methods (DIC) are optimized for a single fixed rate-distortion (R-D) tradeoff. While this can be addressed by training multiple models for different tradeoffs, the memory requirements increase proportionally to the number of models. Scaling the bottleneck representation of a shared autoencoder can provide variable rate compression with a single shared autoencoder. However, the R-D performance using this simple mechanism degrades in low bitrates, and also shrinks the effective range of bitrates. To address these limitations, we formulate the problem of variable R-D optimization for DIC, and propose modulated autoencoders (MAEs), where the representations of a shared autoencoder are adapted to the specific R-D tradeoff via a modulation network. Jointly training this modulated autoencoder and the modulation network provides an effective way to navigate the R-D operational curve. Our experiments show that the proposed method can achieve almost the same R-D performance of independent models with significantly fewer parameters.
Address (down)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; ADAS; 600.141; 600.120; 600.118 Approved no
Call Number Admin @ si @ YHW2020 Serial 3346
Permanent link to this record
 

 
Author Beata Megyesi; Bernhard Esslinger; Alicia Fornes; Nils Kopal; Benedek Lang; George Lasry; Karl de Leeuw; Eva Pettersson; Arno Wacker; Michelle Waldispuhl
Title Decryption of historical manuscripts: the DECRYPT project Type Journal Article
Year 2020 Publication Cryptologia Abbreviated Journal CRYPT
Volume 44 Issue 6 Pages 545-559
Keywords automatic decryption; cipher collection; historical cryptology; image transcription
Abstract Many historians and linguists are working individually and in an uncoordinated fashion on the identification and decryption of historical ciphers. This is a time-consuming process as they often work without access to automatic methods and processes that can accelerate the decipherment. At the same time, computer scientists and cryptologists are developing algorithms to decrypt various cipher types without having access to a large number of original ciphertexts. In this paper, we describe the DECRYPT project aiming at the creation of resources and tools for historical cryptology by bringing the expertise of various disciplines together for collecting data, exchanging methods for faster progress to transcribe, decrypt and contextualize historical encrypted manuscripts. We present our goals and work-in progress of a general approach for analyzing historical encrypted manuscripts using standardized methods and a new set of state-of-the-art tools. We release the data and tools as open-source hoping that all mentioned disciplines would benefit and contribute to the research infrastructure of historical cryptology.
Address (down)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.140; 600.121 Approved no
Call Number Admin @ si @ MEF2020 Serial 3347
Permanent link to this record
 

 
Author Anjan Dutta; Pau Riba; Josep Llados; Alicia Fornes
Title Hierarchical Stochastic Graphlet Embedding for Graph-based Pattern Recognition Type Journal Article
Year 2020 Publication Neural Computing and Applications Abbreviated Journal NEUCOMA
Volume 32 Issue Pages 11579–11596
Keywords
Abstract Despite being very successful within the pattern recognition and machine learning community, graph-based methods are often unusable because of the lack of mathematical operations defined in graph domain. Graph embedding, which maps graphs to a vectorial space, has been proposed as a way to tackle these difficulties enabling the use of standard machine learning techniques. However, it is well known that graph embedding functions usually suffer from the loss of structural information. In this paper, we consider the hierarchical structure of a graph as a way to mitigate this loss of information. The hierarchical structure is constructed by topologically clustering the graph nodes and considering each cluster as a node in the upper hierarchical level. Once this hierarchical structure is constructed, we consider several configurations to define the mapping into a vector space given a classical graph embedding, in particular, we propose to make use of the stochastic graphlet embedding (SGE). Broadly speaking, SGE produces a distribution of uniformly sampled low-to-high-order graphlets as a way to embed graphs into the vector space. In what follows, the coarse-to-fine structure of a graph hierarchy and the statistics fetched by the SGE complements each other and includes important structural information with varied contexts. Altogether, these two techniques substantially cope with the usual information loss involved in graph embedding techniques, obtaining a more robust graph representation. This fact has been corroborated through a detailed experimental evaluation on various benchmark graph datasets, where we outperform the state-of-the-art methods.
Address (down)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.140; 600.121; 600.141 Approved no
Call Number Admin @ si @ DRL2020 Serial 3348
Permanent link to this record
 

 
Author Pau Riba; Josep Llados; Alicia Fornes
Title Hierarchical graphs for coarse-to-fine error tolerant matching Type Journal Article
Year 2020 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 134 Issue Pages 116-124
Keywords Hierarchical graph representation; Coarse-to-fine graph matching; Graph-based retrieval
Abstract During the last years, graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their ability to capture both structural and appearance-based information. Thus, they provide a greater representational power than classical statistical frameworks. However, graph-based representations leads to high computational complexities usually dealt by graph embeddings or approximated matching techniques. Despite their representational power, they are very sensitive to noise and small variations of the input image. With the aim to cope with the time complexity and the variability present in the generated graphs, in this paper we propose to construct a novel hierarchical graph representation. Graph clustering techniques adapted from social media analysis have been used in order to contract a graph at different abstraction levels while keeping information about the topology. Abstract nodes attributes summarise information about the contracted graph partition. For the proposed representations, a coarse-to-fine matching technique is defined. Hence, small graphs are used as a filtering before more accurate matching methods are applied. This approach has been validated in real scenarios such as classification of colour images or retrieval of handwritten words (i.e. word spotting).
Address (down)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.097; 601.302; 603.057; 600.140; 600.121 Approved no
Call Number Admin @ si @ RLF2020 Serial 3349
Permanent link to this record
 

 
Author Alicia Fornes; Josep Llados; Joana Maria Pujadas-Mora
Title Browsing of the Social Network of the Past: Information Extraction from Population Manuscript Images Type Book Chapter
Year 2020 Publication Handwritten Historical Document Analysis, Recognition, and Retrieval – State of the Art and Future Trends Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address (down)
Corporate Author Thesis
Publisher World Scientific Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-981-120-323-7 Medium
Area Expedition Conference
Notes DAG; 600.140; 600.121 Approved no
Call Number Admin @ si @ FLP2020 Serial 3350
Permanent link to this record
 

 
Author Joana Maria Pujadas-Mora; Alicia Fornes; Josep Llados; Gabriel Brea-Martinez; Miquel Valls-Figols
Title The Baix Llobregat (BALL) Demographic Database, between Historical Demography and Computer Vision (nineteenth–twentieth centuries Type Book Chapter
Year 2019 Publication Nominative Data in Demographic Research in the East and the West: monograph Abbreviated Journal
Volume Issue Pages 29-61
Keywords
Abstract The Baix Llobregat (BALL) Demographic Database is an ongoing database project containing individual census data from the Catalan region of Baix Llobregat (Spain) during the nineteenth and twentieth centuries. The BALL Database is built within the project ‘NETWORKS: Technology and citizen innovation for building historical social networks to understand the demographic past’ directed by Alícia Fornés from the Center for Computer Vision and Joana Maria Pujadas-Mora from the Center for Demographic Studies, both at the Universitat Autònoma de Barcelona, funded by the Recercaixa program (2017–2019).
Its webpage is http://dag.cvc.uab.es/xarxes/.The aim of the project is to develop technologies facilitating massive digitalization of demographic sources, and more specifically the padrones (local censuses), in order to reconstruct historical ‘social’ networks employing computer vision technology. Such virtual networks can be created thanks to the linkage of nominative records compiled in the local censuses across time and space. Thus, digitized versions of individual and family lifespans are established, and individuals and families can be located spatially.
Address (down)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-5-7996-2656-3 Medium
Area Expedition Conference
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ PFL2019 Serial 3351
Permanent link to this record
 

 
Author Debora Gil; Carles Sanchez; Agnes Borras; Marta Diez-Ferrer; Antoni Rosell
Title Segmentation of Distal Airways using Structural Analysis Type Journal Article
Year 2019 Publication PloS one Abbreviated Journal Plos
Volume 14 Issue 12 Pages
Keywords
Abstract Segmentation of airways in Computed Tomography (CT) scans is a must for accurate support of diagnosis and intervention of many pulmonary disorders. In particular, lung cancer diagnosis would benefit from segmentations reaching most distal airways. We present a method that combines descriptors of bronchi local appearance and graph global structural analysis to fine-tune thresholds on the descriptors adapted for each bronchial level. We have compared our method to the top performers of the EXACT09 challenge and to a commercial software for biopsy planning evaluated in an own-collected data-base of high resolution CT scans acquired under different breathing conditions. Results on EXACT09 data show that our method provides a high leakage reduction with minimum loss in airway detection. Results on our data-base show the reliability across varying breathing conditions and a competitive performance for biopsy planning compared to a commercial solution.
Address (down)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes IAM; 600.139; 600.145 Approved no
Call Number Admin @ si @ GSB2019 Serial 3357
Permanent link to this record
 

 
Author Rada Deeb; Joost Van de Weijer; Damien Muselet; Mathieu Hebert; Alain Tremeau
Title Deep spectral reflectance and illuminant estimation from self-interreflections Type Journal Article
Year 2019 Publication Journal of the Optical Society of America A Abbreviated Journal JOSA A
Volume 31 Issue 1 Pages 105-114
Keywords
Abstract In this work, we propose a convolutional neural network based approach to estimate the spectral reflectance of a surface and spectral power distribution of light from a single RGB image of a V-shaped surface. Interreflections happening in a concave surface lead to gradients of RGB values over its area. These gradients carry a lot of information concerning the physical properties of the surface and the illuminant. Our network is trained with only simulated data constructed using a physics-based interreflection model. Coupling interreflection effects with deep learning helps to retrieve the spectral reflectance under an unknown light and to estimate spectral power distribution of this light as well. In addition, it is more robust to the presence of image noise than classical approaches. Our results show that the proposed approach outperforms state-of-the-art learning-based approaches on simulated data. In addition, it gives better results on real data compared to other interreflection-based approaches.
Address (down)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ DWM2019 Serial 3362
Permanent link to this record
 

 
Author Arka Ujjal Dey; Suman Ghosh; Ernest Valveny; Gaurav Harit
Title Beyond Visual Semantics: Exploring the Role of Scene Text in Image Understanding Type Journal Article
Year 2021 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 149 Issue Pages 164-171
Keywords
Abstract Images with visual and scene text content are ubiquitous in everyday life. However, current image interpretation systems are mostly limited to using only the visual features, neglecting to leverage the scene text content. In this paper, we propose to jointly use scene text and visual channels for robust semantic interpretation of images. We do not only extract and encode visual and scene text cues, but also model their interplay to generate a contextual joint embedding with richer semantics. The contextual embedding thus generated is applied to retrieval and classification tasks on multimedia images, with scene text content, to demonstrate its effectiveness. In the retrieval framework, we augment our learned text-visual semantic representation with scene text cues, to mitigate vocabulary misses that may have occurred during the semantic embedding. To deal with irrelevant or erroneous recognition of scene text, we also apply query-based attention to our text channel. We show how the multi-channel approach, involving visual semantics and scene text, improves upon state of the art.
Address (down)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ DGV2021 Serial 3364
Permanent link to this record
 

 
Author Estefania Talavera; Alexandre Cola; Nicolai Petkov; Petia Radeva
Title Towards Egocentric Person Re-identification and Social Pattern Analysis. Type Book Chapter
Year 2019 Publication Frontiers in Artificial Intelligence and Applications Abbreviated Journal
Volume 310 Issue Pages 203 - 211
Keywords
Abstract CoRR abs/1905.04073
Wearable cameras capture a first-person view of the daily activities of the camera wearer, offering a visual diary of the user behaviour. Detection of the appearance of people the camera user interacts with for social interactions analysis is of high interest. Generally speaking, social events, lifestyle and health are highly correlated, but there is a lack of tools to monitor and analyse them. We consider that egocentric vision provides a tool to obtain information and understand users social interactions. We propose a model that enables us to evaluate and visualize social traits obtained by analysing social interactions appearance within egocentric photostreams. Given sets of egocentric images, we detect the appearance of faces within the days of the camera wearer, and rely on clustering algorithms to group their feature descriptors in order to re-identify persons. Recurrence of detected faces within photostreams allows us to shape an idea of the social pattern of behaviour of the user. We validated our model over several weeks recorded by different camera wearers. Our findings indicate that social profiles are potentially useful for social behaviour interpretation.
Address (down)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB; no proj Approved no
Call Number Admin @ si @ TCP2019 Serial 3377
Permanent link to this record
 

 
Author Estefania Talavera; Maria Leyva-Vallina; Md. Mostafa Kamal Sarker; Domenec Puig; Nicolai Petkov; Petia Radeva
Title Hierarchical approach to classify food scenes in egocentric photo-streams Type Journal Article
Year 2020 Publication IEEE Journal of Biomedical and Health Informatics Abbreviated Journal J-BHI
Volume 24 Issue 3 Pages 866 - 877
Keywords
Abstract Recent studies have shown that the environment where people eat can affect their nutritional behaviour. In this work, we provide automatic tools for a personalised analysis of a person's health habits by the examination of daily recorded egocentric photo-streams. Specifically, we propose a new automatic approach for the classification of food-related environments, that is able to classify up to 15 such scenes. In this way, people can monitor the context around their food intake in order to get an objective insight into their daily eating routine. We propose a model that classifies food-related scenes organized in a semantic hierarchy. Additionally, we present and make available a new egocentric dataset composed of more than 33000 images recorded by a wearable camera, over which our proposed model has been tested. Our approach obtains an accuracy and F-score of 56\% and 65\%, respectively, clearly outperforming the baseline methods.
Address (down)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB; no proj Approved no
Call Number Admin @ si @ TLM2020 Serial 3380
Permanent link to this record