Author Hugo Prol; Vincent Dumoulin; Luis Herranz
Title Cross-Modulation Networks for Few-Shot Learning Type Miscellaneous
Year 2018 Publication Arxiv Abbreviated Journal
Volume Issue Pages
Keywords
Abstract A family of recent successful approaches to few-shot learning relies on learning an embedding space in which predictions are made by computing similarities between examples. This corresponds to combining information between support and query examples at a very late stage of the prediction pipeline. Inspired by this observation, we hypothesize that there may be benefits to combining the information at various levels of abstraction along the pipeline. We present an architecture called Cross-Modulation Networks which allows support and query examples to interact throughout the feature extraction process via a feature-wise modulation mechanism. We adapt the Matching Networks architecture to take advantage of these interactions and show encouraging initial results on miniImageNet in the 5-way, 1-shot setting, where we close the gap with state-of-the-art.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ PDH2018 Serial 3248
 

 
Author Luis Herranz; Weiqing Min; Shuqiang Jiang
Title Food recognition and recipe analysis: integrating visual content, context and external knowledge Type Miscellaneous
Year 2018 Publication Arxiv Abbreviated Journal
Volume Issue Pages
Keywords
Abstract The central role of food in our individual and social life, combined with recent technological advances, has motivated a growing interest in applications that help to better monitor dietary habits as well as the exploration and retrieval of food-related information. We review how visual content, context and external knowledge can be integrated effectively into food-oriented applications, with special focus on recipe analysis and retrieval, food recommendation and restaurant context as emerging directions.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ HMJ2018 Serial 3250
 

 
Author Santi Puch; Irina Sanchez; Aura Hernandez-Sabate; Gemma Piella; Vesna Prckovska
Title Global Planar Convolutions for Improved Context Aggregation in Brain Tumor Segmentation Type Conference Article
Year 2018 Publication International MICCAI Brainlesion Workshop Abbreviated Journal
Volume 11384 Issue Pages 393-405
Keywords Brain tumors; 3D fully-convolutional CNN; Magnetic resonance imaging; Global planar convolution
Abstract In this work, we introduce the Global Planar Convolution module as a building block for fully-convolutional networks that aggregates global information and, therefore, enhances the context perception capabilities of segmentation networks in the context of brain tumor segmentation. We implement two baseline architectures (3D UNet and a residual version of 3D UNet, ResUNet) and present a novel architecture based on these two architectures, ContextNet, that includes the proposed Global Planar Convolution module. We show that the addition of such a module eliminates the need to build networks with several representation levels, which tend to be over-parametrized and to exhibit slow rates of convergence. Furthermore, we provide a visual demonstration of the behavior of GPC modules via visualization of intermediate representations. We finally participate in the 2018 edition of the BraTS challenge with our best performing models, which are based on ContextNet, and report the evaluation scores on the validation and the test sets of the challenge.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference MICCAIW
Notes ADAS; 600.118 Approved no
Call Number Admin @ si @ PSH2018 Serial 3251
 

 
Author Spyridon Bakas; Mauricio Reyes; Andras Jakab; Stefan Bauer; Markus Rempfler; Alessandro Crimi; Russell Takeshi Shinohara; Christoph Berger; Sung Min Ha; Martin Rozycki; Marcel Prastawa; Esther Alberts; Jana Lipkova; John Freymann; Justin Kirby; Michel Bilello; Hassan Fathallah-Shaykh; Roland Wiest; Jan Kirschke; Benedikt Wiestler; Rivka Colen; Aikaterini Kotrotsou; Pamela Lamontagne; Daniel Marcus; Mikhail Milchenko; Arash Nazeri; Marc-Andre Weber; Abhishek Mahajan; Ujjwal Baid; Dongjin Kwon; Manu Agarwal; Mahbubul Alam; Alberto Albiol; Antonio Albiol; Varghese Alex; Tuan Anh Tran; Tal Arbel; Aaron Avery; Subhashis Banerjee; Thomas Batchelder; Kayhan Batmanghelich; Enzo Battistella; Martin Bendszus; Eze Benson; Jose Bernal; George Biros; Mariano Cabezas; Siddhartha Chandra; Yi-Ju Chang; Joseph Chazalon; Shengcong Chen; Wei Chen; Jefferson Chen; Kun Cheng; Meinel Christoph; Roger Chylla; Albert Clérigues; Anthony Costa; Xiaomeng Cui; Zhenzhen Dai; Lutao Dai; Eric Deutsch; Changxing Ding; Chao Dong; Wojciech Dudzik; Theo Estienne; Hyung Eun Shin; Richard Everson; Jonathan Fabrizio; Longwei Fang; Xue Feng; Lucas Fidon; Naomi Fridman; Huan Fu; David Fuentes; David G Gering; Yaozong Gao; Evan Gates; Amir Gholami; Mingming Gong; Sandra Gonzalez-Villa; J Gregory Pauloski; Yuanfang Guan; Sheng Guo; Sudeep Gupta; Meenakshi H Thakur; Klaus H Maier-Hein; Woo-Sup Han; Huiguang He; Aura Hernandez-Sabate; Evelyn Herrmann; Naveen Himthani; Winston Hsu; Cheyu Hsu; Xiaojun Hu; Xiaobin Hu; Yan Hu; Yifan Hu; Rui Hua
Title Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the BRATS challenge Type Miscellaneous
Year 2018 Publication Arxiv Abbreviated Journal
Volume Issue Pages
Keywords BraTS; challenge; brain; tumor; segmentation; machine learning; glioma; glioblastoma; radiomics; survival; progression; RECIST
Abstract Gliomas are the most common primary brain malignancies, with different degrees of aggressiveness, variable prognosis and various heterogeneous histologic sub-regions, i.e., peritumoral edematous/invaded tissue, necrotic core, active and non-enhancing core. This intrinsic heterogeneity is also portrayed in their radio-phenotype, as their sub-regions are depicted by varying intensity profiles disseminated across multiparametric magnetic resonance imaging (mpMRI) scans, reflecting varying biological properties. Their heterogeneous shape, extent, and location are some of the factors that make these tumors difficult to resect, and in some cases inoperable. The amount of resected tumor is a factor also considered in longitudinal scans, when evaluating the apparent tumor for potential diagnosis of progression. Furthermore, there is mounting evidence that accurate segmentation of the various tumor sub-regions can offer the basis for quantitative image analysis towards prediction of patient overall survival. This study assesses the state-of-the-art machine learning (ML) methods used for brain tumor image analysis in mpMRI scans, during the last seven instances of the International Brain Tumor Segmentation (BraTS) challenge, i.e. 2012-2018. Specifically, we focus on i) evaluating segmentations of the various glioma sub-regions in preoperative mpMRI scans, ii) assessing potential tumor progression by virtue of longitudinal growth of tumor sub-regions, beyond use of the RECIST criteria, and iii) predicting the overall survival from pre-operative mpMRI scans of patients that underwent gross total resection. Finally, we investigate the challenge of identifying the best ML algorithms for each of these tasks, considering that apart from being diverse on each instance of the challenge, the multi-institutional mpMRI BraTS dataset has also been a continuously evolving/growing dataset.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.118 Approved no
Call Number Admin @ si @ BRJ2018 Serial 3252
 

 
Author Francisco Cruz; Oriol Ramos Terrades
Title A probabilistic framework for handwritten text line segmentation Type Miscellaneous
Year 2018 Publication Arxiv Abbreviated Journal
Volume Issue Pages
Keywords Document Analysis; Text Line Segmentation; EM algorithm; Probabilistic Graphical Models; Parameter Learning
Abstract We successfully combine the Expectation-Maximization algorithm and variational approaches for parameter learning and computing inference on Markov random fields. This is a general method that can be applied to many computer vision tasks. In this paper, we apply it to handwritten text line segmentation. We conduct several experiments that demonstrate that our method deals with common issues of this task, such as complex document layout or non-Latin scripts. The obtained results prove that our method achieves state-of-the-art performance on different benchmark datasets without any particular fine-tuning step.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.097; 600.121 Approved no
Call Number Admin @ si @ CrR2018 Serial 3253
 

 
Author Thanh Ha Do; Oriol Ramos Terrades; Salvatore Tabbone
Title DSD: document sparse-based denoising algorithm Type Journal Article
Year 2019 Publication Pattern Analysis and Applications Abbreviated Journal PAA
Volume 22 Issue 1 Pages 177–186
Keywords Document denoising; Sparse representations; Sparse dictionary learning; Document degradation models
Abstract In this paper, we present a sparse-based denoising algorithm for scanned documents. This method can be applied to any kind of scanned document with satisfactory results. Unlike other approaches, the proposed approach encodes noisy documents through sparse representation and visual dictionary learning techniques without any prior noise model. Moreover, we propose a precision parameter estimator. Experiments on several datasets demonstrate the robustness of the proposed approach compared to the state-of-the-art methods on document denoising.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.097; 600.140; 600.121 Approved no
Call Number Admin @ si @ DRT2019 Serial 3254
 

 
Author Cesar de Souza; Adrien Gaidon; Eleonora Vig; Antonio Lopez
Title System and method for video classification using a hybrid unsupervised and supervised multi-layer architecture Type Patent
Year 2018 Publication US9946933B2 Abbreviated Journal
Volume Issue Pages
Keywords US9946933B2
Abstract A computer-implemented video classification method and system are disclosed. The method includes receiving an input video including a sequence of frames. At least one transformation of the input video is generated, each transformation including a sequence of frames. For the input video and each transformation, local descriptors are extracted from the respective sequence of frames. The local descriptors of the input video and each transformation are aggregated to form an aggregated feature vector with a first set of processing layers learned using unsupervised learning. An output classification value is generated for the input video, based on the aggregated feature vector with a second set of processing layers learned using supervised learning.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.118 Approved no
Call Number Admin @ si @ SGV2018 Serial 3255
 

 
Author W.Win; B.Bao; Q.Xu; Luis Herranz; Shuqiang Jiang
Title Editorial Note: Efficient Multimedia Processing Methods and Applications Type Miscellaneous
Year 2019 Publication Multimedia Tools and Applications Abbreviated Journal MTAP
Volume 78 Issue 1 Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.141; 600.120 Approved no
Call Number Admin @ si @ WBX2019 Serial 3257
 

 
Author Sounak Dey; Palaiahnakote Shivakumara; K.S. Raghunanda; Umapada Pal; Tong Lu; G. Hemantha Kumar; Chee Seng Chan
Title Script independent approach for multi-oriented text detection in scene image Type Journal Article
Year 2017 Publication Neurocomputing Abbreviated Journal NEUCOM
Volume 242 Issue Pages 96-112
Keywords
Abstract Developing a text detection method which is invariant to scripts in natural scene images is a challenging task due to the different geometrical structures of various scripts. Besides, multi-oriented text lines in natural scene images make the problem more challenging. This paper proposes to explore ring radius transform (RRT) for text detection in multi-oriented and multi-script environments. The method finds component regions based on convex hull to generate radius matrices using RRT. It is a fact that RRT provides low radius values for the pixels that are near to edges, constant radius values for the pixels that represent stroke width, and high radius values that represent holes created in background and convex hull because of the regular structures of text components. We apply k-means clustering on the radius matrices to group such spatially coherent regions into individual clusters. Then the proposed method studies the radius values of such cluster components that are close to the centroid and far from the centroid to detect text components. Furthermore, we have developed a Bangla dataset (named the ISI-UM dataset) and propose a semi-automatic system for generating its ground truth for text detection of arbitrary orientations, which can be used by researchers for text detection and recognition in the future. The ground truth will be released to the public. Experimental results on our ISI-UM data and other standard datasets, namely, ICDAR 2013 scene, SVT and MSRA data, show that the proposed method outperforms the existing methods in terms of multi-lingual and multi-oriented text detection ability.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ DSR2017 Serial 3260
 

 
Author Mikhail Mozerov; Fei Yang; Joost Van de Weijer
Title Sparse Data Interpolation Using the Geodesic Distance Affinity Space Type Journal Article
Year 2019 Publication IEEE Signal Processing Letters Abbreviated Journal SPL
Volume 26 Issue 6 Pages 943-947
Keywords
Abstract In this letter, we adapt the geodesic distance-based recursive filter to the sparse data interpolation problem. The proposed technique is general and can be easily applied to any kind of sparse data. We demonstrate its superiority over other interpolation techniques in three experiments for qualitative and quantitative evaluation. In addition, we compare our method with the popular interpolation algorithm presented in the paper on EpicFlow optical flow, which is intuitively motivated by a similar geodesic distance principle. The comparison shows that our algorithm is more accurate and considerably faster than the EpicFlow interpolation technique.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ MYW2019 Serial 3261
 

 
Author Carola Figueroa Flores; Abel Gonzalez-Garcia; Joost Van de Weijer; Bogdan Raducanu
Title Saliency for fine-grained object recognition in domains with scarce training data Type Journal Article
Year 2019 Publication Pattern Recognition Abbreviated Journal PR
Volume 94 Issue Pages 62-73
Keywords
Abstract This paper investigates the role of saliency to improve the classification accuracy of a Convolutional Neural Network (CNN) for the case when scarce training data is available. Our approach consists in adding a saliency branch to an existing CNN architecture which is used to modulate the standard bottom-up visual features from the original image input, acting as an attentional mechanism that guides the feature extraction process. The main aim of the proposed approach is to enable the effective training of a fine-grained recognition model with limited training samples and to improve the performance on the task, thereby alleviating the need to annotate a large dataset. The vast majority of saliency methods are evaluated on their ability to generate saliency maps, and not on their functionality in a complete vision pipeline. Our proposed pipeline allows saliency methods to be evaluated on the high-level task of object recognition. We perform extensive experiments on various fine-grained datasets (Flowers, Birds, Cars, and Dogs) under different conditions and show that saliency can considerably improve the network’s performance, especially for the case of scarce training data. Furthermore, our experiments show that saliency methods that obtain improved saliency maps (as measured by traditional saliency benchmarks) also yield improved performance gains when applied in an object recognition pipeline.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.109; 600.141; 600.120 Approved no
Call Number Admin @ si @ FGW2019 Serial 3264
 

 
Author Raul Gomez; Lluis Gomez; Jaume Gibert; Dimosthenis Karatzas
Title Self-Supervised Learning from Web Data for Multimodal Retrieval Type Book Chapter
Year 2019 Publication Multi-Modal Scene Understanding Book Abbreviated Journal
Volume Issue Pages 279-306
Keywords self-supervised learning; webly supervised learning; text embeddings; multimodal retrieval; multimodal embedding
Abstract Self-supervised learning from multimodal image and text data allows deep neural networks to learn powerful features with no need for human-annotated data. Web and Social Media platforms provide a virtually unlimited amount of this multimodal data. In this work we propose to exploit this freely available data to learn a multimodal image and text embedding, aiming to leverage the semantic knowledge learnt in the text domain and transfer it to a visual model for semantic image retrieval. We demonstrate that the proposed pipeline can learn from images with associated text without supervision and analyze the semantic structure of the learnt joint image and text embedding space. We perform a thorough analysis and performance comparison of five different state of the art text embeddings in three different benchmarks. We show that the embeddings learnt with Web and Social Media data have competitive performances over supervised methods in the text based image retrieval task, and we clearly outperform state of the art in the MIRFlickr dataset when training in the target data. Further, we demonstrate how semantic multimodal image retrieval can be performed using the learnt embeddings, going beyond classical instance-level retrieval problems. Finally, we present a new dataset, InstaCities1M, composed of Instagram images and their associated texts that can be used for fair comparison of image-text embeddings.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.129; 601.338; 601.310 Approved no
Call Number Admin @ si @ GGG2019 Serial 3266
 

 
Author Xialei Liu; Joost Van de Weijer; Andrew Bagdanov
Title Exploiting Unlabeled Data in CNNs by Self-Supervised Learning to Rank Type Journal Article
Year 2019 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI
Volume 41 Issue 8 Pages 1862-1878
Keywords Task analysis; Training; Image quality; Visualization; Uncertainty; Labeling; Neural networks; Learning from rankings; image quality assessment; crowd counting; active learning
Abstract For many applications the collection of labeled data is expensive and laborious. Exploitation of unlabeled data during training is thus a long pursued objective of machine learning. Self-supervised learning addresses this by positing an auxiliary task (different, but related to the supervised task) for which data is abundantly available. In this paper, we show how ranking can be used as a proxy task for some regression problems. As another contribution, we propose an efficient backpropagation technique for Siamese networks which prevents the redundant computation introduced by the multi-branch network architecture. We apply our framework to two regression problems: Image Quality Assessment (IQA) and Crowd Counting. For both we show how to automatically generate ranked image sets from unlabeled data. Our results show that networks trained to regress to the ground truth targets for labeled data and to simultaneously learn to rank unlabeled data obtain significantly better, state-of-the-art results for both IQA and crowd counting. In addition, we show that measuring network uncertainty on the self-supervised proxy task is a good measure of informativeness of unlabeled data. This can be used to drive an algorithm for active learning and we show that this reduces labeling effort by up to 50 percent.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.109; 600.106; 600.120 Approved no
Call Number LWB2019 Serial 3267
 

 
Author David Berga; Xose R. Fernandez-Vidal; Xavier Otazu; V. Leboran; Xose M. Pardo
Title Psychophysical evaluation of individual low-level feature influences on visual attention Type Journal Article
Year 2019 Publication Vision Research Abbreviated Journal VR
Volume 154 Issue Pages 60-79
Keywords Visual attention; Psychophysics; Saliency; Task; Context; Contrast; Center bias; Low-level; Synthetic; Dataset
Abstract In this study we provide an analysis of eye movement behavior elicited by low-level feature distinctiveness with a dataset of synthetically-generated image patterns. Design of visual stimuli was inspired by the ones used in previous psychophysical experiments, namely in free-viewing and visual search tasks, to provide a total of 15 types of stimuli, divided according to the task and feature to be analyzed. Our interest is to analyze the influences of low-level feature contrast between a salient region and the rest of the distractors, providing fixation localization characteristics and reaction time of landing inside the salient region. Eye-tracking data was collected from 34 participants during the viewing of a dataset of 230 images. Results show that saliency is predominantly and distinctively influenced by: 1. feature type, 2. feature contrast, 3. temporality of fixations, 4. task difficulty and 5. center bias. This experimentation proposes a new psychophysical basis for saliency model evaluation using synthetic images.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes NEUROBIT; 600.128; 600.120 Approved no
Call Number Admin @ si @ BFO2019a Serial 3274
 

 
Author Arnau Baro; Pau Riba; Jorge Calvo-Zaragoza; Alicia Fornes
Title From Optical Music Recognition to Handwritten Music Recognition: a Baseline Type Journal Article
Year 2019 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 123 Issue Pages 1-8
Keywords
Abstract Optical Music Recognition (OMR) is the branch of document image analysis that aims to convert images of musical scores into a computer-readable format. Despite decades of research, the recognition of handwritten music scores, concretely the Western notation, is still an open problem, and the few existing works only focus on a specific stage of OMR. In this work, we propose a full Handwritten Music Recognition (HMR) system based on Convolutional Recurrent Neural Networks, data augmentation and transfer learning, that can serve as a baseline for the research community.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.097; 601.302; 601.330; 600.140; 600.121 Approved no
Call Number Admin @ si @ BRC2019 Serial 3275