Publicacions CVC -- Query Results

	Publicacions CVC Home \| Show All \| Simple Search \| Advanced Search \| Add Record \| Import	Login Quick Search: Field: contains: ...
	16–30 of 155 records found matching your query (RSS \| history):

Search & Display Options

Select All Deselect All

<< 1 2 3 4 5 6 7 8 9 10 >> [11–11]

List View

Citations

Details

	Records
	Author	Dena Bazazian
	Title	Fully Convolutional Networks for Text Understanding in Scene Images			Type	Book Whole
	Year	2018	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Text understanding in scene images has gained plenty of attention in the computer vision community and it is an important task in many applications as text carries semantically rich information about scene content and context. For instance, reading text in a scene can be applied to autonomous driving, scene understanding or assisting visually impaired people. The general aim of scene text understanding is to localize and recognize text in scene images. Text regions are first localized in the original image by a trained detector model and afterwards fed into a recognition module. The tasks of localization and recognition are highly correlated since an inaccurate localization can affect the recognition task. The main purpose of this thesis is to devise efficient methods for scene text understanding. We investigate how the latest results on deep learning can advance text understanding pipelines. Recently, Fully Convolutional Networks (FCNs) and derived methods have achieved a significant performance on semantic segmentation and pixel level classification tasks. Therefore, we took benefit of the strengths of FCN approaches in order to detect text in natural scenes. In this thesis we have focused on two challenging tasks of scene text understanding which are Text Detection and Word Spotting. For the task of text detection, we have proposed an efficient text proposal technique in scene images. We have considered the Text Proposals method as the baseline which is an approach to reduce the search space of possible text regions in an image. In order to improve the Text Proposals method we combined it with Fully Convolutional Networks to efficiently reduce the number of proposals while maintaining the same level of accuracy and thus gaining a significant speed up. Our experiments demonstrate that this text proposal approach yields significantly higher recall rates than the line based text localization techniques, while also producing better-quality localization. We have also applied this technique on compressed images such as videos from wearable egocentric cameras. For the task of word spotting, we have introduced a novel mid-level word representation method. We have proposed a technique to create and exploit an intermediate representation of images based on text attributes which roughly correspond to character probability maps. Our representation extends the concept of Pyramidal Histogram Of Characters (PHOC) by exploiting Fully Convolutional Networks to derive a pixel-wise mapping of the character distribution within candidate word regions. We call this representation the Soft-PHOC. Furthermore, we show how to use Soft-PHOC descriptors for word spotting tasks through an efficient text line proposal algorithm. To evaluate the detected text, we propose a novel line based evaluation along with the classic bounding box based approach. We test our method on incidental scene text images which comprises real-life scenarios such as urban scenes. The importance of incidental scene text images is due to the complexity of backgrounds, perspective, variety of script and language, short text and little linguistic context. All of these factors together makes the incidental scene text images challenging.
	Address	November 2018
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Dimosthenis Karatzas;Andrew Bagdanov
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-948531-1-1	Medium
	Area		Expedition		Conference
	Notes	DAG; 600.121			Approved	no
	Call Number	Admin @ si @ Baz2018			Serial	3220
Permanent link to this record



	Author	Simone Balocco; Mauricio Gonzalez; Ricardo Ñancule; Petia Radeva; Gabriel Thomas
	Title	Calcified Plaque Detection in IVUS Sequences: Preliminary Results Using Convolutional Nets			Type	Conference Article
	Year	2018	Publication	International Workshop on Artificial Intelligence and Pattern Recognition	Abbreviated Journal
	Volume	11047	Issue		Pages	34-42
	Keywords	Intravascular ultrasound images; Convolutional nets; Deep learning; Medical image analysis
	Abstract	The manual inspection of intravascular ultrasound (IVUS) images to detect clinically relevant patterns is a difficult and laborious task performed routinely by physicians. In this paper, we present a framework based on convolutional nets for the quick selection of IVUS frames containing arterial calcification, a pattern whose detection plays a vital role in the diagnosis of atherosclerosis. Preliminary experiments on a dataset acquired from eighty patients show that convolutional architectures improve detections of a shallow classifier in terms of 𝐹1-measure, precision and recall.
	Address	Cuba; September 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	IWAIPR
	Notes	MILAB; no menciona			Approved	no
	Call Number	Admin @ si @ BGÑ2018			Serial	3237
Permanent link to this record



	Author	Jorge Bernal; Aymeric Histace; Marc Masana; Quentin Angermann; Cristina Sanchez Montes; Cristina Rodriguez de Miguel; Maroua Hammami; Ana Garcia Rodriguez; Henry Cordova; Olivier Romain; Gloria Fernandez Esparrach; Xavier Dray; F. Javier Sanchez
	Title	Polyp Detection Benchmark in Colonoscopy Videos using GTCreator: A Novel Fully Configurable Tool for Easy and Fast Annotation of Image Databases			Type	Conference Article
	Year	2018	Publication	32nd International Congress and Exhibition on Computer Assisted Radiology & Surgery	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CARS
	Notes	ISE; MV; 600.119			Approved	no
	Call Number	Admin @ si @ BHM2018			Serial	3089
Permanent link to this record



	Author	Dena Bazazian; Dimosthenis Karatzas; Andrew Bagdanov
	Title	Soft-PHOC Descriptor for End-to-End Word Spotting in Egocentric Scene Images			Type	Conference Article
	Year	2018	Publication	International Workshop on Egocentric Perception, Interaction and Computing at ECCV	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Word spotting in natural scene images has many applications in scene understanding and visual assistance. We propose Soft-PHOC, an intermediate representation of images based on character probability maps. Our representation extends the concept of the Pyramidal Histogram Of Characters (PHOC) by exploiting Fully Convolutional Networks to derive a pixel-wise mapping of the character distribution within candidate word regions. We show how to use our descriptors for word spotting tasks in egocentric camera streams through an efficient text line proposal algorithm. This is based on the Hough Transform over character attribute maps followed by scoring using Dynamic Time Warping (DTW). We evaluate our results on ICDAR 2015 Challenge 4 dataset of incidental scene text captured by an egocentric camera.
	Address	Munich; Alemanya; September 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ECCVW
	Notes	DAG; 600.129; 600.121;			Approved	no
	Call Number	Admin @ si @ BKB2018b			Serial	3174
Permanent link to this record



	Author	Sumit K. Banchhor; Narendra D. Londhe; Tadashi Araki; Luca Saba; Petia Radeva; Narendra N. Khanna; Jasjit S. Suri
	Title	Calcium detection, its quantification, and grayscale morphology-based risk stratification using machine learning in multimodality big data coronary and carotid scans: A review.			Type	Journal Article
	Year	2018	Publication	Computers in Biology and Medicine	Abbreviated Journal	CBM
	Volume	101	Issue		Pages	184-198
	Keywords	Heart disease; Stroke; Atherosclerosis; Intravascular; Coronary; Carotid; Calcium; Morphology; Risk stratification
	Abstract	Purpose of review Atherosclerosis is the leading cause of cardiovascular disease (CVD) and stroke. Typically, atherosclerotic calcium is found during the mature stage of the atherosclerosis disease. It is therefore often a challenge to identify and quantify the calcium. This is due to the presence of multiple components of plaque buildup in the arterial walls. The American College of Cardiology/American Heart Association guidelines point to the importance of calcium in the coronary and carotid arteries and further recommend its quantification for the prevention of heart disease. It is therefore essential to stratify the CVD risk of the patient into low- and high-risk bins. Recent finding Calcium formation in the artery walls is multifocal in nature with sizes at the micrometer level. Thus, its detection requires high-resolution imaging. Clinical experience has shown that even though optical coherence tomography offers better resolution, intravascular ultrasound still remains an important imaging modality for coronary wall imaging. For a computer-based analysis system to be complete, it must be scientifically and clinically validated. This study presents a state-of-the-art review (condensation of 152 publications after examining 200 articles) covering the methods for calcium detection and its quantification for coronary and carotid arteries, the pros and cons of these methods, and the risk stratification strategies. The review also presents different kinds of statistical models and gold standard solutions for the evaluation of software systems useful for calcium detection and quantification. Finally, the review concludes with a possible vision for designing the next-generation system for better clinical outcomes.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	MILAB; no proj			Approved	no
	Call Number	Admin @ si @ BLA2018			Serial	3188
Permanent link to this record



	Author	Marc Bolaños; Alvaro Peris; Francisco Casacuberta; Sergi Solera; Petia Radeva
	Title	Egocentric video description based on temporally-linked sequences			Type	Journal Article
	Year	2018	Publication	Journal of Visual Communication and Image Representation	Abbreviated Journal	JVCIR
	Volume	50	Issue		Pages	205-216
	Keywords	egocentric vision; video description; deep learning; multi-modal learning
	Abstract	Egocentric vision consists in acquiring images along the day from a first person point-of-view using wearable cameras. The automatic analysis of this information allows to discover daily patterns for improving the quality of life of the user. A natural topic that arises in egocentric vision is storytelling, that is, how to understand and tell the story relying behind the pictures. In this paper, we tackle storytelling as an egocentric sequences description problem. We propose a novel methodology that exploits information from temporally neighboring events, matching precisely the nature of egocentric sequences. Furthermore, we present a new method for multimodal data fusion consisting on a multi-input attention recurrent network. We also release the EDUB-SegDesc dataset. This is the first dataset for egocentric image sequences description, consisting of 1,339 events with 3,991 descriptions, from 55 days acquired by 11 people. Finally, we prove that our proposal outperforms classical attentional encoder-decoder methods for video description.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	MILAB; no proj			Approved	no
	Call Number	Admin @ si @ BPC2018			Serial	3109
Permanent link to this record



	Author	Miguel Angel Bautista; Oriol Pujol; Fernando De la Torre; Sergio Escalera
	Title	Error-Correcting Factorization			Type	Journal Article
	Year	2018	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
	Volume	40	Issue		Pages	2388-2401
	Keywords
	Abstract	Error Correcting Output Codes (ECOC) is a successful technique in multi-class classification, which is a core problem in Pattern Recognition and Machine Learning. A major advantage of ECOC over other methods is that the multi- class problem is decoupled into a set of binary problems that are solved independently. However, literature defines a general error-correcting capability for ECOCs without analyzing how it distributes among classes, hindering a deeper analysis of pair-wise error-correction. To address these limitations this paper proposes an Error-Correcting Factorization (ECF) method, our contribution is three fold: (I) We propose a novel representation of the error-correction capability, called the design matrix, that enables us to build an ECOC on the basis of allocating correction to pairs of classes. (II) We derive the optimal code length of an ECOC using rank properties of the design matrix. (III) ECF is formulated as a discrete optimization problem, and a relaxed solution is found using an efficient constrained block coordinate descent approach. (IV) Enabled by the flexibility introduced with the design matrix we propose to allocate the error-correction on classes that are prone to confusion. Experimental results in several databases show that when allocating the error-correction to confusable classes ECF outperforms state-of-the-art approaches.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	0162-8828	ISBN		Medium
	Area		Expedition		Conference
	Notes	HuPBA; no menciona			Approved	no
	Call Number	Admin @ si @ BPT2018			Serial	3015
Permanent link to this record



	Author	Arnau Baro; Pau Riba; Jorge Calvo-Zaragoza; Alicia Fornes
	Title	Optical Music Recognition by Long Short-Term Memory Networks			Type	Book Chapter
	Year	2018	Publication	Graphics Recognition. Current Trends and Evolutions	Abbreviated Journal
	Volume	11009	Issue		Pages	81-95
	Keywords	Optical Music Recognition; Recurrent Neural Network; Long ShortTerm Memory
	Abstract	Optical Music Recognition refers to the task of transcribing the image of a music score into a machine-readable format. Many music scores are written in a single staff, and therefore, they could be treated as a sequence. Therefore, this work explores the use of Long Short-Term Memory (LSTM) Recurrent Neural Networks for reading the music score sequentially, where the LSTM helps in keeping the context. For training, we have used a synthetic dataset of more than 40000 images, labeled at primitive level. The experimental results are promising, showing the benefits of our approach.
	Address
	Corporate Author				Thesis
	Publisher	Springer	Place of Publication		Editor	A. Fornes, B. Lamiroy
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-3-030-02283-9	Medium
	Area		Expedition		Conference	GREC
	Notes	DAG; 600.097; 601.302; 601.330; 600.121			Approved	no
	Call Number	Admin @ si @ BRC2018			Serial	3227
Permanent link to this record



	Author	Arnau Baro; Pau Riba; Alicia Fornes
	Title	A Starting Point for Handwritten Music Recognition			Type	Conference Article
	Year	2018	Publication	1st International Workshop on Reading Music Systems	Abbreviated Journal
	Volume		Issue		Pages	5-6
	Keywords	Optical Music Recognition; Long Short-Term Memory; Convolutional Neural Networks; MUSCIMA++; CVCMUSCIMA
	Abstract	In the last years, the interest in Optical Music Recognition (OMR) has reawakened, especially since the appearance of deep learning. However, there are very few works addressing handwritten scores. In this work we describe a full OMR pipeline for handwritten music scores by using Convolutional and Recurrent Neural Networks that could serve as a baseline for the research community.
	Address	Paris; France; September 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WORMS
	Notes	DAG; 600.097; 601.302; 601.330; 600.121			Approved	no
	Call Number	Admin @ si @ BRF2018			Serial	3223
Permanent link to this record



	Author	Spyridon Bakas; Mauricio Reyes; Andras Jakab; Stefan Bauer; Markus Rempfler; Alessandro Crimi; Russell Takeshi Shinohara; Christoph Berger; Sung Min Ha; Martin Rozycki; Marcel Prastawa; Esther Alberts; Jana Lipkova; John Freymann; Justin Kirby; Michel Bilello; Hassan Fathallah-Shaykh; Roland Wiest; Jan Kirschke; Benedikt Wiestler; Rivka Colen; Aikaterini Kotrotsou; Pamela Lamontagne; Daniel Marcus; Mikhail Milchenko; Arash Nazeri; Marc-Andre Weber; Abhishek Mahajan; Ujjwal Baid; Dongjin Kwon; Manu Agarwal; Mahbubul Alam; Alberto Albiol; Antonio Albiol; Varghese Alex; Tuan Anh Tran; Tal Arbel; Aaron Avery; Subhashis Banerjee; Thomas Batchelder; Kayhan Batmanghelich; Enzo Battistella; Martin Bendszus; Eze Benson; Jose Bernal; George Biros; Mariano Cabezas; Siddhartha Chandra; Yi-Ju Chang; Joseph Chazalon; Shengcong Chen; Wei Chen; Jefferson Chen; Kun Cheng; Meinel Christoph; Roger Chylla; Albert Clérigues; Anthony Costa; Xiaomeng Cui; Zhenzhen Dai; Lutao Dai; Eric Deutsch; Changxing Ding; Chao Dong; Wojciech Dudzik; Theo Estienne; Hyung Eun Shin; Richard Everson; Jonathan Fabrizio; Longwei Fang; Xue Feng; Lucas Fidon; Naomi Fridman; Huan Fu; David Fuentes; David G Gering; Yaozong Gao; Evan Gates; Amir Gholami; Mingming Gong; Sandra Gonzalez-Villa; J Gregory Pauloski; Yuanfang Guan; Sheng Guo; Sudeep Gupta; Meenakshi H Thakur; Klaus H Maier-Hein; Woo-Sup Han; Huiguang He; Aura Hernandez-Sabate; Evelyn Herrmann; Naveen Himthani; Winston Hsu; Cheyu Hsu; Xiaojun Hu; Xiaobin Hu; Yan Hu; Yifan Hu; Rui Hua
	Title	Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the BRATS challenge			Type	Miscellaneous
	Year	2018	Publication	Arxiv	Abbreviated Journal
	Volume		Issue		Pages
	Keywords	BraTS; challenge; brain; tumor; segmentation; machine learning; glioma; glioblastoma; radiomics; survival; progression; RECIST
	Abstract	Gliomas are the most common primary brain malignancies, with different degrees of aggressiveness, variable prognosis and various heterogeneous histologic sub-regions, i.e., peritumoral edematous/invaded tissue, necrotic core, active and non-enhancing core. This intrinsic heterogeneity is also portrayed in their radio-phenotype, as their sub-regions are depicted by varying intensity profiles disseminated across multiparametric magnetic resonance imaging (mpMRI) scans, reflecting varying biological properties. Their heterogeneous shape, extent, and location are some of the factors that make these tumors difficult to resect, and in some cases inoperable. The amount of resected tumor is a factor also considered in longitudinal scans, when evaluating the apparent tumor for potential diagnosis of progression. Furthermore, there is mounting evidence that accurate segmentation of the various tumor sub-regions can offer the basis for quantitative image analysis towards prediction of patient overall survival. This study assesses the state-of-the-art machine learning (ML) methods used for brain tumor image analysis in mpMRI scans, during the last seven instances of the International Brain Tumor Segmentation (BraTS) challenge, i.e. 2012-2018. Specifically, we focus on i) evaluating segmentations of the various glioma sub-regions in preoperative mpMRI scans, ii) assessing potential tumor progression by virtue of longitudinal growth of tumor sub-regions, beyond use of the RECIST criteria, and iii) predicting the overall survival from pre-operative mpMRI scans of patients that undergone gross total resection. Finally, we investigate the challenge of identifying the best ML algorithms for each of these tasks, considering that apart from being diverse on each instance of the challenge, the multi-institutional mpMRI BraTS dataset has also been a continuously evolving/growing dataset.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	ADAS; 600.118			Approved	no
	Call Number	Admin @ si @ BRJ2018			Serial	3252
Permanent link to this record



	Author	Marco Buzzelli; Joost Van de Weijer; Raimondo Schettini
	Title	Learning Illuminant Estimation from Object Recognition			Type	Conference Article
	Year	2018	Publication	25th International Conference on Image Processing	Abbreviated Journal
	Volume		Issue		Pages	3234 - 3238
	Keywords	Illuminant estimation; computational color constancy; semi-supervised learning; deep learning; convolutional neural networks
	Abstract	In this paper we present a deep learning method to estimate the illuminant of an image. Our model is not trained with illuminant annotations, but with the objective of improving performance on an auxiliary task such as object recognition. To the best of our knowledge, this is the first example of a deep learning architecture for illuminant estimation that is trained without ground truth illuminants. We evaluate our solution on standard datasets for color constancy, and compare it with state of the art methods. Our proposal is shown to outperform most deep learning methods in a cross-dataset evaluation setup, and to present competitive results in a comparison with parametric solutions.
	Address	Athens; Greece; October 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICIP
	Notes	LAMP; 600.109; 600.120			Approved	no
	Call Number	Admin @ si @ BWS2018			Serial	3157
Permanent link to this record



	Author	Ozan Caglayan; Adrien Bardet; Fethi Bougares; Loic Barrault; Kai Wang; Marc Masana; Luis Herranz; Joost Van de Weijer
	Title	LIUM-CVC Submissions for WMT18 Multimodal Translation Task			Type	Conference Article
	Year	2018	Publication	3rd Conference on Machine Translation	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	This paper describes the multimodal Neural Machine Translation systems developed by LIUM and CVC for WMT18 Shared Task on Multimodal Translation. This year we propose several modifications to our previou multimodal attention architecture in order to better integrate convolutional features and refine them using encoder-side information. Our final constrained submissions ranked first for English→French and second for English→German language pairs among the constrained submissions according to the automatic evaluation metric METEOR.
	Address	Brussels; Belgium; October 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WMT
	Notes	LAMP; 600.106; 600.120			Approved	no
	Call Number	Admin @ si @ CBB2018			Serial	3240
Permanent link to this record



	Author	Felipe Codevilla; Antonio Lopez; Vladlen Koltun; Alexey Dosovitskiy
	Title	On Offline Evaluation of Vision-based Driving Models			Type	Conference Article
	Year	2018	Publication	15th European Conference on Computer Vision	Abbreviated Journal
	Volume	11219	Issue		Pages	246-262
	Keywords	Autonomous driving; deep learning
	Abstract	Autonomous driving models should ideally be evaluated by deploying them on a fleet of physical vehicles in the real world. Unfortunately, this approach is not practical for the vast majority of researchers. An attractive alternative is to evaluate models offline, on a pre-collected validation dataset with ground truth annotation. In this paper, we investigate the relation between various online and offline metrics for evaluation of autonomous driving models. We find that offline prediction error is not necessarily correlated with driving quality, and two models with identical prediction error can differ dramatically in their driving performance. We show that the correlation of offline evaluation with driving quality can be significantly improved by selecting an appropriate validation dataset and suitable offline metrics.
	Address	Munich; September 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ECCV
	Notes	ADAS; 600.124; 600.118			Approved	no
	Call Number	Admin @ si @ CLK2018			Serial	3162
Permanent link to this record



	Author	Ciprian Corneanu; Meysam Madadi; Sergio Escalera
	Title	Deep Structure Inference Network for Facial Action Unit Recognition			Type	Conference Article
	Year	2018	Publication	15th European Conference on Computer Vision	Abbreviated Journal
	Volume	11216	Issue		Pages	309-324
	Keywords	Computer Vision; Machine Learning; Deep Learning; Facial Expression Analysis; Facial Action Units; Structure Inference
	Abstract	Facial expressions are combinations of basic components called Action Units (AU). Recognizing AUs is key for general facial expression analysis. Recently, efforts in automatic AU recognition have been dedicated to learning combinations of local features and to exploiting correlations between AUs. We propose a deep neural architecture that tackles both problems by combining learned local and global features in its initial stages and replicating a message passing algorithm between classes similar to a graphical model inference approach in later stages. We show that by training the model end-to-end with increased supervision we improve state-of-the-art by 5.3% and 8.2% performance on BP4D and DISFA datasets, respectively.
	Address	Munich; September 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ECCV
	Notes	HUPBA; no proj			Approved	no
	Call Number	Admin @ si @ CME2018			Serial	3205
Permanent link to this record



	Author	Felipe Codevilla; Matthias Muller; Antonio Lopez; Vladlen Koltun; Alexey Dosovitskiy
	Title	End-to-end Driving via Conditional Imitation Learning			Type	Conference Article
	Year	2018	Publication	IEEE International Conference on Robotics and Automation	Abbreviated Journal
	Volume		Issue		Pages	4693 - 4700
	Keywords
	Abstract	Deep networks trained on demonstrations of human driving have learned to follow roads and avoid obstacles. However, driving policies trained via imitation learning cannot be controlled at test time. A vehicle trained end-to-end to imitate an expert cannot be guided to take a specific turn at an upcoming intersection. This limits the utility of such systems. We propose to condition imitation learning on high-level command input. At test time, the learned driving policy functions as a chauffeur that handles sensorimotor coordination but continues to respond to navigational commands. We evaluate different architectures for conditional imitation learning in vision-based driving. We conduct experiments in realistic three-dimensional simulations of urban driving and on a 1/5 scale robotic truck that is trained to drive in a residential area. Both systems drive based on visual input yet remain responsive to high-level navigational commands. The supplementary video can be viewed at this https URL
	Address	Brisbane; Australia; May 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICRA
	Notes	ADAS; 600.116; 600.124; 600.118			Approved	no
	Call Number	Admin @ si @ CML2018			Serial	3108
Permanent link to this record