Publicacions CVC -- Query Results



Records
Author	Eduardo Aguilar; Bhalaji Nagarajan; Rupali Khatun; Marc Bolaños; Petia Radeva
Title	Uncertainty Modeling and Deep Learning Applied to Food Image Analysis			Type	Conference Article
Year	2020	Publication	13th International Joint Conference on Biomedical Engineering Systems and Technologies	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Recently, computer vision approaches, especially those assisted by deep learning techniques, have shown unexpected advancements that practically solve problems never imagined to be automatable, such as face recognition or automated driving. However, food image recognition has received little attention in the computer vision community. In this project, we review the field of food image analysis and focus on how to combine it with two challenging research lines: deep learning and uncertainty modeling. After discussing our methodology to advance in this direction, we comment on the potential research, social, and economic impact of research on food image analysis.
Address	Valletta; Malta; February 2020
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	BIODEVICES
Notes	MILAB			Approved	no
Call Number	Admin @ si @ ANK2020			Serial	3526



Author	Petia Radeva
Title	Uncertainty Modeling within an End-to-end Framework for Food Image Analysis			Type	Conference Article
Year	2020	Publication	1st DELTA	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	DELTA
Notes	MILAB			Approved	no
Call Number	Admin @ si @ Rad2020			Serial	3527



Author	Eduardo Aguilar; Petia Radeva
Title	Uncertainty-aware integration of local and flat classifiers for food recognition			Type	Journal Article
Year	2020	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
Volume	136	Issue		Pages	237-243
Keywords
Abstract	Food image recognition has recently attracted the attention of many researchers, due to the challenging problem it poses, the ease of collecting food images, and its numerous applications to health and leisure. In real applications, it is necessary to analyze and recognize thousands of different foods. For this purpose, we propose a novel prediction scheme based on a class hierarchy that considers local classifiers in addition to a flat classifier. To decide which approach to use, we define different criteria that take into account both the analysis of the Epistemic Uncertainty estimated from the ‘children’ classifiers and the prediction from the ‘parent’ classifier. We evaluate our proposal using three Uncertainty estimation methods, tested on two public food datasets. The results show that the proposed method reduces parent-child error propagation in hierarchical schemes and improves classification results compared to the single flat classifier, while maintaining good performance regardless of the Uncertainty estimation method chosen.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MILAB; no proj			Approved	no
Call Number	Admin @ si @ AgR2020			Serial	3525



Author	Ivet Rafegas; Maria Vanrell; Luis A Alexandre; G. Arias
Title	Understanding trained CNNs by indexing neuron selectivity			Type	Journal Article
Year	2020	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
Volume	136	Issue		Pages	318-325
Keywords
Abstract	The impressive performance of Convolutional Neural Networks (CNNs) when solving different vision problems is overshadowed by their black-box nature and our consequent lack of understanding of the representations they build and how these representations are organized. To help understand these issues, we propose to describe the activity of individual neurons by their Neuron Feature visualization and to quantify their inherent selectivity with two specific properties. We explore selectivity indexes for an image feature (color) and an image label (class membership). Our contribution is a framework to seek or classify neurons by indexing on these selectivity properties. It helps to find color-selective neurons, such as a red-mushroom neuron in layer Conv4, or class-selective neurons, such as dog-face neurons in layer Conv5 in VGG-M, and establishes a methodology to derive other selectivity properties. Indexing on neuron selectivity can statistically describe how features and classes are represented through layers, at a time when the size of trained nets is growing and automatic tools to index neurons can be helpful.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	CIC; 600.087; 600.140; 600.118			Approved	no
Call Number	Admin @ si @ RVL2019			Serial	3310



Author	Lei Kang; Marçal Rusiñol; Alicia Fornes; Pau Riba; Mauricio Villegas
Title	Unsupervised Adaptation for Synthetic-to-Real Handwritten Word Recognition			Type	Conference Article
Year	2020	Publication	IEEE Winter Conference on Applications of Computer Vision	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Handwritten Text Recognition (HTR) is still a challenging problem because it must deal with two important difficulties: the variability among writing styles, and the scarcity of labelled data. To alleviate such problems, synthetic data generation and data augmentation are typically used to train HTR systems. However, training with such data produces encouraging but still inaccurate transcriptions in real words. In this paper, we propose an unsupervised writer adaptation approach that is able to automatically adjust a generic handwritten word recognizer, fully trained with synthetic fonts, towards a new incoming writer. We have experimentally validated our proposal using five different datasets, covering several challenges: (i) the document source: modern and historic samples, which may involve paper degradation problems; (ii) different handwriting styles: single and multiple writer collections; and (iii) language, which involves different character combinations. Across these challenging collections, we show that our system is able to maintain its performance; thus, it provides a practical and generic approach to deal with new document collections without requiring any expensive and tedious manual annotation step.
Address	Aspen; Colorado; USA; March 2020
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	WACV
Notes	DAG; 600.129; 600.140; 601.302; 601.312; 600.121			Approved	no
Call Number	Admin @ si @ KRF2020			Serial	3446



Author	Shiqi Yang; Yaxing Wang; Joost Van de Weijer; Luis Herranz
Title	Unsupervised Domain Adaptation without Source Data by Casting a BAIT			Type	Miscellaneous
Year	2020	Publication	Arxiv	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	arXiv:2010.12427. Unsupervised domain adaptation (UDA) aims to transfer the knowledge learned from a labeled source domain to an unlabeled target domain. Existing UDA methods require access to source data during adaptation, which may not be feasible in some real-world applications. In this paper, we address the source-free unsupervised domain adaptation (SFUDA) problem, where only the source model is available during the adaptation. We propose a method named BAIT to address SFUDA. Specifically, given only the source model, with the source classifier head fixed, we introduce a new learnable classifier. When adapting to the target domain, class prototypes of the newly added classifier will act as a bait. They will first approach the target features which deviate from prototypes of the source classifier due to domain shift. Then those target features are pulled towards the corresponding prototypes of the source classifier, thus achieving feature alignment with the source classifier in the absence of source data. Experimental results show that the proposed method achieves state-of-the-art performance on several benchmark datasets compared with existing UDA and SFUDA methods.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	LAMP; 600.120			Approved	no
Call Number	Admin @ si @ YWW2020			Serial	3539



Author	Fernando Vilariño
Title	Unveiling the Social Impact of AI			Type	Conference Article
Year	2020	Publication	Workshop at Digital Living Lab Days Conference	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address	September 2020
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MV; DAG; 600.121; 600.140; SIAI			Approved	no
Call Number	Admin @ si @ Vil2020			Serial	3459



Author	Fei Yang; Luis Herranz; Joost Van de Weijer; Jose Antonio Iglesias; Antonio Lopez; Mikhail Mozerov
Title	Variable Rate Deep Image Compression with Modulated Autoencoder			Type	Journal Article
Year	2020	Publication	IEEE Signal Processing Letters	Abbreviated Journal	SPL
Volume	27	Issue		Pages	331-335
Keywords
Abstract	Variable rate is a requirement for flexible and adaptable image and video compression. However, deep image compression (DIC) methods are optimized for a single fixed rate-distortion (R-D) tradeoff. While this can be addressed by training multiple models for different tradeoffs, the memory requirements increase proportionally to the number of models. Scaling the bottleneck representation of a shared autoencoder can provide variable rate compression with a single shared autoencoder. However, the R-D performance using this simple mechanism degrades at low bitrates, and it also shrinks the effective range of bitrates. To address these limitations, we formulate the problem of variable R-D optimization for DIC, and propose modulated autoencoders (MAEs), where the representations of a shared autoencoder are adapted to the specific R-D tradeoff via a modulation network. Jointly training this modulated autoencoder and the modulation network provides an effective way to navigate the R-D operational curve. Our experiments show that the proposed method can achieve almost the same R-D performance as independent models with significantly fewer parameters.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	LAMP; ADAS; 600.141; 600.120; 600.118			Approved	no
Call Number	Admin @ si @ YHW2020			Serial	3346



Author	Razieh Rastgoo; Kourosh Kiani; Sergio Escalera
Title	Video-based Isolated Hand Sign Language Recognition Using a Deep Cascaded Model			Type	Journal Article
Year	2020	Publication	Multimedia Tools and Applications	Abbreviated Journal	MTAP
Volume	79	Issue		Pages	22965–22987
Keywords
Abstract	In this paper, we propose an efficient cascaded model for sign language recognition from videos that takes advantage of spatio-temporal hand-based information using deep learning approaches, especially Single Shot Detector (SSD), Convolutional Neural Network (CNN), and Long Short Term Memory (LSTM). Our simple yet efficient and accurate model includes two main parts: hand detection and sign recognition. Three types of spatial features, including hand features, Extra Spatial Hand Relation (ESHR) features, and Hand Pose (HP) features, are fused in the model and fed to the LSTM for temporal feature extraction. We train the SSD model for hand detection using videos collected from five online sign dictionaries. Our model is evaluated on our proposed dataset (Rastgoo et al., Expert Syst Appl 150: 113336, 2020), which includes 10,000 sign videos for 100 Persian signs using 10 contributors in 10 different backgrounds, and on the isoGD dataset. Using the 5-fold cross-validation method, our model outperforms state-of-the-art alternatives in sign language recognition.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HuPBA; not mentioned			Approved	no
Call Number	Admin @ si @ RKE2020b			Serial	3442



Author	Wenlong Deng; Yongli Mou; Takahiro Kashiwa; Sergio Escalera; Kohei Nagai; Kotaro Nakayama; Yutaka Matsuo; Helmut Prendinger
Title	Vision based Pixel-level Bridge Structural Damage Detection Using a Link ASPP Network			Type	Journal Article
Year	2020	Publication	Automation in Construction	Abbreviated Journal	AC
Volume	110	Issue		Pages	102973
Keywords	Semantic image segmentation; Deep learning
Abstract	Structural Health Monitoring (SHM) has greatly benefited from computer vision. Recently, deep learning approaches have been widely used to accurately estimate the state of deterioration of infrastructure. In this work, we focus on the problem of detecting bridge surface structural damage, such as delamination and rebar exposure. It is well known that the quality of a deep learning model is highly dependent on the quality of the training dataset. Bridge damage detection, our application domain, has the following main challenges: (i) labeling the damage requires knowledgeable civil engineering professionals, which makes it difficult to collect a large annotated dataset; (ii) the damage area can be very small, whereas the background area is large, which creates an unbalanced training environment; (iii) due to the difficulty of determining the exact extent of the damage, there is often variation among the different labelers who perform pixel-wise labeling. In this paper, we propose a novel model for bridge structural damage detection to address the first two challenges. This paper follows the idea of an atrous spatial pyramid pooling (ASPP) module that is designed as a novel network for bridge damage detection. Further, we introduce the weight balanced Intersection over Union (IoU) loss function to achieve accurate segmentation on a highly unbalanced small dataset. The experimental results show that (i) the IoU loss function improves the overall performance of damage detection, as compared to cross entropy loss or focal loss, and (ii) the proposed model has a better ability to detect a minority class than other light segmentation networks.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HuPBA; no proj			Approved	no
Call Number	Admin @ si @ DMK2020			Serial	3314



Author	Jon Almazan; Lluis Gomez; Suman Ghosh; Ernest Valveny; Dimosthenis Karatzas
Title	WATTS: A common representation of word images and strings using embedded attributes for text recognition and retrieval			Type	Book Chapter
Year	2020	Publication	Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher	Springer	Place of Publication		Editor	K. Alahari; C.V. Jawahar
Language		Summary Language		Original Title
Series Editor		Series Title	Series on Advances in Computer Vision and Pattern Recognition	Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.121			Approved	no
Call Number	Admin @ si @ AGG2020			Serial	3496