Publicacions CVC -- Query Results

Records
Author	Pedro Martins; Paulo Carvalho; Carlo Gatta
Title	Context-aware features and robust image representations			Type	Journal Article
Year	2014	Publication	Journal of Visual Communication and Image Representation	Abbreviated Journal	JVCIR
Volume	25	Issue	2	Pages	339-348
Keywords
Abstract	Local image features are often used to efficiently represent image content. The limited number of types of features that a local feature extractor responds to might be insufficient to provide a robust image representation. To overcome this limitation, we propose a context-aware feature extraction formulated under an information theoretic framework. The algorithm does not respond to a specific type of features; the idea is to retrieve complementary features which are relevant within the image context. We empirically validate the method by investigating the repeatability, the completeness, and the complementarity of context-aware features on standard benchmarks. In a comparison with strictly local features, we show that our context-aware features produce more robust image representations. Furthermore, we study the complementarity between strictly local features and context-aware ones to produce an even more robust representation.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	LAMP; 600.079; MILAB			Approved	no
Call Number	Admin @ si @ MCG2014			Serial	2467
Permanent link to this record



Author	Pejman Rasti; Salma Samiei; Mary Agoyi; Sergio Escalera; Gholamreza Anbarjafari
Title	Robust non-blind color video watermarking using QR decomposition and entropy analysis			Type	Journal Article
Year	2016	Publication	Journal of Visual Communication and Image Representation	Abbreviated Journal	JVCIR
Volume	38	Issue		Pages	838-847
Keywords	Video watermarking; QR decomposition; Discrete Wavelet Transformation; Chirp Z-transform; Singular value decomposition; Orthogonal–triangular decomposition
Abstract	Issues such as content identification, document and image security, audience measurement, ownership and copyright, among others, can be settled by the use of digital watermarking. Many recent video watermarking methods show drops in the visual quality of the sequences. The present work addresses this issue by introducing a robust and imperceptible non-blind color video frame watermarking algorithm. The method divides frames into moving and non-moving parts. The non-moving part of each color channel is processed separately using a block-based watermarking scheme. Blocks with an entropy lower than the average entropy of all blocks are subject to a further process for embedding the watermark image. Finally, a watermarked frame is generated by adding the moving parts to it. In the experiments, several signal processing attacks are applied to each watermarked frame, and the results are compared with those of some recent algorithms. Experimental results show that the proposed scheme is imperceptible and robust against common signal processing attacks.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HuPBA; MILAB			Approved	no
Call Number	Admin @ si @ RSA2016			Serial	2766
Permanent link to this record



Author	Marc Bolaños; Alvaro Peris; Francisco Casacuberta; Sergi Solera; Petia Radeva
Title	Egocentric video description based on temporally-linked sequences			Type	Journal Article
Year	2018	Publication	Journal of Visual Communication and Image Representation	Abbreviated Journal	JVCIR
Volume	50	Issue		Pages	205-216
Keywords	egocentric vision; video description; deep learning; multi-modal learning
Abstract	Egocentric vision consists of acquiring images throughout the day from a first-person point of view using wearable cameras. The automatic analysis of this information makes it possible to discover daily patterns for improving the user's quality of life. A natural topic that arises in egocentric vision is storytelling, that is, how to understand and tell the story lying behind the pictures. In this paper, we tackle storytelling as an egocentric sequence description problem. We propose a novel methodology that exploits information from temporally neighboring events, precisely matching the nature of egocentric sequences. Furthermore, we present a new method for multimodal data fusion consisting of a multi-input attention recurrent network. We also release the EDUB-SegDesc dataset. This is the first dataset for egocentric image sequence description, consisting of 1,339 events with 3,991 descriptions, from 55 days acquired by 11 people. Finally, we show that our proposal outperforms classical attentional encoder-decoder methods for video description.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MILAB; no proj			Approved	no
Call Number	Admin @ si @ BPC2018			Serial	3109
Permanent link to this record



Author	Mariella Dimiccoli; Cathal Gurrin; David J. Crandall; Xavier Giro; Petia Radeva
Title	Introduction to the special issue: Egocentric Vision and Lifelogging			Type	Journal Article
Year	2018	Publication	Journal of Visual Communication and Image Representation	Abbreviated Journal	JVCIR
Volume	55	Issue		Pages	352-353
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MILAB; no proj			Approved	no
Call Number	Admin @ si @ DGC2018			Serial	3187
Permanent link to this record



Author	Eduardo Aguilar; Marc Bolaños; Petia Radeva
Title	Regularized uncertainty-based multi-task learning model for food analysis			Type	Journal Article
Year	2019	Publication	Journal of Visual Communication and Image Representation	Abbreviated Journal	JVCIR
Volume	60	Issue		Pages	360-370
Keywords	Multi-task models; Uncertainty modeling; Convolutional neural networks; Food image analysis; Food recognition; Food group recognition; Ingredients recognition; Cuisine recognition
Abstract	Food plays an important role in several aspects of our daily life. Several computer vision approaches have been proposed for tackling food analysis problems, but very little effort has been made to develop methodologies that could take advantage of the existing correlation between tasks. In this paper, we propose a new multi-task model that is able to simultaneously predict different food-related tasks, e.g. dish, cuisine and food categories. Here, we extend the homoscedastic uncertainty modeling to allow single-label and multi-label classification and propose a regularization term, which jointly weighs the tasks as well as their correlations. Furthermore, we propose a new Multi-Attribute Food dataset and a new metric, Multi-Task Accuracy. We show that, by using both our uncertainty-based loss and the class regularization term, we are able to improve the coherence of outputs between different tasks. Moreover, we outperform the use of task-specific models on classical measures like accuracy or .
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MILAB; no proj			Approved	no
Call Number	Admin @ si @ ABR2019			Serial	3298
Permanent link to this record



Author	Bhalaji Nagarajan; Marc Bolaños; Eduardo Aguilar; Petia Radeva
Title	Deep ensemble-based hard sample mining for food recognition			Type	Journal Article
Year	2023	Publication	Journal of Visual Communication and Image Representation	Abbreviated Journal	JVCIR
Volume	95	Issue		Pages	103905
Keywords
Abstract	Deep neural networks represent a compelling technique to tackle complex real-world problems, but are over-parameterized and often suffer from over- or under-confident estimates. Deep ensembles have shown better parameter estimations and often provide reliable uncertainty estimates that contribute to the robustness of the results. In this work, we propose a new metric to identify samples that are hard to classify. Our metric is defined as a coincidence score for deep ensembles, which measures the agreement of their individual models. The main hypothesis we rely on is that deep learning algorithms learn low-loss samples better than large-loss samples. To compensate for this, we apply controlled over-sampling to the identified “hard” samples, using proper data augmentation schemes, so that the models learn those samples better. We validate the proposed metric using two public food datasets on different backbone architectures and show the improvements compared to conventional deep neural network training using different performance metrics.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MILAB			Approved	no
Call Number	Admin @ si @ NBA2023			Serial	3844
Permanent link to this record