Publicacions CVC -- Query Results

Author	Jordi Gonzalez; Josep M. Gonfaus; Carles Fernandez; Xavier Roca
Title	Exploiting Natural-Language Interaction in Video Surveillance Systems			Type	Conference Article
Year	2011	Publication	V&L Net Workshop on Vision and Language	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address	Brighton, UK
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	VL
Notes	ISE			Approved	no
Call Number	Admin @ si @ GGF2011			Serial	1813



Author	Theo Gevers; Arjan Gijsenij; Joost Van de Weijer; J.M. Geusebroek
Title	Color in Computer Vision: Fundamentals and Applications			Type	Book Whole
Year	2012	Publication	Color in Computer Vision: Fundamentals and Applications	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher	The Wiley-IS&T Series in Imaging Science and Technology	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-0-470-89084-4	Medium
Area		Expedition		Conference
Notes	ALTRES;ISE			Approved	no
Call Number	Admin @ si @ GGG2012a			Serial	2068



Author	Josep M. Gonfaus; Theo Gevers; Arjan Gijsenij; Xavier Roca; Jordi Gonzalez
Title	Edge Classification using Photo-Geometric features			Type	Conference Article
Year	2012	Publication	21st International Conference on Pattern Recognition	Abbreviated Journal
Volume		Issue		Pages	1497 - 1500
Keywords
Abstract	Edges are caused by several imaging cues such as shadow, material and illumination transitions. Classification methods have been proposed which are solely based on photometric information, ignoring geometry to classify the physical nature of edges in images. In this paper, the aim is to present a novel strategy to handle both photometric and geometric information for edge classification. Photometric information is obtained through the use of quasi-invariants while geometric information is derived from the orientation and contrast of edges. Different combination frameworks are compared with a new principled approach that captures both kinds of information in the same descriptor. From large scale experiments on different datasets, it is shown that, in addition to photometric information, the geometry of edges is an important visual cue to distinguish between different edge types. It is concluded that by combining both cues the performance improves by more than 7% for shadows and highlights.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1051-4651	ISBN	978-1-4673-2216-4	Medium
Area		Expedition		Conference	ICPR
Notes	ISE			Approved	no
Call Number	Admin @ si @ GGG2012b			Serial	2142



Author	Raul Gomez; Lluis Gomez; Jaume Gibert; Dimosthenis Karatzas
Title	Learning to Learn from Web Data through Deep Semantic Embeddings			Type	Conference Article
Year	2018	Publication	15th European Conference on Computer Vision Workshops	Abbreviated Journal
Volume	11134	Issue		Pages	514-529
Keywords
Abstract	In this paper we propose to learn a multimodal image and text embedding from Web and Social Media data, aiming to leverage the semantic knowledge learnt in the text domain and transfer it to a visual model for semantic image retrieval. We demonstrate that the pipeline can learn from images with associated text without supervision and perform a thorough analysis of five different text embeddings in three different benchmarks. We show that the embeddings learnt with Web and Social Media data have competitive performances over supervised methods in the text based image retrieval task, and we clearly outperform state of the art in the MIRFlickr dataset when training in the target data. Further, we demonstrate how semantic multimodal image retrieval can be performed using the learnt embeddings, going beyond classical instance-level retrieval problems. Finally, we present a new dataset, InstaCities1M, composed of Instagram images and their associated texts that can be used for fair comparison of image-text embeddings.
Address	Munich; Germany; September 2018
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ECCVW
Notes	DAG; 600.129; 601.338; 600.121			Approved	no
Call Number	Admin @ si @ GGG2018a			Serial	3175



Author	Raul Gomez; Lluis Gomez; Jaume Gibert; Dimosthenis Karatzas
Title	Learning from #Barcelona Instagram data what Locals and Tourists post about its Neighbourhoods			Type	Conference Article
Year	2018	Publication	15th European Conference on Computer Vision Workshops	Abbreviated Journal
Volume	11134	Issue		Pages	530-544
Keywords
Abstract	Massive tourism is becoming a big problem for some cities, such as Barcelona, due to its concentration in some neighborhoods. In this work we gather Instagram data related to Barcelona consisting of image-caption pairs and, using the text as a supervisory signal, we learn relations between images, words and neighborhoods. Our goal is to learn which visual elements appear in photos when people are posting about each neighborhood. We perform a language-separate treatment of the data and show that it can be extrapolated to a separate analysis of tourists and locals, and that tourism is reflected in Social Media at a neighborhood level. The presented pipeline allows analyzing the differences between the images that tourists and locals associate with the different neighborhoods. The proposed method, which can be extended to other cities or subjects, proves that Instagram data can be used to train multi-modal (image and text) machine learning models that are useful to analyze publications about a city at a neighborhood level. We publish the collected dataset, InstaBarcelona, and the code used in the analysis.
Address	Munich; Germany; September 2018
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ECCVW
Notes	DAG; 600.129; 601.338; 600.121			Approved	no
Call Number	Admin @ si @ GGG2018b			Serial	3176



Author	Raul Gomez; Lluis Gomez; Jaume Gibert; Dimosthenis Karatzas
Title	Self-Supervised Learning from Web Data for Multimodal Retrieval			Type	Book Chapter
Year	2019	Publication	Multi-Modal Scene Understanding Book	Abbreviated Journal
Volume		Issue		Pages	279-306
Keywords	self-supervised learning; webly supervised learning; text embeddings; multimodal retrieval; multimodal embedding
Abstract	Self-Supervised learning from multimodal image and text data allows deep neural networks to learn powerful features with no need of human annotated data. Web and Social Media platforms provide a virtually unlimited amount of this multimodal data. In this work we propose to exploit this free available data to learn a multimodal image and text embedding, aiming to leverage the semantic knowledge learnt in the text domain and transfer it to a visual model for semantic image retrieval. We demonstrate that the proposed pipeline can learn from images with associated text without supervision and analyze the semantic structure of the learnt joint image and text embedding space. We perform a thorough analysis and performance comparison of five different state of the art text embeddings in three different benchmarks. We show that the embeddings learnt with Web and Social Media data have competitive performances over supervised methods in the text based image retrieval task, and we clearly outperform state of the art in the MIRFlickr dataset when training in the target data. Further, we demonstrate how semantic multimodal image retrieval can be performed using the learnt embeddings, going beyond classical instance-level retrieval problems. Finally, we present a new dataset, InstaCities1M, composed by Instagram images and their associated texts that can be used for fair comparison of image-text embeddings.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.129; 601.338; 601.310			Approved	no
Call Number	Admin @ si @ GGG2019			Serial	3266



Author	Raul Gomez; Jaume Gibert; Lluis Gomez; Dimosthenis Karatzas
Title	Exploring Hate Speech Detection in Multimodal Publications			Type	Conference Article
Year	2020	Publication	IEEE Winter Conference on Applications of Computer Vision	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	In this work we target the problem of hate speech detection in multimodal publications formed by a text and an image. We gather and annotate a large scale dataset from Twitter, MMHS150K, and propose different models that jointly analyze textual and visual information for hate speech detection, comparing them with unimodal detection. We provide quantitative and qualitative results and analyze the challenges of the proposed task. We find that, even though images are useful for the hate speech detection task, current multimodal models cannot outperform models analyzing only text. We discuss why and open the field and the dataset for further research.
Address	Aspen; March 2020
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	WACV
Notes	DAG; 600.121; 600.129			Approved	no
Call Number	Admin @ si @ GGG2020a			Serial	3280



Author	Raul Gomez; Jaume Gibert; Lluis Gomez; Dimosthenis Karatzas
Title	Location Sensitive Image Retrieval and Tagging			Type	Conference Article
Year	2020	Publication	16th European Conference on Computer Vision	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	People from different parts of the globe describe objects and concepts in distinct manners. Visual appearance can thus vary across different geographic locations, which makes location relevant contextual information when analysing visual data. In this work, we address the task of image retrieval related to a given tag conditioned on a certain location on Earth. We present LocSens, a model that learns to rank triplets of images, tags and coordinates by plausibility, and two training strategies to balance the location influence in the final ranking. LocSens learns to fuse textual and location information of multimodal queries to retrieve related images at different levels of location granularity, and successfully utilizes location information to improve image tagging.
Address	Virtual; August 2020
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ECCV
Notes	DAG; 600.121; 600.129			Approved	no
Call Number	Admin @ si @ GGG2020b			Serial	3420



Author	Lluis Garrido; M.Guerrieri; Laura Igual
Title	Image Segmentation with Cage Active Contours			Type	Journal Article
Year	2015	Publication	IEEE Transactions on Image Processing	Abbreviated Journal	TIP
Volume	24	Issue	12	Pages	5557 - 5566
Keywords	Level sets; Mean value coordinates; Parametrized active contours
Abstract	In this paper, we present a framework for image segmentation based on parametrized active contours. The evolving contour is parametrized according to a reduced set of control points that form a closed polygon and have a clear visual interpretation. The parametrization, called mean value coordinates, stems from the techniques used in computer graphics to animate virtual models. Our framework makes it easy to formulate region-based energies to segment an image. In particular, we present three different local region-based energy terms: 1) the mean model; 2) the Gaussian model; and 3) the histogram model. We show the behavior of our method on synthetic and real images and compare the performance with state-of-the-art level set methods.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1057-7149	ISBN		Medium
Area		Expedition		Conference
Notes	MILAB			Approved	no
Call Number	Admin @ si @ GGI2015			Serial	2673



Author	Suman Ghosh; Lluis Gomez; Dimosthenis Karatzas; Ernest Valveny
Title	Efficient indexing for Query By String text retrieval			Type	Conference Article
Year	2015	Publication	6th IAPR International Workshop on Camera Based Document Analysis and Recognition CBDAR2015	Abbreviated Journal
Volume		Issue		Pages	1236 - 1240
Keywords
Abstract	This paper deals with Query By String word spotting in scene images. A hierarchical text segmentation algorithm based on text specific selective search is used to find text regions. These regions are indexed per character n-grams present in the text region. An attribute representation based on Pyramidal Histogram of Characters (PHOC) is used to compare text regions with the query text. For generation of the index a similar attribute space based Pyramidal Histogram of character n-grams is used. These attribute models are learned using linear SVMs over the Fisher Vector [1] representation of the images along with the PHOC labels of the corresponding strings.
Address	Nancy; France; August 2015
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	CBDAR
Notes	DAG; 600.077			Approved	no
Call Number	Admin @ si @ GGK2015			Serial	2693



Author	Antoni Gurgui; Debora Gil; Enric Marti
Title	Laplacian Unitary Domain for Texture Morphing			Type	Conference Article
Year	2015	Publication	Proceedings of the 10th International Conference on Computer Vision Theory and Applications VISIGRAPP2015	Abbreviated Journal
Volume	1	Issue		Pages	693-699
Keywords	Facial; metamorphosis; Laplacian; Morphing
Abstract	Deformation of expressive textures is the gateway to realistic computer synthesis of expressions. By their good mathematical properties and flexible formulation on irregular meshes, most texture mappings rely on solutions to the Laplacian in the cartesian space. In the context of facial expression morphing, this approximation can be seen from the opposite point of view by neglecting the metric. In this paper, we use the properties of the Laplacian in manifolds to present a novel approach to warping expressive facial images in order to generate a morphing between them.
Address	Munich; Germany; February 2015
Corporate Author				Thesis
Publisher	SciTePress	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-989-758-089-5	Medium
Area		Expedition		Conference	VISAPP
Notes	IAM; 600.075			Approved	no
Call Number	Admin @ si @ GGM2015			Serial	2614



Author	Antoni Gurgui; Debora Gil; Enric Marti; Vicente Grau
Title	Left-Ventricle Basal Region Constrained Parametric Mapping to Unitary Domain			Type	Conference Article
Year	2016	Publication	7th International Workshop on Statistical Atlases & Computational Modelling of the Heart	Abbreviated Journal
Volume	10124	Issue		Pages	163-171
Keywords	Laplacian; Constrained maps; Parameterization; Basal ring
Abstract	Due to its complex geometry, the basal ring is often omitted when putting different heart geometries into correspondence. In this paper, we present the first results on a new mapping of the left ventricle basal rings onto a normalized coordinate system using a fold-over free approach to the solution to the Laplacian. To guarantee correspondences between different basal rings, we imposed some internal constrained positions at anatomical landmarks in the normalized coordinate system. To prevent internal fold-overs, constraints are handled by cutting the volume into regions defined by anatomical features and mapping each piece of the volume separately. Initial results presented in this paper indicate that our method is able to handle internal constraints without introducing fold-overs and thus guarantees one-to-one mappings between different basal ring geometries.
Address	Athens; October 2016
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	STACOM
Notes	IAM			Approved	no
Call Number	Admin @ si @ GGM2016			Serial	2884



Author	Umut Guclu; Yagmur Gucluturk; Meysam Madadi; Sergio Escalera; Xavier Baro; Jordi Gonzalez; Rob van Lier; Marcel A. J. van Gerven
Title	End-to-end semantic face segmentation with conditional random fields as convolutional, recurrent and adversarial networks			Type	Miscellaneous
Year	2017	Publication	Arxiv	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	arXiv:1703.03305 Recent years have seen a sharp increase in the number of related yet distinct advances in semantic segmentation. Here, we tackle this problem by leveraging the respective strengths of these advances. That is, we formulate a conditional random field over a four-connected graph as end-to-end trainable convolutional and recurrent networks, and estimate them via an adversarial process. Importantly, our model learns not only unary potentials but also pairwise potentials, while aggregating multi-scale contexts and controlling higher-order inconsistencies. We evaluate our model on two standard benchmark datasets for semantic face segmentation, achieving state-of-the-art results on both of them.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HuPBA; ISE; 600.098; 600.119			Approved	no
Call Number	Admin @ si @ GGM2017			Serial	2932



Author	Yagmur Gucluturk; Umut Guclu; Marc Perez; Hugo Jair Escalante; Xavier Baro; Isabelle Guyon; Carlos Andujar; Julio C. S. Jacques Junior; Meysam Madadi; Sergio Escalera
Title	Visualizing Apparent Personality Analysis with Deep Residual Networks			Type	Conference Article
Year	2017	Publication	Chalearn Workshop on Action, Gesture, and Emotion Recognition: Large Scale Multimodal Gesture Recognition and Real versus Fake expressed emotions at ICCV	Abbreviated Journal
Volume		Issue		Pages	3101-3109
Keywords
Abstract	Automatic prediction of personality traits is a subjective task that has recently received much attention. Specifically, automatic apparent personality trait prediction from multimodal data has emerged as a hot topic within the field of computer vision and, more particularly, the so-called “looking at people” sub-field. Considering “apparent” personality traits as opposed to real ones considerably reduces the subjectivity of the task. The real world applications are encountered in a wide range of domains, including entertainment, health, human computer interaction, recruitment and security. Predictive models of personality traits are useful for individuals in many scenarios (e.g., preparing for job interviews, preparing for public speaking). However, these predictions in and of themselves might be deemed to be untrustworthy without human understandable supportive evidence. Through a series of experiments on a recently released benchmark dataset for automatic apparent personality trait prediction, this paper characterizes the audio and visual information that is used by a state-of-the-art model while making its predictions, so as to provide such supportive evidence by explaining predictions made. Additionally, the paper describes a new web application, which gives feedback on apparent personality traits of its users by combining model predictions with their explanations.
Address	Venice; Italy; October 2017
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICCVW
Notes	HUPBA; 6002.143			Approved	no
Call Number	Admin @ si @ GGP2017			Serial	3067



Author	Wenjuan Gong; Jordi Gonzalez; Xavier Roca
Title	Human Action Recognition based on Estimated Weak Poses			Type	Journal Article
Year	2012	Publication	EURASIP Journal on Advances in Signal Processing	Abbreviated Journal	EURASIPJ
Volume		Issue		Pages
Keywords
Abstract	We present a novel method for human action recognition (HAR) based on estimated poses from image sequences. We use 3D human pose data as additional information and propose a compact human pose representation, called a weak pose, in a low-dimensional space while still keeping the most discriminative information for a given pose. With predicted poses from image features, we map the problem from image feature space to pose space, where a Bag of Poses (BOP) model is learned for the final goal of HAR. The BOP model is a modified version of the classical bag of words pipeline by building the vocabulary based on the most representative weak poses for a given action. Compared with the standard k-means clustering, our vocabulary selection criterion is proven to be more efficient and robust against the inherent challenges of action recognition. Moreover, since for action recognition the ordering of the poses is discriminative, the BOP model incorporates temporal information: in essence, groups of consecutive poses are considered together when computing the vocabulary and assignment. We tested our method on two well-known datasets, HumanEva and IXMAS, to demonstrate that weak poses help to improve action recognition accuracies. The proposed method is scene-independent and is comparable with the state-of-the-art method.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ISE			Approved	no
Call Number	Admin @ si @ GGR2012			Serial	2003