Records
Author Zhijie Fang; Antonio Lopez
Title Is the Pedestrian going to Cross? Answering by 2D Pose Estimation Type Conference Article
Year 2018 Publication IEEE Intelligent Vehicles Symposium Abbreviated Journal
Volume Issue Pages 1271 - 1276
Keywords
Abstract Our recent work suggests that, thanks to today's powerful CNNs, image-based 2D pose estimation is a promising cue for determining pedestrian intentions such as crossing the road in the path of the ego-vehicle, stopping before entering the road, and starting to walk or bending towards the road. This statement is based on results obtained on non-naturalistic sequences (the Daimler dataset), i.e. sequences choreographed specifically for performing the study. Fortunately, a new publicly available dataset (JAAD) has recently appeared that allows developing methods for detecting pedestrian intentions in naturalistic driving conditions; more specifically, for addressing the relevant question: is the pedestrian going to cross? Accordingly, in this paper we use JAAD to assess the usefulness of 2D pose estimation for answering this question. We combine CNN-based pedestrian detection, tracking and pose estimation to predict the crossing action from monocular images. Overall, the proposed pipeline provides new state-of-the-art results.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference IV
Notes ADAS; 600.124; 600.116; 600.118 Approved no
Call Number Admin @ si @ FaL2018 Serial 3181
Permanent link to this record
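The pipeline described in the abstract above lends itself to a simple illustration: once a pedestrian is detected, tracked and 2D-pose-estimated, the keypoints over a short window can feed a per-track crossing classifier. The sketch below is a minimal, hypothetical Python version; the window length, keypoint count, normalization and choice of a random forest are assumptions for illustration, not the authors' exact pipeline.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    T, K = 14, 18  # frames per track window and keypoints per frame (assumed)

    def pose_window_features(keypoints):
        """keypoints: (T, K, 2) array of tracked 2D joints for one pedestrian.
        Normalize each frame for translation/scale invariance, then flatten."""
        feats = []
        for frame in keypoints:
            center = frame.mean(axis=0)
            scale = np.linalg.norm(frame - center, axis=1).max() + 1e-6
            feats.append(((frame - center) / scale).ravel())
        return np.concatenate(feats)

    # X: one feature vector per tracked window; y: 1 = crossing, 0 = not crossing
    rng = np.random.default_rng(0)
    X = np.stack([pose_window_features(rng.normal(size=(T, K, 2))) for _ in range(100)])
    y = rng.integers(0, 2, size=100)  # placeholder labels for illustration
    clf = RandomForestClassifier(n_estimators=100).fit(X, y)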
 

 
Author Manuel Carbonell; Mauricio Villegas; Alicia Fornes; Josep Llados
Title Joint Recognition of Handwritten Text and Named Entities with a Neural End-to-end Model Type Conference Article
Year 2018 Publication 13th IAPR International Workshop on Document Analysis Systems Abbreviated Journal
Volume Issue Pages 399-404
Keywords Named entity recognition; Handwritten Text Recognition; neural networks
Abstract When extracting information from handwritten documents, text transcription and named entity recognition are usually treated as separate subsequent tasks. This has the disadvantage that errors in the first module heavily affect the performance of the second module. In this work we propose to do both tasks jointly, using a single neural network with a common architecture used for plain text recognition. Experimentally, the work has been tested on a collection of historical marriage records. Results of experiments are presented to show the effect on performance of different configurations: different ways of encoding the information, doing or not doing transfer learning, and processing at text-line or multi-line region level. The results are comparable to the state of the art reported in the ICDAR 2017 Information Extraction competition, even though the proposed technique does not use any dictionaries, language modelling or post-processing.
Address Vienna; Austria; April 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference DAS
Notes DAG; 600.097; 603.057; 601.311; 600.121 Approved no
Call Number Admin @ si @ CVF2018 Serial 3170
Permanent link to this record
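One compact way to realize the joint-recognition idea from the abstract above is to extend the recognizer's output alphabet with entity-tag symbols, so a single CTC-style transcription emits text and entity boundaries in one pass. The tag set and markup below are hypothetical assumptions, not the paper's exact encoding.

    # Entity tags become ordinary symbols of the output alphabet, so one
    # network can transcribe text and mark named entities in a single pass.
    chars = list("abcdefghijklmnopqrstuvwxyz '-")
    tags = ["<name>", "</name>", "<occupation>", "</occupation>"]  # assumed tags
    alphabet = ["<blank>"] + chars + tags        # index 0 reserved for CTC blank
    sym2idx = {s: i for i, s in enumerate(alphabet)}

    def encode(tokens):
        """Tokens are characters or tags, e.g. for '<name>joan</name>'."""
        return [sym2idx[t] for t in tokens]

    print(encode(["<name>", "j", "o", "a", "n", "</name>"]))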
 

 
Author Sounak Dey; Anjan Dutta; Suman Ghosh; Ernest Valveny; Josep Llados; Umapada Pal
Title Learning Cross-Modal Deep Embeddings for Multi-Object Image Retrieval using Text and Sketch Type Conference Article
Year 2018 Publication 24th International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 916 - 921
Keywords
Abstract In this work we introduce a cross-modal image retrieval system that allows both text and sketch as input modalities for the query. A cross-modal deep network architecture is formulated to jointly model the sketch and text input modalities as well as the image output modality, learning a common embedding between text and images and between sketches and images. In addition, an attention model is used to selectively focus attention on the different objects of the image, allowing for retrieval with multiple objects in the query. Experiments show that the proposed method performs best in both single and multiple-object image retrieval on standard datasets.
Address Beijing; China; August 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPR
Notes DAG; 602.167; 602.168; 600.097; 600.084; 600.121; 600.129 Approved no
Call Number Admin @ si @ DDG2018b Serial 3152
Permanent link to this record
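A common way to learn the kind of shared embedding the abstract above describes is a margin-based ranking loss that pulls a query (text or sketch) embedding towards its matching image embedding and pushes it away from non-matching ones. A minimal sketch, assuming precomputed embeddings and PyTorch; it is illustrative, not the paper's exact objective.

    import torch
    import torch.nn.functional as F

    def ranking_loss(q, img_pos, img_neg, margin=0.1):
        """q: query (text or sketch) embeddings; img_pos/img_neg: matching
        and non-matching image embeddings, all of shape (batch, dim)."""
        q, img_pos, img_neg = (F.normalize(x, dim=1) for x in (q, img_pos, img_neg))
        pos = (q * img_pos).sum(dim=1)   # cosine similarity to the correct image
        neg = (q * img_neg).sum(dim=1)
        return F.relu(margin - pos + neg).mean()

    q, p, n = (torch.randn(8, 128) for _ in range(3))
    print(ranking_loss(q, p, n).item())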
 

 
Author Raul Gomez; Lluis Gomez; Jaume Gibert; Dimosthenis Karatzas
Title Learning from #Barcelona Instagram data what Locals and Tourists post about its Neighbourhoods Type Conference Article
Year 2018 Publication 15th European Conference on Computer Vision Workshops Abbreviated Journal
Volume 11134 Issue Pages 530-544
Keywords
Abstract Massive tourism is becoming a big problem for some cities, such as Barcelona, due to its concentration in some neighborhoods. In this work we gather Instagram data related to Barcelona consisting of image-caption pairs and, using the text as a supervisory signal, we learn relations between images, words and neighborhoods. Our goal is to learn which visual elements appear in photos when people post about each neighborhood. We treat the data separately by language and show that this can be extrapolated to a separate analysis of tourists and locals, and that tourism is reflected in social media at a neighborhood level. The presented pipeline allows analyzing the differences between the images that tourists and locals associate with the different neighborhoods. The proposed method, which can be extended to other cities or subjects, proves that Instagram data can be used to train multimodal (image and text) machine learning models that are useful for analyzing publications about a city at a neighborhood level. We publish the collected dataset, InstaBarcelona, and the code used in the analysis.
Address Munich; Germany; September 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCVW
Notes DAG; 600.129; 601.338; 600.121 Approved no
Call Number Admin @ si @ GGG2018b Serial 3176
Permanent link to this record
 

 
Author Pau Riba; Andreas Fischer; Josep Llados; Alicia Fornes
Title Learning Graph Distances with Message Passing Neural Networks Type Conference Article
Year 2018 Publication 24th International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 2239-2244
Keywords ★Best Paper Award★
Abstract Graph representations have been widely used in pattern recognition thanks to their powerful representation formalism and rich theoretical background. A number of error-tolerant graph matching algorithms, such as graph edit distance, have been proposed for computing a distance between two labelled graphs. However, they typically suffer from a high computational complexity, which makes it difficult to apply these matching algorithms in a real scenario. In this paper, we propose an efficient graph distance based on the emerging field of geometric deep learning. Our method employs a message passing neural network to capture the graph structure and learns a metric with a siamese network approach. The performance of the proposed graph distance is validated in two application cases, graph classification and graph retrieval of handwritten words, and shows promising performance compared with (approximate) graph edit distance benchmarks.
Address Beijing; China; August 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPR
Notes DAG; 600.097; 603.057; 601.302; 600.121 Approved no
Call Number Admin @ si @ RFL2018 Serial 3168
Permanent link to this record
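To make the idea in the abstract above concrete: a message passing network embeds each graph into a vector, and a siamese setup (shared weights on both branches) turns the distance between embeddings into a learned graph distance. A minimal hand-rolled sketch in PyTorch; the update rule, readout and layer sizes are illustrative assumptions, not the paper's architecture.

    import torch

    def message_pass(h, adj, W_msg, W_upd):
        """One round of message passing on node states h (nodes, dim)."""
        msgs = adj @ (h @ W_msg)             # aggregate transformed neighbours
        return torch.tanh(h @ W_upd + msgs)

    def graph_embedding(h, adj, W_msg, W_upd, rounds=3):
        for _ in range(rounds):
            h = message_pass(h, adj, W_msg, W_upd)
        return h.mean(dim=0)                 # readout: mean over node states

    d = 16
    W_msg, W_upd = torch.randn(d, d) * 0.1, torch.randn(d, d) * 0.1
    hA, adjA = torch.randn(5, d), torch.eye(5)   # two toy graphs
    hB, adjB = torch.randn(7, d), torch.eye(7)
    dist = torch.norm(graph_embedding(hA, adjA, W_msg, W_upd)
                      - graph_embedding(hB, adjB, W_msg, W_upd), p=2)
    print(dist.item())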
 

 
Author Marco Buzzelli; Joost Van de Weijer; Raimondo Schettini
Title Learning Illuminant Estimation from Object Recognition Type Conference Article
Year 2018 Publication 25th International Conference on Image Processing Abbreviated Journal
Volume Issue Pages 3234 - 3238
Keywords Illuminant estimation; computational color constancy; semi-supervised learning; deep learning; convolutional neural networks
Abstract In this paper we present a deep learning method to estimate the illuminant of an image. Our model is not trained with illuminant annotations, but with the objective of improving performance on an auxiliary task such as object recognition. To the best of our knowledge, this is the first example of a deep learning architecture for illuminant estimation that is trained without ground-truth illuminants. We evaluate our solution on standard datasets for color constancy and compare it with state-of-the-art methods. Our proposal is shown to outperform most deep learning methods in a cross-dataset evaluation setup, and to present competitive results in a comparison with parametric solutions.
Address Athens; Greece; October 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICIP
Notes LAMP; 600.109; 600.120 Approved no
Call Number Admin @ si @ BWS2018 Serial 3157
Permanent link to this record
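The training trick in the abstract above can be pictured as a small illuminant-prediction head whose output color-corrects the image before it enters the recognition network, so that gradients from the recognition loss are the only supervision. The module below is a hypothetical sketch in PyTorch, not the paper's architecture.

    import torch
    import torch.nn as nn

    class IlluminantHead(nn.Module):
        """Predicts a per-image RGB illuminant and applies a von Kries-style
        correction; trained only through a downstream recognition loss."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(nn.AdaptiveAvgPool2d(8), nn.Flatten(),
                                     nn.Linear(3 * 64, 3), nn.Softplus())

        def forward(self, img):                       # img: (B, 3, H, W)
            ill = self.net(img) + 1e-4                # positive (B, 3) estimate
            ill = ill / ill.norm(dim=1, keepdim=True)
            return img / ill[:, :, None, None]        # colour-corrected image

    corrected = IlluminantHead()(torch.randn(2, 3, 64, 64))
    # downstream: loss = recognition_loss(classifier(corrected), labels)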
 

 
Author I. Sorodoc; S. Pezzelle; A. Herbelot; Mariella Dimiccoli; R. Bernardi
Title Learning quantification from images: A structured neural architecture Type Journal Article
Year 2018 Publication Natural Language Engineering Abbreviated Journal NLE
Volume 24 Issue 3 Pages 363-392
Keywords
Abstract Major advances have recently been made in merging language and vision representations. Most tasks considered so far have confined themselves to the processing of objects and lexicalised relations amongst objects (content words). We know, however, that humans (even pre-school children) can abstract over raw multimodal data to perform certain types of higher-level reasoning, expressed in natural language by function words. A case in point is given by their ability to learn quantifiers, i.e. expressions like few, some and all. From formal semantics and cognitive linguistics, we know that quantifiers are relations over sets which, as a simplification, we can see as proportions. For instance, in "most fish are red", most encodes the proportion of fish which are red. In this paper, we study how well current neural network strategies model such relations. We propose a task where, given an image and a query expressed by an object–property pair, the system must return a quantifier expressing which proportion of the queried objects has the queried property. Our contributions are twofold. First, we show that the best performance on this task involves coupling state-of-the-art attention mechanisms with a network architecture mirroring the logical structure assigned to quantifiers by classic linguistic formalisation. Second, we introduce a new balanced dataset of image scenarios associated with quantification queries, which we hope will foster further research in this area.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB; not mentioned Approved no
Call Number Admin @ si @ SPH2018 Serial 3021
Permanent link to this record
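The logical structure the abstract above assigns to quantifiers (a proportion over the queried set) can be shown with a toy, non-neural rendering: score each object for "is the queried object" and "has the queried property", take the proportion, and map it to a quantifier. The thresholds below are illustrative assumptions.

    # Toy, non-neural rendering of the task's logical structure.
    def quantify(is_object, has_property):
        relevant = [p for o, p in zip(is_object, has_property) if o > 0.5]
        if not relevant:
            return "none"
        ratio = sum(p > 0.5 for p in relevant) / len(relevant)
        for q, hi in [("no", 0.05), ("few", 0.35), ("some", 0.65), ("most", 0.95)]:
            if ratio <= hi:
                return q
        return "all"

    # 3 of 4 objects are the queried kind; 2 of those 3 have the property
    print(quantify([0.9, 0.8, 0.2, 0.95], [0.9, 0.1, 0.9, 0.8]))  # -> "most"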
 

 
Author Raul Gomez; Lluis Gomez; Jaume Gibert; Dimosthenis Karatzas
Title Learning to Learn from Web Data through Deep Semantic Embeddings Type Conference Article
Year 2018 Publication 15th European Conference on Computer Vision Workshops Abbreviated Journal
Volume 11134 Issue Pages 514-529
Keywords
Abstract In this paper we propose to learn a multimodal image and text embedding from Web and Social Media data, aiming to leverage the semantic knowledge learnt in the text domain and transfer it to a visual model for semantic image retrieval. We demonstrate that the pipeline can learn from images with associated text without supervision, and we perform a thorough analysis of five different text embeddings on three different benchmarks. We show that the embeddings learnt with Web and Social Media data have competitive performance with supervised methods in the text-based image retrieval task, and we clearly outperform the state of the art on the MIRFlickr dataset when training on the target data. Further, we demonstrate how semantic multimodal image retrieval can be performed using the learnt embeddings, going beyond classical instance-level retrieval problems. Finally, we present a new dataset, InstaCities1M, composed of Instagram images and their associated texts, that can be used for fair comparison of image-text embeddings.
Address Munich; Germany; September 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCVW
Notes DAG; 600.129; 601.338; 600.121 Approved no
Call Number Admin @ si @ GGG2018a Serial 3175
Permanent link to this record
 

 
Author Joan Serrat; Felipe Lumbreras; Idoia Ruiz
Title Learning to measure for preshipment garment sizing Type Journal Article
Year 2018 Publication Measurement Abbreviated Journal MEASURE
Volume 130 Issue Pages 327-339
Keywords Apparel; Computer vision; Structured prediction; Regression
Abstract Clothing is still manually manufactured for the most part nowadays, resulting in discrepancies between nominal and real dimensions, and potentially ill-fitting garments. Hence, it is common in the apparel industry to manually perform measures at preshipment time. We present an automatic method to obtain such measures from a single image of a garment that speeds up this task. It is generic and extensible in the sense that it does not depend explicitly on the garment shape or type. Instead, it learns through a probabilistic graphical model to identify the different contour parts. Subsequently, a set of Lasso regressors, one per desired measure, can predict the actual values of the measures. We present results on a dataset of 130 images of jackets and 98 of pants, of varying sizes and styles, obtaining 1.17 and 1.22 cm of mean absolute error, respectively.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; MSIAU; 600.122; 600.118 Approved no
Call Number Admin @ si @ SLR2018 Serial 3128
Permanent link to this record
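The second stage of the method above, one Lasso regressor per desired measure, is straightforward to sketch with scikit-learn; the contour-identification stage is abstracted into a feature matrix X, and the feature dimensionality and measure names below are placeholders, not the paper's.

    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(0)
    X = rng.normal(size=(130, 40))      # per-image contour features (assumed dim)
    measures = {"chest_width_cm": rng.normal(50, 2, 130),    # placeholder targets
                "sleeve_length_cm": rng.normal(60, 3, 130)}

    # One Lasso regressor per measure, as in the abstract above
    models = {name: Lasso(alpha=0.1).fit(X, y) for name, y in measures.items()}
    for name, model in models.items():
        print(name, model.predict(X[:1])[0])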
 

 
Author Xialei Liu; Joost Van de Weijer; Andrew Bagdanov
Title Leveraging Unlabeled Data for Crowd Counting by Learning to Rank Type Conference Article
Year 2018 Publication 31st IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal
Volume Issue Pages 7661 - 7669
Keywords Task analysis; Training; Computer vision; Visualization; Estimation; Head; Context modeling
Abstract We propose a novel crowd counting approach that leverages abundantly available unlabeled crowd imagery in a learning-to-rank framework. To induce a ranking of cropped images, we use the observation that any sub-image of a crowded scene image is guaranteed to contain the same number or fewer persons than the super-image. This allows us to address the problem of the limited size of existing datasets for crowd counting. We collect two crowd scene datasets from Google using keyword searches and query-by-example image retrieval, respectively. We demonstrate how to efficiently learn from these unlabeled datasets by incorporating learning-to-rank in a multi-task network which simultaneously ranks images and estimates crowd density maps. Experiments on two of the most challenging crowd counting datasets show that our approach obtains state-of-the-art results.
Address Salt Lake City; USA; June 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPR
Notes LAMP; 600.109; 600.106; 600.120 Approved no
Call Number Admin @ si @ LWB2018 Serial 3159
Permanent link to this record
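The containment constraint the abstract above exploits (a crop can never contain more people than the image it came from) translates directly into a ranking loss on predicted counts. A minimal sketch, assuming counts obtained by summing predicted density maps; names and the margin are illustrative.

    import torch
    import torch.nn.functional as F

    def count_ranking_loss(count_sub, count_super, margin=0.0):
        """Penalize predictions where a sub-image count exceeds its
        super-image count; zero loss whenever the ranking is respected."""
        return F.relu(count_sub - count_super + margin).mean()

    # counts would come from summing each crop's predicted density map
    count_super = torch.tensor([12.3, 40.1])
    count_sub = torch.tensor([14.0, 35.2])    # first pair violates the ranking
    print(count_ranking_loss(count_sub, count_super).item())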
 

 
Author Fernando Vilariño; Dimosthenis Karatzas; Alberto Valcarce
Title Libraries as New Innovation Hubs: The Library Living Lab Type Conference Article
Year 2018 Publication 30th ISPIM Innovation Conference Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Libraries are in deep transformation both in the EU and around the world, and they are thriving within a great window of opportunity for innovation. In this paper, we show how the Library Living Lab in Barcelona participated in this changing scenario and contributed to creating the Bibliolab program, where more than 200 public libraries give voice to their users in a global user-centric innovation initiative, using technology as an enabling factor. The Library Living Lab is a real 4-helix implementation where universities, research centers, public administration, companies and neighbors are joined together to explore how technology transforms the cultural experience of people. This case is an example of scalability and provides reference tools for policy making, sustainability, user engagement methodologies and governance. We provide specific examples of new prototypes and services that help to understand how to redefine the role of the library as a real hub for social innovation.
Address Stockholm; May 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ISPIM
Notes DAG; MV; 600.097; 600.121; 600.129;SIAI Approved no
Call Number Admin @ si @ VKV2018b Serial 3154
Permanent link to this record
 

 
Author Ozan Caglayan; Adrien Bardet; Fethi Bougares; Loic Barrault; Kai Wang; Marc Masana; Luis Herranz; Joost Van de Weijer
Title LIUM-CVC Submissions for WMT18 Multimodal Translation Task Type Conference Article
Year 2018 Publication 3rd Conference on Machine Translation Abbreviated Journal
Volume Issue Pages
Keywords
Abstract This paper describes the multimodal Neural Machine Translation systems developed by LIUM and CVC for the WMT18 Shared Task on Multimodal Translation. This year we propose several modifications to our previous multimodal attention architecture in order to better integrate convolutional features and refine them using encoder-side information. Our final constrained submissions ranked first for English→French and second for English→German among the constrained submissions, according to the automatic evaluation metric METEOR.
Address Brussels; Belgium; October 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference WMT
Notes LAMP; 600.106; 600.120 Approved no
Call Number Admin @ si @ CBB2018 Serial 3240
Permanent link to this record
 

 
Author Sergio Escalera; Jordi Gonzalez; Hugo Jair Escalante; Xavier Baro; Isabelle Guyon
Title Looking at People Special Issue Type Journal Article
Year 2018 Publication International Journal of Computer Vision Abbreviated Journal IJCV
Volume 126 Issue 2-4 Pages 141-143
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA; ISE; 600.119 Approved no
Call Number Admin @ si @ EGJ2018 Serial 3093
Permanent link to this record
 

 
Author Md. Mostafa Kamal Sarker; Hatem A. Rashwan; Estefania Talavera; Syeda Furruka Banu; Petia Radeva; Domenec Puig
Title MACNet: Multi-scale Atrous Convolution Networks for Food Places Classification in Egocentric Photo-streams Type Conference Article
Year 2018 Publication European Conference on Computer Vision workshops Abbreviated Journal
Volume Issue Pages 423-433
Keywords
Abstract A first-person (wearable) camera continually captures unscripted interactions of the camera user with objects, people and scenes, reflecting the user's personal and relational tendencies. One such preference is people's interaction with food events. Regulating food intake and its duration is of great importance for protecting against disease. Consequently, this work aims to develop a smart model that is able to determine the recurrence of a person's visits to food places during a day. The model is based on a deep end-to-end architecture for automatic recognition of food places by analyzing egocentric photo-streams. In this paper, we apply multi-scale atrous convolution networks to extract the key features related to food places from the input images. The proposed model is evaluated on an in-house private dataset called "EgoFoodPlaces". Experimental results show promising performance for food place classification in egocentric photo-streams.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCVW
Notes MILAB; not mentioned Approved no
Call Number Admin @ si @ SRR2018b Serial 3185
Permanent link to this record
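The multi-scale atrous convolutions named in the abstract above amount to parallel dilated convolutions at several rates whose outputs are concatenated. A minimal PyTorch sketch; the rates and channel sizes are illustrative, not MACNet's actual configuration.

    import torch
    import torch.nn as nn

    class MultiScaleAtrous(nn.Module):
        """Parallel atrous (dilated) 3x3 convolutions, outputs concatenated;
        padding equal to the rate keeps the spatial size unchanged."""
        def __init__(self, in_ch, out_ch, rates=(1, 2, 4, 8)):
            super().__init__()
            self.branches = nn.ModuleList(
                nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates)

        def forward(self, x):
            return torch.cat([b(x) for b in self.branches], dim=1)

    feats = MultiScaleAtrous(64, 32)(torch.randn(1, 64, 56, 56))
    print(feats.shape)   # torch.Size([1, 128, 56, 56])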
 

 
Author David Aldavert; Marçal Rusiñol
Title Manuscript text line detection and segmentation using second-order derivatives analysis Type Conference Article
Year 2018 Publication 13th IAPR International Workshop on Document Analysis Systems Abbreviated Journal
Volume Issue Pages 293 - 298
Keywords text line detection; text line segmentation; text region detection; second-order derivatives
Abstract In this paper, we explore the use of second-order derivatives to detect text lines in handwritten document images. Taking advantage of the fact that the second derivative gives a minimum response when a dark linear element over a bright background has the same orientation as the filter, we use this operator to create a map with the local orientation and strength of putative text lines in the document. Then, we detect line segments by selecting and merging the filter responses that have a similar orientation and scale. Finally, text lines are found by merging the segments that are within the same text region. The proposed segmentation algorithm is learning-free while showing performance similar to state-of-the-art methods on publicly available datasets.
Address Vienna; Austria; April 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference DAS
Notes DAG; 600.084; 600.129; 302.065; 600.121 Approved no
Call Number Admin @ si @ AlR2018a Serial 3104
Permanent link to this record
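The core filter response of the method above is easy to reproduce: a Gaussian second derivative taken across the writing direction responds strongly at dark, roughly horizontal strokes on a bright background. A minimal sketch with SciPy; the sigma, sign convention and toy image are illustrative assumptions, and a full version would sweep orientations and scales as the abstract describes.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def line_response(img, sigma):
        """Second derivative across rows (order=(2, 0)): a dark horizontal
        line on a bright page gives a positive response at its centre."""
        d_yy = gaussian_filter(img.astype(float), sigma, order=(2, 0))
        return np.maximum(d_yy, 0)

    page = np.full((64, 64), 255.0)      # bright page
    page[30:34, 8:56] = 0.0              # one dark "text line"
    resp = line_response(page, sigma=3)
    print(np.unravel_index(resp.argmax(), resp.shape))  # peaks inside the line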