Publicacions CVC -- Query Results

	76–90 of 155 records found matching your query:

	Records
	Author	Zhijie Fang; Antonio Lopez
	Title	Is the Pedestrian going to Cross? Answering by 2D Pose Estimation			Type	Conference Article
	Year	2018	Publication	IEEE Intelligent Vehicles Symposium	Abbreviated Journal
	Volume		Issue		Pages	1271 - 1276
	Keywords
	Abstract	Our recent work suggests that, thanks to today's powerful CNNs, image-based 2D pose estimation is a promising cue for determining pedestrian intentions such as crossing the road in the path of the ego-vehicle, stopping before entering the road, and starting to walk or bending towards the road. This statement is based on the results obtained on non-naturalistic sequences (Daimler dataset), i.e. in sequences choreographed specifically for performing the study. Fortunately, a new publicly available dataset (JAAD) has appeared recently to allow developing methods for detecting pedestrian intentions in naturalistic driving conditions; more specifically, for addressing the relevant question "is the pedestrian going to cross?" Accordingly, in this paper we use JAAD to assess the usefulness of 2D pose estimation for answering such a question. We combine CNN-based pedestrian detection, tracking and pose estimation to predict the crossing action from monocular images. Overall, the proposed pipeline provides new state-of-the-art results.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	IV
	Notes	ADAS; 600.124; 600.116; 600.118			Approved	no
	Call Number	Admin @ si @ FaL2018			Serial	3181



	Author	Manuel Carbonell; Mauricio Villegas; Alicia Fornes; Josep Llados
	Title	Joint Recognition of Handwritten Text and Named Entities with a Neural End-to-end Model			Type	Conference Article
	Year	2018	Publication	13th IAPR International Workshop on Document Analysis Systems	Abbreviated Journal
	Volume		Issue		Pages	399-404
	Keywords	Named entity recognition; Handwritten Text Recognition; neural networks
	Abstract	When extracting information from handwritten documents, text transcription and named entity recognition are usually treated as separate, sequential tasks. This has the disadvantage that errors in the first module heavily affect the performance of the second module. In this work we propose to do both tasks jointly, using a single neural network with a common architecture used for plain text recognition. Experimentally, the work has been tested on a collection of historical marriage records. Results of experiments are presented to show the effect on the performance for different configurations: different ways of encoding the information, with or without transfer learning, and processing at text line or multi-line region level. The results are comparable to the state of the art reported in the ICDAR 2017 Information Extraction competition, even though the proposed technique does not use any dictionaries, language modeling or post processing.
	Address	Vienna; Austria; April 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	DAS
	Notes	DAG; 600.097; 603.057; 601.311; 600.121			Approved	no
	Call Number	Admin @ si @ CVF2018			Serial	3170



	Author	Sounak Dey; Anjan Dutta; Suman Ghosh; Ernest Valveny; Josep Llados; Umapada Pal
	Title	Learning Cross-Modal Deep Embeddings for Multi-Object Image Retrieval using Text and Sketch			Type	Conference Article
	Year	2018	Publication	24th International Conference on Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages	916 - 921
	Keywords
	Abstract	In this work we introduce a cross-modal image retrieval system that allows both text and sketch as input modalities for the query. A cross-modal deep network architecture is formulated to jointly model the sketch and text input modalities as well as the image output modality, learning a common embedding between text and images and between sketches and images. In addition, an attention model is used to selectively focus the attention on the different objects of the image, allowing for retrieval with multiple objects in the query. Experiments show that the proposed method performs the best in both single and multiple object image retrieval in standard datasets.
	Address	Beijing; China; August 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICPR
	Notes	DAG; 602.167; 602.168; 600.097; 600.084; 600.121; 600.129			Approved	no
	Call Number	Admin @ si @ DDG2018b			Serial	3152



	Author	Raul Gomez; Lluis Gomez; Jaume Gibert; Dimosthenis Karatzas
	Title	Learning from #Barcelona Instagram data what Locals and Tourists post about its Neighbourhoods			Type	Conference Article
	Year	2018	Publication	15th European Conference on Computer Vision Workshops	Abbreviated Journal
	Volume	11134	Issue		Pages	530-544
	Keywords
	Abstract	Massive tourism is becoming a big problem for some cities, such as Barcelona, due to its concentration in some neighborhoods. In this work we gather Instagram data related to Barcelona consisting of image-caption pairs and, using the text as a supervisory signal, we learn relations between images, words and neighborhoods. Our goal is to learn which visual elements appear in photos when people are posting about each neighborhood. We treat the data separately by language and show that this can be extrapolated to a separate analysis of tourists and locals, and that tourism is reflected in Social Media at a neighborhood level. The presented pipeline allows analyzing the differences between the images that tourists and locals associate with the different neighborhoods. The proposed method, which can be extended to other cities or subjects, proves that Instagram data can be used to train multi-modal (image and text) machine learning models that are useful to analyze publications about a city at a neighborhood level. We publish the collected dataset, InstaBarcelona, and the code used in the analysis.
	Address	Munich; Germany; September 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ECCVW
	Notes	DAG; 600.129; 601.338; 600.121			Approved	no
	Call Number	Admin @ si @ GGG2018b			Serial	3176



	Author	Pau Riba; Andreas Fischer; Josep Llados; Alicia Fornes
	Title	Learning Graph Distances with Message Passing Neural Networks			Type	Conference Article
	Year	2018	Publication	24th International Conference on Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages	2239-2244
	Keywords	★Best Paper Award★
	Abstract	Graph representations have been widely used in pattern recognition thanks to their powerful representation formalism and rich theoretical background. A number of error-tolerant graph matching algorithms such as graph edit distance have been proposed for computing a distance between two labelled graphs. However, they typically suffer from a high computational complexity, which makes it difficult to apply these matching algorithms in a real scenario. In this paper, we propose an efficient graph distance based on the emerging field of geometric deep learning. Our method employs a message passing neural network to capture the graph structure and learns a metric with a siamese network approach. The performance of the proposed graph distance is validated in two application cases, graph classification and graph retrieval of handwritten words, and shows a promising performance when compared with (approximate) graph edit distance benchmarks.
	Address	Beijing; China; August 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICPR
	Notes	DAG; 600.097; 603.057; 601.302; 600.121			Approved	no
	Call Number	Admin @ si @ RFL2018			Serial	3168



	Author	Marco Buzzelli; Joost Van de Weijer; Raimondo Schettini
	Title	Learning Illuminant Estimation from Object Recognition			Type	Conference Article
	Year	2018	Publication	25th International Conference on Image Processing	Abbreviated Journal
	Volume		Issue		Pages	3234 - 3238
	Keywords	Illuminant estimation; computational color constancy; semi-supervised learning; deep learning; convolutional neural networks
	Abstract	In this paper we present a deep learning method to estimate the illuminant of an image. Our model is not trained with illuminant annotations, but with the objective of improving performance on an auxiliary task such as object recognition. To the best of our knowledge, this is the first example of a deep learning architecture for illuminant estimation that is trained without ground truth illuminants. We evaluate our solution on standard datasets for color constancy, and compare it with state of the art methods. Our proposal is shown to outperform most deep learning methods in a cross-dataset evaluation setup, and to present competitive results in a comparison with parametric solutions.
	Address	Athens; Greece; October 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICIP
	Notes	LAMP; 600.109; 600.120			Approved	no
	Call Number	Admin @ si @ BWS2018			Serial	3157



	Author	I. Sorodoc; S. Pezzelle; A. Herbelot; Mariella Dimiccoli; R. Bernardi
	Title	Learning quantification from images: A structured neural architecture			Type	Journal Article
	Year	2018	Publication	Natural Language Engineering	Abbreviated Journal	NLE
	Volume	24	Issue	3	Pages	363-392
	Keywords
	Abstract	Major advances have recently been made in merging language and vision representations. Most tasks considered so far have confined themselves to the processing of objects and lexicalised relations amongst objects (content words). We know, however, that humans (even pre-school children) can abstract over raw multimodal data to perform certain types of higher level reasoning, expressed in natural language by function words. A case in point is given by their ability to learn quantifiers, i.e. expressions like few, some and all. From formal semantics and cognitive linguistics, we know that quantifiers are relations over sets which, as a simplification, we can see as proportions. For instance, in most fish are red, most encodes the proportion of fish which are red fish. In this paper, we study how well current neural network strategies model such relations. We propose a task where, given an image and a query expressed by an object–property pair, the system must return a quantifier expressing which proportions of the queried object have the queried property. Our contributions are twofold. First, we show that the best performance on this task involves coupling state-of-the-art attention mechanisms with a network architecture mirroring the logical structure assigned to quantifiers by classic linguistic formalisation. Second, we introduce a new balanced dataset of image scenarios associated with quantification queries, which we hope will foster further research in this area.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	MILAB; not mentioned			Approved	no
	Call Number	Admin @ si @ SPH2018			Serial	3021



	Author	Raul Gomez; Lluis Gomez; Jaume Gibert; Dimosthenis Karatzas
	Title	Learning to Learn from Web Data through Deep Semantic Embeddings			Type	Conference Article
	Year	2018	Publication	15th European Conference on Computer Vision Workshops	Abbreviated Journal
	Volume	11134	Issue		Pages	514-529
	Keywords
	Abstract	In this paper we propose to learn a multimodal image and text embedding from Web and Social Media data, aiming to leverage the semantic knowledge learnt in the text domain and transfer it to a visual model for semantic image retrieval. We demonstrate that the pipeline can learn from images with associated text without supervision and perform a thorough analysis of five different text embeddings in three different benchmarks. We show that the embeddings learnt with Web and Social Media data are competitive with supervised methods in the text-based image retrieval task, and we clearly outperform the state of the art in the MIRFlickr dataset when training in the target data. Further, we demonstrate how semantic multimodal image retrieval can be performed using the learnt embeddings, going beyond classical instance-level retrieval problems. Finally, we present a new dataset, InstaCities1M, composed of Instagram images and their associated texts that can be used for fair comparison of image-text embeddings.
	Address	Munich; Germany; September 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ECCVW
	Notes	DAG; 600.129; 601.338; 600.121			Approved	no
	Call Number	Admin @ si @ GGG2018a			Serial	3175



	Author	Joan Serrat; Felipe Lumbreras; Idoia Ruiz
	Title	Learning to measure for preshipment garment sizing			Type	Journal Article
	Year	2018	Publication	Measurement	Abbreviated Journal	MEASURE
	Volume	130	Issue		Pages	327-339
	Keywords	Apparel; Computer vision; Structured prediction; Regression
	Abstract	Clothing is still manually manufactured for the most part nowadays, resulting in discrepancies between nominal and real dimensions, and potentially ill-fitting garments. Hence, it is common in the apparel industry to manually perform measures at preshipment time. We present an automatic method to obtain such measures from a single image of a garment that speeds up this task. It is generic and extensible in the sense that it does not depend explicitly on the garment shape or type. Instead, it learns through a probabilistic graphical model to identify the different contour parts. Subsequently, a set of Lasso regressors, one per desired measure, can predict the actual values of the measures. We present results on a dataset of 130 images of jackets and 98 of pants, of varying sizes and styles, obtaining 1.17 and 1.22 cm of mean absolute error, respectively.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	ADAS; MSIAU; 600.122; 600.118			Approved	no
	Call Number	Admin @ si @ SLR2018			Serial	3128



	Author	Xialei Liu; Joost Van de Weijer; Andrew Bagdanov
	Title	Leveraging Unlabeled Data for Crowd Counting by Learning to Rank			Type	Conference Article
	Year	2018	Publication	31st IEEE Conference on Computer Vision and Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages	7661 - 7669
	Keywords	Task analysis; Training; Computer vision; Visualization; Estimation; Head; Context modeling
	Abstract	We propose a novel crowd counting approach that leverages abundantly available unlabeled crowd imagery in a learning-to-rank framework. To induce a ranking of cropped images, we use the observation that any sub-image of a crowded scene image is guaranteed to contain the same number or fewer persons than the super-image. This allows us to address the problem of limited size of existing datasets for crowd counting. We collect two crowd scene datasets from Google using keyword searches and query-by-example image retrieval, respectively. We demonstrate how to efficiently learn from these unlabeled datasets by incorporating learning-to-rank in a multi-task network which simultaneously ranks images and estimates crowd density maps. Experiments on two of the most challenging crowd counting datasets show that our approach obtains state-of-the-art results.
	Address	Salt Lake City; USA; June 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPR
	Notes	LAMP; 600.109; 600.106; 600.120			Approved	no
	Call Number	Admin @ si @ LWB2018			Serial	3159



	Author	Fernando Vilariño; Dimosthenis Karatzas; Alberto Valcarce
	Title	Libraries as New Innovation Hubs: The Library Living Lab			Type	Conference Article
	Year	2018	Publication	30th ISPIM Innovation Conference	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Libraries are in deep transformation both in the EU and around the world, and they are thriving within a great window of opportunity for innovation. In this paper, we show how the Library Living Lab in Barcelona participated in this changing scenario and contributed to creating the Bibliolab program, where more than 200 public libraries give voice to their users in a global user-centric innovation initiative, using technology as an enabling factor. The Library Living Lab is a real 4-helix implementation where Universities, Research Centers, Public Administration, Companies and the Neighbors are joined together to explore how technology transforms the cultural experience of people. This case is an example of scalability and provides reference tools for policy making, sustainability, user engagement methodologies and governance. We provide specific examples of new prototypes and services that help to understand how to redefine the role of the Library as a real hub for social innovation.
	Address	Stockholm; May 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ISPIM
	Notes	DAG; MV; 600.097; 600.121; 600.129; SIAI			Approved	no
	Call Number	Admin @ si @ VKV2018b			Serial	3154



	Author	Ozan Caglayan; Adrien Bardet; Fethi Bougares; Loic Barrault; Kai Wang; Marc Masana; Luis Herranz; Joost Van de Weijer
	Title	LIUM-CVC Submissions for WMT18 Multimodal Translation Task			Type	Conference Article
	Year	2018	Publication	3rd Conference on Machine Translation	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	This paper describes the multimodal Neural Machine Translation systems developed by LIUM and CVC for the WMT18 Shared Task on Multimodal Translation. This year we propose several modifications to our previous multimodal attention architecture in order to better integrate convolutional features and refine them using encoder-side information. Our final constrained submissions ranked first for English→French and second for English→German language pairs among the constrained submissions according to the automatic evaluation metric METEOR.
	Address	Brussels; Belgium; October 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WMT
	Notes	LAMP; 600.106; 600.120			Approved	no
	Call Number	Admin @ si @ CBB2018			Serial	3240



	Author	Sergio Escalera; Jordi Gonzalez; Hugo Jair Escalante; Xavier Baro; Isabelle Guyon
	Title	Looking at People Special Issue			Type	Journal Article
	Year	2018	Publication	International Journal of Computer Vision	Abbreviated Journal	IJCV
	Volume	126	Issue	2-4	Pages	141-143
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HUPBA; ISE; 600.119			Approved	no
	Call Number	Admin @ si @ EGJ2018			Serial	3093



	Author	Md. Mostafa Kamal Sarker; Hatem A. Rashwan; Estefania Talavera; Syeda Furruka Banu; Petia Radeva; Domenec Puig
	Title	MACNet: Multi-scale Atrous Convolution Networks for Food Places Classification in Egocentric Photo-streams			Type	Conference Article
	Year	2018	Publication	European Conference on Computer Vision workshops	Abbreviated Journal
	Volume		Issue		Pages	423-433
	Keywords
	Abstract	A first-person (wearable) camera continually captures unscripted interactions of the camera user with objects, people, and scenes, reflecting his personal and relational tendencies. One of the preferences of people is their interaction with food events. The regulation of food intake and its duration is of great importance in protecting against diseases. Consequently, this work aims to develop a smart model that is able to determine the recurrences of a person in food places during a day. This model is based on a deep end-to-end model for automatic food places recognition by analyzing egocentric photo-streams. In this paper, we apply multi-scale Atrous convolution networks to extract the key features related to food places of the input images. The proposed model is evaluated on an in-house private dataset called “EgoFoodPlaces”. Experimental results show promising performance for food places classification in egocentric photo-streams.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ECCVW
	Notes	MILAB; not mentioned			Approved	no
	Call Number	Admin @ si @ SRR2018b			Serial	3185



	Author	David Aldavert; Marçal Rusiñol
	Title	Manuscript text line detection and segmentation using second-order derivatives analysis			Type	Conference Article
	Year	2018	Publication	13th IAPR International Workshop on Document Analysis Systems	Abbreviated Journal
	Volume		Issue		Pages	293 - 298
	Keywords	text line detection; text line segmentation; text region detection; second-order derivatives
	Abstract	In this paper, we explore the use of second-order derivatives to detect text lines on handwritten document images. Taking advantage of the fact that the second derivative gives a minimum response when a dark linear element over a bright background has the same orientation as the filter, we use this operator to create a map with the local orientation and strength of putative text lines in the document. Then, we detect line segments by selecting and merging the filter responses that have a similar orientation and scale. Finally, text lines are found by merging the segments that are within the same text region. The proposed segmentation algorithm is learning-free while showing a performance similar to the state-of-the-art methods in publicly available datasets.
	Address	Vienna; Austria; April 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	DAS
	Notes	DAG; 600.084; 600.129; 302.065; 600.121			Approved	no
	Call Number	Admin @ si @ AlR2018a			Serial	3104