Publicacions CVC -- Query Results

	2326–2340 of 3413 records found matching your query:

	Records
	Author	David Vazquez; Jiaolong Xu; Sebastian Ramos; Antonio Lopez; Daniel Ponsa
	Title	Weakly Supervised Automatic Annotation of Pedestrian Bounding Boxes			Type	Conference Article
	Year	2013	Publication	CVPR Workshop on Ground Truth – What is a good dataset?	Abbreviated Journal
	Volume		Issue		Pages	706 - 711
	Keywords	Pedestrian Detection; Domain Adaptation
	Abstract	Among the components of a pedestrian detector, its trained pedestrian classifier is crucial for achieving the desired performance. The initial task of the training process consists of collecting samples of pedestrians and background, which involves tiresome manual annotation of pedestrian bounding boxes (BBs). Thus, recent works have assessed the use of automatically collected samples from photo-realistic virtual worlds. However, learning from virtual-world samples and testing in real-world images may suffer from the dataset shift problem. Accordingly, in this paper we assess a strategy to collect samples from the real world and retrain with them, thus avoiding the dataset shift, but in such a way that no BBs of real-world pedestrians have to be provided. In particular, we train a pedestrian classifier based on virtual-world samples (no human annotation required). Then, using such a classifier, we collect pedestrian samples from real-world images by detection. Afterwards, a human oracle efficiently rejects the false detections (weak annotation). Finally, a new classifier is trained with the accepted detections. We show that this classifier is competitive with respect to the counterpart trained with samples collected by manually annotating hundreds of pedestrian BBs.
	Address	Portland; Oregon; June 2013
	Corporate Author				Thesis
	Publisher	IEEE	Place of Publication		Editor
	Language	English	Summary Language	English	Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPRW
	Notes	ADAS; 600.054; 600.057; 601.217			Approved	no
	Call Number	ADAS @ adas @ VXR2013a			Serial	2219



	Author	Jiaolong Xu; David Vazquez; Sebastian Ramos; Antonio Lopez; Daniel Ponsa
	Title	Adapting a Pedestrian Detector by Boosting LDA Exemplar Classifiers			Type	Conference Article
	Year	2013	Publication	CVPR Workshop on Ground Truth – What is a good dataset?	Abbreviated Journal
	Volume		Issue		Pages	688 - 693
	Keywords	Pedestrian Detection; Domain Adaptation
	Abstract	Training vision-based pedestrian detectors using synthetic datasets (virtual world) is a useful technique to automatically collect training examples with their pixel-wise ground truth. However, as is often the case, these detectors must operate in real-world images, where they experience a significant drop in performance. In fact, this effect also occurs among different real-world datasets, i.e., detectors' accuracy drops when the training data (source domain) and the application scenario (target domain) have inherent differences. Therefore, in order to avoid this problem, the detector trained with synthetic data must be adapted to operate in the real-world scenario. In this paper, we propose a domain adaptation approach based on boosting LDA exemplar classifiers from both virtual and real worlds. We evaluate our proposal on multiple real-world pedestrian detection datasets. The results show that our method can efficiently adapt the exemplar classifiers from virtual to real world, avoiding drops in average precision of over 15%.
	Address	Portland; Oregon; June 2013
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language	English	Summary Language	English	Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPRW
	Notes	ADAS; 600.054; 600.057; 601.217			Approved	yes
	Call Number	XVR2013; ADAS @ adas @ xvr2013a			Serial	2220



	Author	Carlo Gatta; Adriana Romero; Joost Van de Weijer
	Title	Unrolling loopy top-down semantic feedback in convolutional deep networks			Type	Conference Article
	Year	2014	Publication	Workshop on Deep Vision: Deep Learning for Computer Vision	Abbreviated Journal
	Volume		Issue		Pages	498-505
	Keywords
	Abstract	In this paper, we propose a novel way to perform top-down semantic feedback in convolutional deep networks for efficient and accurate image parsing. We also show how to add global appearance/semantic features, which have been shown to improve image parsing performance in state-of-the-art methods and were not present in previous convolutional approaches. The proposed method is characterised by efficient training and sufficiently fast testing. We use the well-known SIFTflow dataset to show numerically the advantages provided by our contributions, and to compare with state-of-the-art convolutional image parsing approaches.
	Address	Columbus; Ohio; June 2014
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPRW
	Notes	LAMP; MILAB; 601.160; 600.079			Approved	no
	Call Number	Admin @ si @ GRW2014			Serial	2490



	Author	Bogdan Raducanu; Alireza Bosaghzadeh; Fadi Dornaika
	Title	Multi-observation Face Recognition in Videos based on Label Propagation			Type	Conference Article
	Year	2015	Publication	6th Workshop on Analysis and Modeling of Faces and Gestures AMFG2015	Abbreviated Journal
	Volume		Issue		Pages	10-17
	Keywords
	Abstract	In order to deal with the huge amount of content generated by social media, especially for indexing and retrieval purposes, the focus has shifted from single object recognition to multi-observation object recognition. Of particular interest is the problem of face recognition (used as the primary cue for assessing a person's identity), since it is in high demand on popular social media platforms like Facebook and YouTube. Recently, several approaches for graph-based label propagation were proposed. However, the associated graphs were constructed in an ad-hoc manner (e.g., using the KNN graph) that cannot cope properly with the rapid and frequent changes in data appearance, a phenomenon intrinsically related to video sequences. In this paper, we propose a novel approach for efficient and adaptive graph construction, based on a two-phase scheme: (i) the first phase is used to adaptively find the neighbors of a sample and also to find the adequate weights for the minimization function of the second phase; (ii) in the second phase, the selected neighbors along with their corresponding weights are used to locally and collaboratively estimate the sparse affinity matrix weights. Experimental results on the Honda Video Database (HVDB) and a subset of video sequences extracted from the popular TV series ’Friends’ show a distinct advantage of the proposed method over the existing standard graph construction methods.
	Address	Boston; USA; June 2015
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPRW
	Notes	LAMP; 600.068; 600.072;			Approved	no
	Call Number	Admin @ si @ RBD2015			Serial	2627



	Author	Santiago Segui; Oriol Pujol; Jordi Vitria
	Title	Learning to count with deep object features			Type	Conference Article
	Year	2015	Publication	Deep Vision: Deep Learning in Computer Vision, CVPR 2015 Workshop	Abbreviated Journal
	Volume		Issue		Pages	90-96
	Keywords
	Abstract	Learning to count is a learning strategy that has been recently proposed in the literature for dealing with problems where estimating the number of object instances in a scene is the final objective. In this framework, the task of learning to detect and localize individual object instances is seen as a harder task that can be evaded by casting the problem as that of computing a regression value from hand-crafted image features. In this paper we explore the features that are learned when training a counting convolutional neural network in order to understand their underlying representation. To this end we define a counting problem for MNIST data and show that the internal representation of the network is able to classify digits in spite of the fact that no direct supervision was provided for them during training. We also present preliminary results about a deep network that is able to count the number of pedestrians in a scene.
	Address	Boston; USA; June 2015
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPRW
	Notes	MILAB; HuPBA; OR;MV			Approved	no
	Call Number	Admin @ si @ SPV2015			Serial	2636



	Author	Xavier Baro; Jordi Gonzalez; Junior Fabian; Miguel Angel Bautista; Marc Oliu; Hugo Jair Escalante; Isabelle Guyon; Sergio Escalera
	Title	ChaLearn Looking at People 2015 challenges: action spotting and cultural event recognition			Type	Conference Article
	Year	2015	Publication	2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)	Abbreviated Journal
	Volume		Issue		Pages	1-9
	Keywords
	Abstract	Following previous series on Looking at People (LAP) challenges [6, 5, 4], ChaLearn ran two competitions to be presented at CVPR 2015: action/interaction spotting and cultural event recognition in RGB data. We ran a second round on human activity recognition on RGB data sequences. In terms of cultural event recognition, tens of categories have to be recognized. This involves scene understanding and human analysis. This paper summarizes the two challenges and the results obtained. Details of the ChaLearn LAP competitions can be found at http://gesture.chalearn.org/.
	Address	Boston; USA; June 2015
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPRW
	Notes	HuPBA;MV			Approved	no
	Call Number				Serial	2652



	Author	Andres Traumann; Sergio Escalera; Gholamreza Anbarjafari
	Title	A New Retexturing Method for Virtual Fitting Room Using Kinect 2 Camera			Type	Conference Article
	Year	2015	Publication	2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)	Abbreviated Journal
	Volume		Issue		Pages	75-79
	Keywords
	Abstract
	Address	Boston; USA; June 2015
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPRW
	Notes	HuPBA;MILAB			Approved	no
	Call Number	Admin @ si @ TEA2015			Serial	2653



	Author	Ramin Irani; Kamal Nasrollahi; Chris Bahnsen; D.H. Lundtoft; Thomas B. Moeslund; Marc O. Simon; Ciprian Corneanu; Sergio Escalera; Tanja L. Pedersen; Maria-Louise Klitgaard; Laura Petrini
	Title	Spatio-temporal Analysis of RGB-D-T Facial Images for Multimodal Pain Level Recognition			Type	Conference Article
	Year	2015	Publication	2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)	Abbreviated Journal
	Volume		Issue		Pages	88-95
	Keywords
	Abstract	Pain is a vital sign of human health and its automatic detection can be of crucial importance in many different contexts, including medical scenarios. While most available computer vision techniques are based on RGB, in this paper we investigate the effect of combining RGB, depth, and thermal facial images for pain detection and pain intensity level recognition. For this purpose, we extract energies released by facial pixels using a spatiotemporal filter. Experiments applying the multimodal approach to a group of 12 elderly people show that the proposed method successfully detects pain and distinguishes between three intensity levels in 82% of the analyzed frames, improving by more than 6% over RGB-only analysis in similar conditions.
	Address	Boston; USA; June 2015
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPRW
	Notes	HuPBA;MILAB			Approved	no
	Call Number	Admin @ si @ INB2015			Serial	2654



	Author	Mohammad Ali Bagheri; Qigang Gao; Sergio Escalera; Albert Clapes; Kamal Nasrollahi; Michael Holte; Thomas B. Moeslund
	Title	Keep it Accurate and Diverse: Enhancing Action Recognition Performance by Ensemble Learning			Type	Conference Article
	Year	2015	Publication	IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)	Abbreviated Journal
	Volume		Issue		Pages	22-29
	Keywords
	Abstract	The performance of different action recognition techniques has recently been studied by several computer vision researchers. However, the potential improvement in classification through classifier fusion by ensemble-based methods has remained unattended. In this work, we evaluate the performance of an ensemble of action learning techniques, each performing the recognition task from a different perspective. The underlying idea is that instead of aiming for a very sophisticated and powerful representation/learning technique, we can learn action categories using a set of relatively simple and diverse classifiers, each trained with a different feature set. In addition, combining the outputs of several learners can reduce the risk of an unfortunate selection of a learner on an unseen action recognition scenario. This leads to a more robust and generally applicable framework. In order to improve the recognition performance, a powerful combination strategy based on the Dempster-Shafer theory is utilized, which can effectively make use of the diversity of base learners trained on different sources of information. The recognition results of the individual classifiers are compared with those obtained from fusing the classifiers’ output, showing enhanced performance of the proposed methodology.
	Address	Boston; USA; June 2015
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPRW
	Notes	HuPBA;MILAB			Approved	no
	Call Number	Admin @ si @ BGE2015			Serial	2655



	Author	Jun Wan; Yibing Zhao; Shuai Zhou; Isabelle Guyon; Sergio Escalera
	Title	ChaLearn Looking at People RGB-D Isolated and Continuous Datasets for Gesture Recognition			Type	Conference Article
	Year	2016	Publication	29th IEEE Conference on Computer Vision and Pattern Recognition Workshops	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	In this paper, we present two large video multi-modal datasets for RGB and RGB-D gesture recognition: the ChaLearn LAP RGB-D Isolated Gesture Dataset (IsoGD) and the Continuous Gesture Dataset (ConGD). Both datasets are derived from the ChaLearn Gesture Dataset (CGD) that has a total of more than 50000 gestures for the “one-shot-learning” competition. To increase the potential of the old dataset, we designed new well-curated datasets composed of 249 gesture labels and including 47933 gestures with manually labeled begin and end frames in sequences. Using these datasets we will open two competitions on the CodaLab platform so that researchers can test and compare their methods for “user independent” gesture recognition. The first challenge is designed for gesture spotting and recognition in continuous sequences of gestures while the second one is designed for gesture classification from segmented data. A baseline method based on the bag of visual words model is also presented.
	Address	Las Vegas; USA; July 2016
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPRW
	Notes	HuPBA;MILAB;			Approved	no
	Call Number	Admin @ si @ WZZ2016			Serial	2771



	Author	Cristhian A. Aguilera-Carrasco; F. Aguilera; Angel Sappa; C. Aguilera; Ricardo Toledo
	Title	Learning cross-spectral similarity measures with deep convolutional neural networks			Type	Conference Article
	Year	2016	Publication	29th IEEE Conference on Computer Vision and Pattern Recognition Workshops	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	The simultaneous use of images from different spectra can be helpful to improve the performance of many computer vision tasks. The core idea behind the usage of cross-spectral approaches is to take advantage of the strengths of each spectral band providing a richer representation of a scene, which cannot be obtained with just images from one spectral band. In this work we tackle the cross-spectral image similarity problem by using Convolutional Neural Networks (CNNs). We explore three different CNN architectures to compare the similarity of cross-spectral image patches. Specifically, we train each network with images from the visible and the near-infrared spectrum, and then test the result with two public cross-spectral datasets. Experimental results show that CNN approaches outperform the current state of the art on both cross-spectral datasets. Additionally, our experiments show that some CNN architectures are capable of generalizing between different cross-spectral domains.
	Address	Las Vegas; USA; June 2016
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPRW
	Notes	ADAS; 600.086; 600.076			Approved	no
	Call Number	Admin @ si @AAS2016			Serial	2809



	Author	Sergio Escalera; Mercedes Torres-Torres; Brais Martinez; Xavier Baro; Hugo Jair Escalante; Isabelle Guyon; Georgios Tzimiropoulos; Ciprian Corneanu; Marc Oliu Simón; Mohammad Ali Bagheri; Michel Valstar
	Title	ChaLearn Looking at People and Faces of the World: Face Analysis Workshop and Challenge 2016			Type	Conference Article
	Year	2016	Publication	29th IEEE Conference on Computer Vision and Pattern Recognition Workshops	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	We present the 2016 ChaLearn Looking at People and Faces of the World Challenge and Workshop, which ran three competitions on the common theme of face analysis from still images. The first one, Looking at People, addressed age estimation, while the second and third competitions, Faces of the World, addressed accessory classification and smile and gender classification, respectively. We present two crowd-sourcing methodologies used to collect manual annotations. A custom-built application was used to collect and label data about the apparent age of people (as opposed to the real age). For the Faces of the World data, the citizen-science Zooniverse platform was used. This paper summarizes the three challenges and the data used, as well as the results achieved by the participants of the competitions. Details of the ChaLearn LAP FotW competitions can be found at http://gesture.chalearn.org.
	Address	Las Vegas; USA; June 2016
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPRW
	Notes	HuPBA;MV;			Approved	no
	Call Number	ETM2016			Serial	2849



	Author	Simon Jégou; Michal Drozdzal; David Vazquez; Adriana Romero; Yoshua Bengio
	Title	The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation			Type	Conference Article
	Year	2017	Publication	IEEE Conference on Computer Vision and Pattern Recognition Workshops	Abbreviated Journal
	Volume		Issue		Pages
	Keywords	Semantic Segmentation
	Abstract	State-of-the-art approaches for semantic image segmentation are built on Convolutional Neural Networks (CNNs). The typical segmentation architecture is composed of (a) a downsampling path responsible for extracting coarse semantic features, followed by (b) an upsampling path trained to recover the input image resolution at the output of the model and, optionally, (c) a post-processing module (e.g. Conditional Random Fields) to refine the model predictions. Recently, a new CNN architecture, Densely Connected Convolutional Networks (DenseNets), has shown excellent results on image classification tasks. The idea of DenseNets is based on the observation that if each layer is directly connected to every other layer in a feed-forward fashion then the network will be more accurate and easier to train. In this paper, we extend DenseNets to deal with the problem of semantic segmentation. We achieve state-of-the-art results on urban scene benchmark datasets such as CamVid and Gatech, without any further post-processing module or pretraining. Moreover, due to the smart construction of the model, our approach has far fewer parameters than the currently published best entries for these datasets.
	Address	Honolulu; USA; July 2017
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPRW
	Notes	MILAB; ADAS; 600.076; 600.085; 601.281			Approved	no
	Call Number	ADAS @ adas @ JDV2016			Serial	2866



	Author	Patricia Suarez; Angel Sappa; Boris X. Vintimilla
	Title	Infrared Image Colorization based on a Triplet DCGAN Architecture			Type	Conference Article
	Year	2017	Publication	IEEE Conference on Computer Vision and Pattern Recognition Workshops	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	This paper proposes a novel approach for colorizing near infrared (NIR) images using Deep Convolutional Generative Adversarial Network (GAN) architectures. The proposed approach is based on the usage of a triplet model for learning each color channel independently, in a more homogeneous way. It allows a fast convergence during the training, obtaining a greater similarity between the given NIR image and the corresponding ground truth. The proposed approach has been evaluated with a large data set of NIR images and compared with a recent approach, which is also based on a GAN architecture but in this case all the color channels are obtained at the same time.
	Address	Honolulu; Hawaii; USA; July 2017
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPRW
	Notes	ADAS; 600.086; 600.118			Approved	no
	Call Number	Admin @ si @ SSV2017b			Serial	2920



	Author	Arka Ujjal Dey; Suman Ghosh; Ernest Valveny
	Title	Don't only Feel Read: Using Scene text to understand advertisements			Type	Conference Article
	Year	2018	Publication	IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	We propose a framework for automated classification of advertisement images, using not just visual features but also textual cues extracted from embedded text. Our approach takes inspiration from the assumption that ad images contain meaningful textual content that can provide discriminative semantic interpretation and can thus aid in classification tasks. To this end, we develop a framework using off-the-shelf components, and demonstrate the effectiveness of textual cues in semantic classification tasks.
	Address	Salt Lake City; Utah; USA; June 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPRW
	Notes	DAG; 600.121; 600.129			Approved	no
	Call Number	Admin @ si @ DGV2018			Serial	3551