Publicacions CVC -- Query Results

	Publicacions CVC Home \| Show All \| Simple Search \| Advanced Search \| Add Record \| Import	Login Quick Search: Field: contains: ...
	121–135 of 141 records found matching your query (RSS \| history):

Search & Display Options

Select All Deselect All

<< 1 2 3 4 5 6 7 8 9 10 >>

List View

Citations

Details

	Records
	Author	Ivet Rafegas; Maria Vanrell
	Title	Color spaces emerging from deep convolutional networks			Type	Conference Article
	Year	2016	Publication	24th Color and Imaging Conference	Abbreviated Journal
	Volume		Issue		Pages	225-230
	Keywords
	Abstract	Award for the best interactive session Defining color spaces that provide a good encoding of spatio-chromatic properties of color surfaces is an open problem in color science [8, 22]. Related to this, in computer vision the fusion of color with local image features has been studied and evaluated [16]. In human vision research, the cells which are selective to specific color hues along the visual pathway are also a focus of attention [7, 14]. In line with these research aims, in this paper we study how color is encoded in a deep Convolutional Neural Network (CNN) that has been trained on more than one million natural images for object recognition. These convolutional nets achieve impressive performance in computer vision, and rival the representations in human brain. In this paper we explore how color is represented in a CNN architecture that can give some intuition about efficient spatio-chromatic representations. In convolutional layers the activation of a neuron is related to a spatial filter, that combines spatio-chromatic representations. We use an inverted version of it to explore the properties. Using a series of unsupervised methods we classify different type of neurons depending on the color axes they define and we propose an index of color-selectivity of a neuron. We estimate the main color axes that emerge from this trained net and we prove that colorselectivity of neurons decreases from early to deeper layers.
	Address	San Diego; USA; November 2016
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CIC
	Notes	CIC			Approved	no
	Call Number	Admin @ si @ RaV2016a			Serial	2894
Permanent link to this record



	Author	Arash Akbarinia; C. Alejandro Parraga
	Title	Dynamically Adjusted Surround Contrast Enhances Boundary Detection, European Conference on Visual Perception			Type	Conference Article
	Year	2016	Publication	European Conference on Visual Perception	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address	Barcelona; Spain; August 2016
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ECVP
	Notes	NEUROBIT			Approved	no
	Call Number	Admin @ si @ AkP2016b			Serial	2900
Permanent link to this record



	Author	C. Alejandro Parraga; Arash Akbarinia
	Title	Colour Constancy as a Product of Dynamic Centre-Surround Adaptation			Type	Conference Article
	Year	2016	Publication	16th Annual meeting in Vision Sciences Society	Abbreviated Journal
	Volume	16	Issue	12	Pages
	Keywords
	Abstract	Colour constancy refers to the human visual system's ability to preserve the perceived colour of objects despite changes in the illumination. Its exact mechanisms are unknown, although a number of systems ranging from retinal to cortical and memory are thought to play important roles. The strength of the perceptual shift necessary to preserve these colours is usually estimated by the vectorial distances from an ideal match (or canonical illuminant). In this work we explore how much of the colour constancy phenomenon could be explained by well-known physiological properties of V1 and V2 neurons whose receptive fields (RF) vary according to the contrast and orientation of surround stimuli. Indeed, it has been shown that both RF size and the normalization occurring between centre and surround in cortical neurons depend on the local properties of surrounding stimuli. Our stating point is the construction of a computational model which includes this dynamical centre-surround adaptation by means of two overlapping asymmetric Gaussian kernels whose variances are adjusted to the contrast of surrounding pixels to represent the changes in RF size of cortical neurons and the weights of their respective contributions are altered according to differences in centre-surround contrast and orientation. The final output of the model is obtained after convolving an image with this dynamical operator and an estimation of the illuminant is obtained by considering the contrast of the far surround. We tested our algorithm on naturalistic stimuli from several benchmark datasets. Our results show that although our model does not require any training, its performance against the state-of-the-art is highly competitive, even outperforming learning-based algorithms in some cases. Indeed, these results are very encouraging if we consider that they were obtained with the same parameters for all datasets (i.e. just like the human visual system operates).
	Address	Florida; USA; May 2016
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	VSS
	Notes	NEUROBIT			Approved	no
	Call Number	Admin @ si @ PaA2016b			Serial	2901
Permanent link to this record



	Author	Marco Bellantonio; Mohammad A. Haque; Pau Rodriguez; Kamal Nasrollahi; Taisi Telve; Sergio Escalera; Jordi Gonzalez; Thomas B. Moeslund; Pejman Rasti; Golamreza Anbarjafari
	Title	Spatio-Temporal Pain Recognition in CNN-based Super-Resolved Facial Images			Type	Conference Article
	Year	2016	Publication	23rd International Conference on Pattern Recognition	Abbreviated Journal
	Volume	10165	Issue		Pages
	Keywords
	Abstract	Automatic pain detection is a long expected solution to a prevalent medical problem of pain management. This is more relevant when the subject of pain is young children or patients with limited ability to communicate about their pain experience. Computer vision-based analysis of facial pain expression provides a way of efficient pain detection. When deep machine learning methods came into the scene, automatic pain detection exhibited even better performance. In this paper, we figured out three important factors to exploit in automatic pain detection: spatial information available regarding to pain in each of the facial video frames, temporal axis information regarding to pain expression pattern in a subject video sequence, and variation of face resolution. We employed a combination of convolutional neural network and recurrent neural network to setup a deep hybrid pain detection framework that is able to exploit both spatial and temporal pain information from facial video. In order to analyze the effect of different facial resolutions, we introduce a super-resolution algorithm to generate facial video frames with different resolution setups. We investigated the performance on the publicly available UNBC-McMaster Shoulder Pain database. As a contribution, the paper provides novel and important information regarding to the performance of a hybrid deep learning framework for pain detection in facial images of different resolution.
	Address	Cancun; Mexico; December 2016
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICPR
	Notes	HuPBA; ISE; 600.098; 600.119			Approved	no
	Call Number	Admin @ si @ BHR2016			Serial	2902
Permanent link to this record



	Author	Arnau Baro; Pau Riba; Alicia Fornes
	Title	Towards the recognition of compound music notes in handwritten music scores			Type	Conference Article
	Year	2016	Publication	15th international conference on Frontiers in Handwriting Recognition	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	The recognition of handwritten music scores still remains an open problem. The existing approaches can only deal with very simple handwritten scores mainly because of the variability in the handwriting style and the variability in the composition of groups of music notes (i.e. compound music notes). In this work we focus on this second problem and propose a method based on perceptual grouping for the recognition of compound music notes. Our method has been tested using several handwritten music scores of the CVC-MUSCIMA database and compared with a commercial Optical Music Recognition (OMR) software. Given that our method is learning-free, the obtained results are promising.
	Address	Shenzhen; China; October 2016
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	2167-6445	ISBN		Medium
	Area		Expedition		Conference	ICFHR
	Notes	DAG; 600.097			Approved	no
	Call Number	Admin @ si @ BRF2016			Serial	2903
Permanent link to this record



	Author	Yaxing Wang; L. Zhang; Joost Van de Weijer
	Title	Ensembles of generative adversarial networks			Type	Conference Article
	Year	2016	Publication	30th Annual Conference on Neural Information Processing Systems Worshops	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Ensembles are a popular way to improve results of discriminative CNNs. The combination of several networks trained starting from different initializations improves results significantly. In this paper we investigate the usage of ensembles of GANs. The specific nature of GANs opens up several new ways to construct ensembles. The first one is based on the fact that in the minimax game which is played to optimize the GAN objective the generator network keeps on changing even after the network can be considered optimal. As such ensembles of GANs can be constructed based on the same network initialization but just taking models which have different amount of iterations. These so-called self ensembles are much faster to train than traditional ensembles. The second method, called cascade GANs, redirects part of the training data which is badly modeled by the first GAN to another GAN. In experiments on the CIFAR10 dataset we show that ensembles of GANs obtain model probability distributions which better model the data distribution. In addition, we show that these improved results can be obtained at little additional computational cost.
	Address	Barcelona; Spain; December 2016
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	NIPSW
	Notes	LAMP; 600.068			Approved	no
	Call Number	Admin @ si @ WZW2016			Serial	2905
Permanent link to this record



	Author	Guim Perarnau; Joost Van de Weijer; Bogdan Raducanu; Jose Manuel Alvarez
	Title	Invertible conditional gans for image editing			Type	Conference Article
	Year	2016	Publication	30th Annual Conference on Neural Information Processing Systems Worshops	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Generative Adversarial Networks (GANs) have recently demonstrated to successfully approximate complex data distributions. A relevant extension of this model is conditional GANs (cGANs), where the introduction of external information allows to determine specific representations of the generated images. In this work, we evaluate encoders to inverse the mapping of a cGAN, i.e., mapping a real image into a latent space and a conditional representation. This allows, for example, to reconstruct and modify real images of faces conditioning on arbitrary attributes. Additionally, we evaluate the design of cGANs. The combination of an encoder with a cGAN, which we call Invertible cGAN (IcGAN), enables to re-generate real images with deterministic complex modifications.
	Address	Barcelona; Spain; December 2016
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	NIPSW
	Notes	LAMP; ADAS; 600.068			Approved	no
	Call Number	Admin @ si @ PWR2016			Serial	2906
Permanent link to this record



	Author	Oriol Vicente; Alicia Fornes; Ramon Valdes
	Title	The Digital Humanities Network of the UABCie: a smart structure of research and social transference for the digital humanities			Type	Conference Article
	Year	2016	Publication	Digital Humanities Centres: Experiences and Perspectives	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address	Warsaw; Poland; December 2016
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	DHLABS
	Notes	DAG; 600.097			Approved	no
	Call Number	Admin @ si @ VFV2016			Serial	2908
Permanent link to this record



	Author	Veronica Romero; Alicia Fornes; Enrique Vidal; Joan Andreu Sanchez
	Title	Using the MGGI Methodology for Category-based Language Modeling in Handwritten Marriage Licenses Books			Type	Conference Article
	Year	2016	Publication	15th international conference on Frontiers in Handwriting Recognition	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Handwritten marriage licenses books have been used for centuries by ecclesiastical and secular institutions to register marriages. The information contained in these historical documents is useful for demography studies and genealogical research, among others. Despite the generally simple structure of the text in these documents, automatic transcription and semantic information extraction is difficult due to the distinct and evolutionary vocabulary, which is composed mainly of proper names that change along the time. In previous works we studied the use of category-based language models to both improve the automatic transcription accuracy and make easier the extraction of semantic information. Here we analyze the main causes of the semantic errors observed in previous results and apply a Grammatical Inference technique known as MGGI to improve the semantic accuracy of the language model obtained. Using this language model, full handwritten text recognition experiments have been carried out, with results supporting the interest of the proposed approach.
	Address	Shenzhen; China; October 2016
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICFHR
	Notes	DAG; 600.097; 602.006			Approved	no
	Call Number	Admin @ si @ RFV2016			Serial	2909
Permanent link to this record



	Author	Iiris Lusi; Sergio Escalera; Gholamreza Anbarjafari
	Title	Human Head Pose Estimation on SASE database using Random Hough Regression Forests			Type	Conference Article
	Year	2016	Publication	23rd International Conference on Pattern Recognition Workshops	Abbreviated Journal
	Volume	10165	Issue		Pages
	Keywords
	Abstract	In recent years head pose estimation has become an important task in face analysis scenarios. Given the availability of high resolution 3D sensors, the design of a high resolution head pose database would be beneficial for the community. In this paper, Random Hough Forests are used to estimate 3D head pose and location on a new 3D head database, SASE, which represents the baseline performance on the new data for an upcoming international head pose estimation competition. The data in SASE is acquired with a Microsoft Kinect 2 camera, including the RGB and depth information of 50 subjects with a large sample of head poses, allowing us to test methods for real-life scenarios. We briefly review the database while showing baseline head pose estimation results based on Random Hough Forests.
	Address	Cancun; Mexico; December 2016
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICPRW
	Notes	HuPBA;			Approved	no
	Call Number	Admin @ si @ LEA2016b			Serial	2910
Permanent link to this record



	Author	Xavier Baro; Sergio Escalera; Isabelle Guyon; Julio C. S. Jacques Junior; Lukasz Romaszko; Lisheng Sun; Sebastien Treguer; Evelyne Viegas
	Title	Coompetitions in machine learning: case studies			Type	Conference Article
	Year	2016	Publication	30th Annual Conference on Neural Information Processing Systems Worshops	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address	Barcelona; Spain; December 2016
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	NIPSW
	Notes	HuPBA			Approved	no
	Call Number	Admin @ si @ BEG2016			Serial	2911
Permanent link to this record



	Author	Carles Sanchez; Debora Gil; T. Gache; N. Koufos; Marta Diez-Ferrer; Antoni Rosell
	Title	SENSA: a System for Endoscopic Stenosis Assessment			Type	Conference Article
	Year	2016	Publication	28th Conference of the international Society for Medical Innovation and Technology	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Documenting the severity of a static or dynamic Central Airway Obstruction (CAO) is crucial to establish proper diagnosis and treatment, predict possible treatment effects and better follow-up the patients. The subjective visual evaluation of a stenosis during video-bronchoscopy still remains the most common way to assess a CAO in spite of a consensus among experts for a need to standardize all calculations [1]. The Computer Vision Center in cooperation with the «Hospital de Bellvitge», has developed a System for Endoscopic Stenosis Assessment (SENSA), which computes CAO directly by analyzing standard bronchoscopic data without the need of using other imaging tecnologies.
	Address	Rotterdam; The Netherlands; October 2016
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	SMIT
	Notes	IAM;			Approved	no
	Call Number	Admin @ si @ SGG2016			Serial	2942
Permanent link to this record



	Author	Victor Ponce
	Title	Evolutionary Bags of Space-Time Features for Human Analysis			Type	Book Whole
	Year	2016	Publication	PhD Thesis Universitat de Barcelona, UOC and CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords	Computer algorithms; Digital image processing; Digital video; Analysis of variance; Dynamic programming; Evolutionary computation; Gesture
	Abstract	The representation (or feature) learning has been an emerging concept in the last years, since it collects a set of techniques that are present in any theoretical or practical methodology referring to artificial intelligence. In computer vision, a very common representation has adopted the form of the well-known Bag of Visual Words. This representation appears implicitly in most approaches where images are described, and is also present in a huge number of areas and domains: image content retrieval, pedestrian detection, human-computer interaction, surveillance, e-health, and social computing, amongst others. The early stages of this dissertation provide an approach for learning visual representations inside evolutionary algorithms, which consists of evolving weighting schemes to improve the BoVW representations for the task of recognizing categories of videos and images. Thus, we demonstrate the applicability of the most common weighting schemes, which are often used in text mining but are less frequently found in computer vision tasks. Beyond learning these visual representations, we provide an approach based on fusion strategies for learning spatiotemporal representations, from multimodal data obtained by depth sensors. Besides, we specially aim at the evolutionary and dynamic modelling, where the temporal factor is present in the nature of the data, such as video sequences of gestures and actions. Indeed, we explore the effects of probabilistic modelling for those approaches based on dynamic programming, so as to handle the temporal deformation and variance amongst video sequences of different categories. Finally, we integrate dynamic programming and generative models into an evolutionary computation framework, with the aim of learning Bags of SubGestures (BoSG) representations and hence to improve the generalization capability of standard gesture recognition approaches. The results obtained in the experimentation demonstrate, first, that evolutionary algorithms are useful for improving the representation of BoVW approaches in several datasets for recognizing categories in still images and video sequences. On the other hand, our experimentation reveals that both, the use of dynamic programming and generative models to align video sequences, and the representations obtained from applying fusion strategies in multimodal data, entail an enhancement on the performance when recognizing some gesture categories. Furthermore, the combination of evolutionary algorithms with models based on dynamic programming and generative approaches results, when aiming at the classification of video categories on large video datasets, in a considerable improvement over standard gesture and action recognition approaches. Finally, we demonstrate the applications of these representations in several domains for human analysis: classification of images where humans may be present, action and gesture recognition for general applications, and in particular for conversational settings within the field of restorative justice
	Address	June 2016
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Sergio Escalera;Xavier Baro;Hugo Jair Escalante
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HuPBA			Approved	no
	Call Number	Pon2016			Serial	2814
Permanent link to this record



	Author	Simone Balocco; Maria Zuluaga; Guillaume Zahnd; Su-Lin Lee; Stefanie Demirci
	Title	Computing and Visualization for Intravascular Imaging and Computer Assisted Stenting			Type	Book Whole
	Year	2016	Publication	Computing and Visualization for Intravascular Imaging and Computer-Assisted Stenting	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher	Elsevier	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	9780128110188	Medium
	Area		Expedition		Conference
	Notes	MILAB			Approved	no
	Call Number	Admin @ si @ BZZ2016			Serial	2821
Permanent link to this record



	Author	German Ros
	Title	Visual Scene Understanding for Autonomous Vehicles: Understanding Where and What			Type	Book Whole
	Year	2016	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Making Ground Autonomous Vehicles (GAVs) a reality as a service for the society is one of the major scientific and technological challenges of this century. The potential benefits of autonomous vehicles include reducing accidents, improving traffic congestion and better usage of road infrastructures, among others. These vehicles must operate in our cities, towns and highways, dealing with many different types of situations while respecting traffic rules and protecting human lives. GAVs are expected to deal with all types of scenarios and situations, coping with an uncertain and chaotic world. Therefore, in order to fulfill these demanding requirements GAVs need to be endowed with the capability of understanding their surrounding at many different levels, by means of affordable sensors and artificial intelligence. This capacity to understand the surroundings and the current situation that the vehicle is involved in is called scene understanding. In this work we investigate novel techniques to bring scene understanding to autonomous vehicles by combining the use of cameras as the main source of information—due to their versatility and affordability—and algorithms based on computer vision and machine learning. We investigate different degrees of understanding of the scene, starting from basic geometric knowledge about where is the vehicle within the scene. A robust and efficient estimation of the vehicle location and pose with respect to a map is one of the most fundamental steps towards autonomous driving. We study this problem from the point of view of robustness and computational efficiency, proposing key insights to improve current solutions. Then we advance to higher levels of abstraction to discover what is in the scene, by recognizing and parsing all the elements present on a driving scene, such as roads, sidewalks, pedestrians, etc. We investigate this problem known as semantic segmentation, proposing new approaches to improve recognition accuracy and computational efficiency. We cover these points by focusing on key aspects such as: (i) how to leverage computation moving semantics to an offline process, (ii) how to train compact architectures based on deconvolutional networks to achieve their maximum potential, (iii) how to use virtual worlds in combination with domain adaptation to produce accurate models in a cost-effective fashion, and (iv) how to use transfer learning techniques to prepare models to new situations. We finally extend the previous level of knowledge enabling systems to reasoning about what has change in a scene with respect to a previous visit, which in return allows for efficient and cost-effective map updating.
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Angel Sappa;Julio Guerrero;Antonio Lopez
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-945373-1-8	Medium
	Area		Expedition		Conference
	Notes	ADAS			Approved	no
	Call Number	Admin @ si @ Ros2016			Serial	2860
Permanent link to this record