|   | 
Details
   web
Records
Author Mohammad Ali Bagheri; Qigang Gao; Sergio Escalera; Huamin Ren; Thomas B. Moeslund; Elham Etemad
Title Locality Regularized Group Sparse Coding for Action Recognition Type Journal Article
Year 2017 Publication Computer Vision and Image Understanding Abbreviated Journal CVIU
Volume 158 Issue Pages 106-114
Keywords Bag of words; Feature encoding; Locality constrained coding; Group sparse coding; Alternating direction method of multipliers; Action recognition
Abstract Bag of visual words (BoVW) models are widely utilized in image/ video representation and recognition. The cornerstone of these models is the encoding stage, in which local features are decomposed over a codebook in order to obtain a representation of features. In this paper, we propose a new encoding algorithm by jointly encoding the set of local descriptors of each sample and considering the locality structure of descriptors. The proposed method takes advantages of locality coding such as its stability and robustness to noise in descriptors, as well as the strengths of the group coding strategy by taking into account the potential relation among descriptors of a sample. To efficiently implement our proposed method, we consider the Alternating Direction Method of Multipliers (ADMM) framework, which results in quadratic complexity in the problem size. The method is employed for a challenging classification problem: action recognition by depth cameras. Experimental results demonstrate the outperformance of our methodology compared to the state-of-the-art on the considered datasets.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title (down)
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no proj Approved no
Call Number Admin @ si @ BGE2017 Serial 3014
Permanent link to this record
 

 
Author Miguel Angel Bautista; Oriol Pujol; Fernando De la Torre; Sergio Escalera
Title Error-Correcting Factorization Type Journal Article
Year 2018 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI
Volume 40 Issue Pages 2388-2401
Keywords
Abstract Error Correcting Output Codes (ECOC) is a successful technique in multi-class classification, which is a core problem in Pattern Recognition and Machine Learning. A major advantage of ECOC over other methods is that the multi- class problem is decoupled into a set of binary problems that are solved independently. However, literature defines a general error-correcting capability for ECOCs without analyzing how it distributes among classes, hindering a deeper analysis of pair-wise error-correction. To address these limitations this paper proposes an Error-Correcting Factorization (ECF) method, our contribution is three fold: (I) We propose a novel representation of the error-correction capability, called the design matrix, that enables us to build an ECOC on the basis of allocating correction to pairs of classes. (II) We derive the optimal code length of an ECOC using rank properties of the design matrix. (III) ECF is formulated as a discrete optimization problem, and a relaxed solution is found using an efficient constrained block coordinate descent approach. (IV) Enabled by the flexibility introduced with the design matrix we propose to allocate the error-correction on classes that are prone to confusion. Experimental results in several databases show that when allocating the error-correction to confusable classes ECF outperforms state-of-the-art approaches.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title (down)
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0162-8828 ISBN Medium
Area Expedition Conference
Notes HuPBA; no menciona Approved no
Call Number Admin @ si @ BPT2018 Serial 3015
Permanent link to this record
 

 
Author Patricia Suarez; Angel Sappa; Boris X. Vintimilla
Title Colorizing Infrared Images through a Triplet Conditional DCGAN Architecture Type Conference Article
Year 2017 Publication 19th international conference on image analysis and processing Abbreviated Journal
Volume Issue Pages
Keywords CNN in Multispectral Imaging; Image Colorization
Abstract This paper focuses on near infrared (NIR) image colorization by using a Conditional Deep Convolutional Generative Adversarial Network (CDCGAN) architecture model. The proposed architecture is based on the usage of a conditional probabilistic generative model. Firstly, it learns to colorize the given input image, by using a triplet model architecture that tackle every channel in an independent way. In the proposed model, the nal layer of red channel consider the infrared image to enhance the details, resulting in a sharp RGB image. Then, in the second stage, a discriminative model is used to estimate the probability that the generated image came from the training dataset, rather than the image automatically generated. Experimental results with a large set of real images are provided showing the validity of the proposed approach. Additionally, the proposed approach is compared with a state of the art approach showing better results.
Address Catania; Italy; September 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title (down)
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICIAP
Notes ADAS; MSIAU; 600.086; 600.122; 600.118 Approved no
Call Number Admin @ si @ SSV2017c Serial 3016
Permanent link to this record
 

 
Author Meysam Madadi
Title Human Segmentation, Pose Estimation and Applications Type Book Whole
Year 2017 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Automatic analyzing humans in photographs or videos has great potential applications in computer vision, including medical diagnosis, sports, entertainment, movie editing and surveillance, just to name a few. Body, face and hand are the most studied components of humans. Body has many variabilities in shape and clothing along with high degrees of freedom in pose. Face has many muscles causing many visible deformity, beside variable shape and hair style. Hand is a small object, moving fast and has high degrees of freedom. Adding human characteristics to all aforementioned variabilities makes human analysis quite a challenging task.
In this thesis, we developed human segmentation in different modalities. In a first scenario, we segmented human body and hand in depth images using example-based shape warping. We developed a shape descriptor based on shape context and class probabilities of shape regions to extract nearest neighbors. We then considered rigid affine alignment vs. nonrigid iterative shape warping. In a second scenario, we segmented face in RGB images using convolutional neural networks (CNN). We modeled conditional random field with recurrent neural networks. In our model pair-wise kernels are not fixed and learned during training. We trained the network end-to-end using adversarial networks which improved hair segmentation by a high margin.
We also worked on 3D hand pose estimation in depth images. In a generative approach, we fitted a finger model separately for each finger based on our example-based rigid hand segmentation. We minimized an energy function based on overlapping area, depth discrepancy and finger collisions. We also applied linear models in joint trajectory space to refine occluded joints based on visible joints error and invisible joints trajectory smoothness. In a CNN-based approach, we developed a tree-structure network to train specific features for each finger and fused them for global pose consistency. We also formulated physical and appearance constraints as loss functions.
Finally, we developed a number of applications consisting of human soft biometrics measurement and garment retexturing. We also generated some datasets in this thesis consisting of human segmentation, synthetic hand pose, garment retexturing and Italian gestures.
Address October 2017
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Sergio Escalera;Jordi Gonzalez
Language Summary Language Original Title (down)
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-945373-3-2 Medium
Area Expedition Conference
Notes HUPBA Approved no
Call Number Admin @ si @ Mad2017 Serial 3017
Permanent link to this record
 

 
Author Onur Ferhat
Title Analysis of Head-Pose Invariant, Natural Light Gaze Estimation Methods Type Book Whole
Year 2017 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Eye tracker devices have traditionally been only used inside laboratories, requiring trained professionals and elaborate setup mechanisms. However, in the recent years the scientific work on easier–to–use eye trackers which require no special hardware—other than the omnipresent front facing cameras in computers, tablets, and mobiles—is aiming at making this technology common–place. These types of trackers have several extra challenges that make the problem harder, such as low resolution images provided by a regular webcam, the changing ambient lighting conditions, personal appearance differences, changes in head pose, and so on. Recent research in the field has focused on all these challenges in order to provide better gaze estimation performances in a real world setup.

In this work, we aim at tackling the gaze tracking problem in a single camera setup. We first analyze all the previous work in the field, identifying the strengths and weaknesses of each tried idea. We start our work on the gaze tracker with an appearance–based gaze estimation method, which is the simplest idea that creates a direct mapping between a rectangular image patch extracted around the eye in a camera image, and the gaze point (or gaze direction). Here, we do an extensive analysis of the factors that affect the performance of this tracker in several experimental setups, in order to address these problems in future works. In the second part of our work, we propose a feature–based gaze estimation method, which encodes the eye region image into a compact representation. We argue that this type of representation is better suited to dealing with head pose and lighting condition changes, as it both reduces the dimensionality of the input (i.e. eye image) and breaks the direct connection between image pixel intensities and the gaze estimation. Lastly, we use a face alignment algorithm to have robust face pose estimation, using a 3D model customized to the subject using the tracker. We combine this with a convolutional neural network trained on a large dataset of images to build a face pose invariant gaze tracker.
Address September 2017
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Fernando Vilariño
Language Summary Language Original Title (down)
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-945373-5-6 Medium
Area Expedition Conference
Notes MV Approved no
Call Number Admin @ si @ Fer2017 Serial 3018
Permanent link to this record
 

 
Author Arash Akbarinia
Title Computational Model of Visual Perception: From Colour to Form Type Book Whole
Year 2017 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract The original idea of this project was to study the role of colour in the challenging task of object recognition. We started by extending previous research on colour naming showing that it is feasible to capture colour terms through parsimonious ellipsoids. Although, the results of our model exceeded state-of-the-art in two benchmark datasets, we realised that the two phenomena of metameric lights and colour constancy must be addressed prior to any further colour processing. Our investigation of metameric pairs reached the conclusion that they are infrequent in real world scenarios. Contrary to that, the illumination of a scene often changes dramatically. We addressed this issue by proposing a colour constancy model inspired by the dynamical centre-surround adaptation of neurons in the visual cortex. This was implemented through two overlapping asymmetric Gaussians whose variances and heights are adjusted according to the local contrast of pixels. We complemented this model with a generic contrast-variant pooling mechanism that inversely connect the percentage of pooled signal to the local contrast of a region. The results of our experiments on four benchmark datasets were indeed promising: the proposed model, although simple, outperformed even learning-based approaches in many cases. Encouraged by the success of our contrast-variant surround modulation, we extended this approach to detect boundaries of objects. We proposed an edge detection model based on the first derivative of the Gaussian kernel. We incorporated four types of surround: full, far, iso- and orthogonal-orientation. Furthermore, we accounted for the pooling mechanism at higher cortical areas and the shape feedback sent to lower areas. Our results in three benchmark datasets showed significant improvement over non-learning algorithms.
To summarise, we demonstrated that biologically-inspired models offer promising solutions to computer vision problems, such as, colour naming, colour constancy and edge detection. We believe that the greatest contribution of this Ph.D dissertation is modelling the concept of dynamic surround modulation that shows the significance of contrast-variant surround integration. The models proposed here are grounded on only a portion of what we know about the human visual system. Therefore, it is only natural to complement them accordingly in future works.
Address October 2017
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor C. Alejandro Parraga
Language Summary Language Original Title (down)
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-945373-4-9 Medium
Area Expedition Conference
Notes NEUROBIT Approved no
Call Number Admin @ si @ Akb2017 Serial 3019
Permanent link to this record
 

 
Author Cristhian Aguilera
Title Local feature description in cross-spectral imagery Type Book Whole
Year 2017 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Over the last few years, the number of consumer computer vision applications has increased dramatically. Today, computer vision solutions can be found in video game consoles, smartphone applications, driving assistance – just to name a few. Ideally, we require the performance of those applications, particularly those that are safety critical to remain constant under any external environment factors, such as changes in illumination or weather conditions. However, this is not always possible or very difficult to obtain by only using visible imagery, due to the inherent limitations of the images from that spectral band. For that reason, the use of images from different or multiple spectral bands is becoming more appealing.
The aforementioned possible advantages of using images from multiples spectral bands on various vision applications make multi-spectral image processing a relevant topic for research and development. Like in visible image processing, multi-spectral image processing needs tools and algorithms to handle information from various spectral bands. Furthermore, traditional tools such as local feature detection, which is the basis of many vision tasks such as visual odometry, image registration, or structure from motion, must be adjusted or reformulated to operate under new conditions. Traditional feature detection, description, and matching methods tend to underperform in multi-spectral settings, in comparison to mono-spectral settings, due to the natural differences between each spectral band.
The work in this thesis is focused on the local feature description problem when cross-spectral images are considered. In this context, this dissertation has three main contributions. Firstly, the work starts by proposing the usage of a combination of frequency and spatial information, in a multi-scale scheme, as feature description. Evaluations of this proposal, based on classical hand-made feature descriptors, and comparisons with state of the art cross-spectral approaches help to find and understand limitations of such strategy. Secondly, different convolutional neural network (CNN) based architectures are evaluated when used to describe cross-spectral image patches. Results showed that CNN-based methods, designed to work with visible monocular images, could be successfully applied to the description of images from two different spectral bands, with just minor modifications. In this framework, a novel CNN-based network model, specifically intended to describe image patches from two different spectral bands, is proposed. This network, referred to as Q-Net, outperforms state of the art in the cross-spectral domain, including both previous hand-made solutions as well as L2 CNN-based architectures. The third contribution of this dissertation is in the cross-spectral feature description application domain. The multispectral odometry problem is tackled showing a real application of cross-spectral descriptors
In addition to the three main contributions mentioned above, in this dissertation, two different multi-spectral datasets are generated and shared with the community to be used as benchmarks for further studies.
Address October 2017
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Angel Sappa
Language Summary Language Original Title (down)
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-945373-6-3 Medium
Area Expedition Conference
Notes ADAS; 600.118 Approved no
Call Number Admin @ si @ Agu2017 Serial 3020
Permanent link to this record
 

 
Author I. Sorodoc; S. Pezzelle; A. Herbelot; Mariella Dimiccoli; R. Bernardi
Title Learning quantification from images: A structured neural architecture Type Journal Article
Year 2018 Publication Natural Language Engineering Abbreviated Journal NLE
Volume 24 Issue 3 Pages 363-392
Keywords
Abstract Major advances have recently been made in merging language and vision representations. Most tasks considered so far have confined themselves to the processing of objects and lexicalised relations amongst objects (content words). We know, however, that humans (even pre-school children) can abstract over raw multimodal data to perform certain types of higher level reasoning, expressed in natural language by function words. A case in point is given by their ability to learn quantifiers, i.e. expressions like few, some and all. From formal semantics and cognitive linguistics, we know that quantifiers are relations over sets which, as a simplification, we can see as proportions. For instance, in most fish are red, most encodes the proportion of fish which are red fish. In this paper, we study how well current neural network strategies model such relations. We propose a task where, given an image and a query expressed by an object–property pair, the system must return a quantifier expressing which proportions of the queried object have the queried property. Our contributions are twofold. First, we show that the best performance on this task involves coupling state-of-the-art attention mechanisms with a network architecture mirroring the logical structure assigned to quantifiers by classic linguistic formalisation. Second, we introduce a new balanced dataset of image scenarios associated with quantification queries, which we hope will foster further research in this area.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title (down)
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB; no menciona Approved no
Call Number Admin @ si @ SPH2018 Serial 3021
Permanent link to this record
 

 
Author Maedeh Aghaei; Mariella Dimiccoli; C. Canton-Ferrer; Petia Radeva
Title Towards social pattern characterization from egocentric photo-streams Type Journal Article
Year 2018 Publication Computer Vision and Image Understanding Abbreviated Journal CVIU
Volume 171 Issue Pages 104-117
Keywords Social pattern characterization; Social signal extraction; Lifelogging; Convolutional and recurrent neural networks
Abstract Following the increasingly popular trend of social interaction analysis in egocentric vision, this article presents a comprehensive pipeline for automatic social pattern characterization of a wearable photo-camera user. The proposed framework relies merely on the visual analysis of egocentric photo-streams and consists of three major steps. The first step is to detect social interactions of the user where the impact of several social signals on the task is explored. The detected social events are inspected in the second step for categorization into different social meetings. These two steps act at event-level where each potential social event is modeled as a multi-dimensional time-series, whose dimensions correspond to a set of relevant features for each task; finally, LSTM is employed to classify the time-series. The last step of the framework is to characterize social patterns of the user. Our goal is to quantify the duration, the diversity and the frequency of the user social relations in various social situations. This goal is achieved by the discovery of recurrences of the same people across the whole set of social events related to the user. Experimental evaluation over EgoSocialStyle – the proposed dataset in this work, and EGO-GROUP demonstrates promising results on the task of social pattern characterization from egocentric photo-streams.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title (down)
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB; no proj Approved no
Call Number Admin @ si @ ADC2018 Serial 3022
Permanent link to this record
 

 
Author Alejandro Cartas; Mariella Dimiccoli; Petia Radeva
Title Batch-based activity recognition from egocentric photo-streams Type Conference Article
Year 2017 Publication 1st International workshop on Egocentric Perception, Interaction and Computing Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Activity recognition from long unstructured egocentric photo-streams has several applications in assistive technology such as health monitoring and frailty detection, just to name a few. However, one of its main technical challenges is to deal with the low frame rate of wearable photo-cameras, which causes abrupt appearance changes between consecutive frames. In consequence, important discriminatory low-level features from motion such as optical flow cannot be estimated. In this paper, we present a batch-driven approach for training a deep learning architecture that strongly rely on Long short-term units to tackle this problem. We propose two different implementations of the same approach that process a photo-stream sequence using batches of fixed size with the goal of capturing the temporal evolution of high-level features. The main difference between these implementations is that one explicitly models consecutive batches by overlapping them. Experimental results over a public dataset acquired by three users demonstrate the validity of the proposed architectures to exploit the temporal evolution of convolutional features over time without relying on event boundaries.
Address Venice; Italy; October 2017;
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title (down)
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCV - EPIC
Notes MILAB; no menciona Approved no
Call Number Admin @ si @ CDR2017 Serial 3023
Permanent link to this record
 

 
Author Aniol Lidon; Marc Bolaños; Mariella Dimiccoli; Petia Radeva; Maite Garolera; Xavier Giro
Title Semantic Summarization of Egocentric Photo-Stream Events Type Conference Article
Year 2017 Publication 2nd Workshop on Lifelogging Tools and Applications Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address San Francisco; USA; October 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title (down)
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-4503-5503-2 Medium
Area Expedition Conference ACMW (LTA)
Notes MILAB; no proj Approved no
Call Number Admin @ si @ LBD2017 Serial 3024
Permanent link to this record
 

 
Author Maedeh Aghaei; Mariella Dimiccoli; Petia Radeva
Title All the people around me: face clustering in egocentric photo streams Type Conference Article
Year 2017 Publication 24th International Conference on Image Processing Abbreviated Journal
Volume Issue Pages
Keywords face discovery; face clustering; deepmatching; bag-of-tracklets; egocentric photo-streams
Abstract arxiv1703.01790
Given an unconstrained stream of images captured by a wearable photo-camera (2fpm), we propose an unsupervised bottom-up approach for automatic clustering appearing faces into the individual identities present in these data. The problem is challenging since images are acquired under real world conditions; hence the visible appearance of the people in the images undergoes intensive variations. Our proposed pipeline consists of first arranging the photo-stream into events, later, localizing the appearance of multiple people in them, and
finally, grouping various appearances of the same person across different events. Experimental results performed on a dataset acquired by wearing a photo-camera during one month, demonstrate the effectiveness of the proposed approach for the considered purpose.
Address Beijing; China; September 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title (down)
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICIP
Notes MILAB; no menciona Approved no
Call Number Admin @ si @ EDR2017 Serial 3025
Permanent link to this record
 

 
Author Laura Igual; Santiago Segui
Title Introduction to Data Science – A Python Approach to Concepts, Techniques and Applications. Undergraduate Topics in Computer Science Type Book Whole
Year 2017 Publication Abbreviated Journal
Volume Issue Pages 1-215
Keywords
Abstract
Address
Corporate Author Thesis
Publisher 978-3-319-50016-4 Place of Publication Editor
Language Summary Language Original Title (down)
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-3-319-50016-4 Medium
Area Expedition Conference
Notes MILAB Approved no
Call Number Admin @ si @ IgS2017 Serial 3027
Permanent link to this record
 

 
Author Mireia Forns-Nadal; Federico Sem; Anna Mane; Laura Igual; Dani Guinart; Oscar Vilarroya
Title Increased Nucleus Accumbens Volume in First-Episode Psychosis Type Journal Article
Year 2017 Publication Psychiatry Research-Neuroimaging Abbreviated Journal PRN
Volume 263 Issue Pages 57-60
Keywords
Abstract Nucleus accumbens has been reported as a key structure in the neurobiology of schizophrenia. Studies analyzing structural abnormalities have shown conflicting results, possibly related to confounding factors. We investigated the nucleus accumbens volume using manual delimitation in first-episode psychosis (FEP) controlling for age, cannabis use and medication. Thirty-one FEP subjects who were naive or minimally exposed to antipsychotics and a control group were MRI scanned and clinically assessed from baseline to 6 months of follow-up. FEP showed increased relative and total accumbens volumes. Clinical correlations with negative symptoms, duration of untreated psychosis and cannabis use were not significant.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title (down)
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB; no menciona Approved no
Call Number Admin @ si @ FSM2017 Serial 3028
Permanent link to this record
 

 
Author Fernando Vilariño; Dan Norton
Title Using mutimedia tools to spread poetry collections Type Conference Article
Year 2017 Publication Internet librarian International Conference Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address London; UK; October 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title (down)
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ILI
Notes MV; 600.097;SIAI Approved no
Call Number Admin @ si @ ViN2017 Serial 3031
Permanent link to this record