|
Yagmur Gucluturk, Umut Guclu, Marc Perez, Hugo Jair Escalante, Xavier Baro, Isabelle Guyon, et al. (2017). Visualizing Apparent Personality Analysis with Deep Residual Networks. In Chalearn Workshop on Action, Gesture, and Emotion Recognition: Large Scale Multimodal Gesture Recognition and Real versus Fake expressed emotions at ICCV (pp. 3101–3109).
Abstract: Automatic prediction of personality traits is a subjective task that has recently received much attention. Specifically, automatic apparent personality trait prediction from multimodal data has emerged as a hot topic within the filed of computer vision and, more particularly, the so called “looking
at people” sub-field. Considering “apparent” personality traits as opposed to real ones considerably reduces the subjectivity of the task. The real world applications are encountered in a wide range of domains, including entertainment, health, human computer interaction, recruitment and security. Predictive models of personality traits are useful for individuals in many scenarios (e.g., preparing for job interviews, preparing for public speaking). However, these predictions in and of themselves might be deemed to be untrustworthy without human understandable supportive evidence. Through a series of experiments on a recently released benchmark dataset for automatic apparent personality trait prediction, this paper characterizes the audio and
visual information that is used by a state-of-the-art model while making its predictions, so as to provide such supportive evidence by explaining predictions made. Additionally, the paper describes a new web application, which gives feedback on apparent personality traits of its users by combining
model predictions with their explanations.
|
|
|
Maryam Asadi-Aghbolaghi, Hugo Bertiche, Vicent Roig, Shohreh Kasaei, & Sergio Escalera. (2017). Action Recognition from RGB-D Data: Comparison and Fusion of Spatio-temporal Handcrafted Features and Deep Strategies. In Chalearn Workshop on Action, Gesture, and Emotion Recognition: Large Scale Multimodal Gesture Recognition and Real versus Fake expressed emotions at ICCV.
|
|
|
Albert Clapes, Tinne Tuytelaars, & Sergio Escalera. (2017). Darwintrees for action recognition. In Chalearn Workshop on Action, Gesture, and Emotion Recognition: Large Scale Multimodal Gesture Recognition and Real versus Fake expressed emotions at ICCV.
|
|
|
Mireia Sole, Joan Blanco, Debora Gil, G. Fonseka, Richard Frodsham, Francesca Vidal, et al. (2017). Noves perspectives en l estudi de la territorialitat cromosomica de cel·lules germinals masculines: estudis tridimensionals. JBR - Biologia de la Reproduccio, 73–78.
Abstract: In somatic cells, chromosomes occupy specific nuclear regions called chromosome territories which are involved in the
maintenance and regulation of the genome. Preliminary data in male germ cells also suggest the importance of chromosome
territoriality in cell functionality. Nevertheless, the specific characteristics of testicular tissue (presence of different
cell types with different morphological characteristics, in different stages of development and with different ploidy)
makes difficult to achieve conclusive results. In this study we have developed a methodology to approach the threedimensional
study of all chromosome territories in male germ cells from C57BL/6J mice (Mus musculus). The method
includes the following steps: i) Optimized cell fixation to obtain an optimal preservation of the three-dimensionality cell
morphology, ii) Chromosome identification by FISH (Chromoprobe Multiprobe® OctoChrome™ Murine System; Cytocell)
and confocal microscopy (TCS-SP5, Leica Microsystems), iii) Cell type identification by immunofluorescence
iv) Image analysis using Matlab scripts, v) Numerical data extraction related to chromosome features, chromosome
radial position and chromosome relative position. This methodology allows the unequivocally identification and the
analysis of the chromosome territories of all spermatogenic stages. Results will provide information about the features
that determine chromosomal position, preferred associations between chromosomes, and the relationship between chromosome
positioning and genome regulation.
|
|
|
Umut Guclu, Yagmur Gucluturk, Meysam Madadi, Sergio Escalera, Xavier Baro, Jordi Gonzalez, et al. (2017). End-to-end semantic face segmentation with conditional random fields as convolutional, recurrent and adversarial networks.
Abstract: arXiv:1703.03305
Recent years have seen a sharp increase in the number of related yet distinct advances in semantic segmentation. Here, we tackle this problem by leveraging the respective strengths of these advances. That is, we formulate a conditional random field over a four-connected graph as end-to-end trainable convolutional and recurrent networks, and estimate them via an adversarial process. Importantly, our model learns not only unary potentials but also pairwise
potentials, while aggregating multi-scale contexts and controlling higher-order inconsistencies.
We evaluate our model on two standard benchmark datasets for semantic face segmentation, achieving state-of-the-art results on both of them.
|
|
|
Veronica Romero, Alicia Fornes, Enrique Vidal, & Joan Andreu Sanchez. (2017). Information Extraction in Handwritten Marriage Licenses Books Using the MGGI Methodology. In L.A. Alexandre, J.Salvador Sanchez, & Joao M. F. Rodriguez (Eds.), 8th Iberian Conference on Pattern Recognition and Image Analysis (Vol. 10255, pp. 287–294). LNCS.
Abstract: Historical records of daily activities provide intriguing insights into the life of our ancestors, useful for demographic and genealogical research. For example, marriage license books have been used for centuries by ecclesiastical and secular institutions to register marriages. These books follow a simple structure of the text in the records with a evolutionary vocabulary, mainly composed of proper names that change along the time. This distinct vocabulary makes automatic transcription and semantic information extraction difficult tasks. In previous works we studied the use of category-based language models and how a Grammatical Inference technique known as MGGI could improve the accuracy of these tasks. In this work we analyze the main causes of the semantic errors observed in previous results and apply a better implementation of the MGGI technique to solve these problems. Using the resulting language model, transcription and information extraction experiments have been carried out, and the results support our proposed approach.
Keywords: Handwritten Text Recognition; Information extraction; Language modeling; MGGI; Categories-based language model
|
|
|
Marc Bolaños, Alvaro Peris, Francisco Casacuberta, & Petia Radeva. (2017). VIBIKNet: Visual Bidirectional Kernelized Network for Visual Question Answering. In 8th Iberian Conference on Pattern Recognition and Image Analysis.
Abstract: In this paper, we address the problem of visual question answering by proposing a novel model, called VIBIKNet. Our model is based on integrating Kernelized Convolutional Neural Networks and Long-Short Term Memory units to generate an answer given a question about an image. We prove that VIBIKNet is an optimal trade-off between accuracy and computational load, in terms of memory and time consumption. We validate our method on the VQA challenge dataset and compare it to the top performing methods in order to illustrate its performance and speed.
Keywords: Visual Qestion Aswering; Convolutional Neural Networks; Long short-term memory networks
|
|
|
Hana Jarraya, Oriol Ramos Terrades, & Josep Llados. (2017). Graph Embedding through Probabilistic Graphical Model applied to Symbolic Graphs. In 8th Iberian Conference on Pattern Recognition and Image Analysis.
Abstract: We propose a new Graph Embedding (GEM) method that takes advantages of structural pattern representation. It models an Attributed Graph (AG) as a Probabilistic Graphical Model (PGM). Then, it learns the parameters of this PGM presented by a vector. This vector is a signature of AG in a lower dimensional vectorial space. We apply Structured Support Vector Machines (SSVM) to process classification task. As first tentative, results on the GREC dataset are encouraging enough to go further on this direction.
Keywords: Attributed Graph; Probabilistic Graphical Model; Graph Embedding; Structured Support Vector Machines
|
|
|
Xavier Soria, Angel Sappa, & Arash Akbarinia. (2017). Multispectral Single-Sensor RGB-NIR Imaging: New Challenges and Opportunities. In 7th International Conference on Image Processing Theory, Tools & Applications.
Abstract: Multispectral images captured with a single sensor camera have become an attractive alternative for numerous computer vision applications. However, in order to fully exploit their potentials, the color restoration problem (RGB representation) should be addressed. This problem is more evident in outdoor scenarios containing vegetation, living beings, or specular materials. The problem of color distortion emerges from the sensitivity of sensors due to the overlap of visible and near infrared spectral bands. This paper empirically evaluates the variability of the near infrared (NIR) information with respect to the changes of light throughout the day. A tiny neural network is proposed to restore the RGB color representation from the given RGBN (Red, Green, Blue, NIR) images. In order to evaluate the proposed algorithm, different experiments on a RGBN outdoor dataset are conducted, which include various challenging cases. The obtained result shows the challenge and the importance of addressing color restoration in single sensor multispectral images.
Keywords: Color restoration; Neural networks; Singlesensor cameras; Multispectral images; RGB-NIR dataset
|
|
|
Debora Gil, Oriol Ramos Terrades, Elisa Minchole, Carles Sanchez, Noelia Cubero de Frutos, Marta Diez-Ferrer, et al. (2017). Classification of Confocal Endomicroscopy Patterns for Diagnosis of Lung Cancer. In 6th Workshop on Clinical Image-based Procedures: Translational Research in Medical Imaging (Vol. 10550, pp. 151–159). LNCS.
Abstract: Confocal Laser Endomicroscopy (CLE) is an emerging imaging technique that allows the in-vivo acquisition of cell patterns of potentially malignant lesions. Such patterns could discriminate between inflammatory and neoplastic lesions and, thus, serve as a first in-vivo biopsy to discard cases that do not actually require a cell biopsy.
The goal of this work is to explore whether CLE images obtained during videobronchoscopy contain enough visual information to discriminate between benign and malign peripheral lesions for lung cancer diagnosis. To do so, we have performed a pilot comparative study with 12 patients (6 adenocarcinoma and 6 benign-inflammatory) using 2 different methods for CLE pattern analysis: visual analysis by 3 experts and a novel methodology that uses graph methods to find patterns in pre-trained feature spaces. Our preliminary results indicate that although visual analysis can only achieve a 60.2% of accuracy, the accuracy of the proposed unsupervised image pattern classification raises to 84.6%.
We conclude that CLE images visual information allow in-vivo detection of neoplastic lesions and graph structural analysis applied to deep-learning feature spaces can achieve competitive results.
|
|
|
Ishaan Gulrajani, Kundan Kumar, Faruk Ahmed, Adrien Ali Taiga, Francesco Visin, David Vazquez, et al. (2017). PixelVAE: A Latent Variable Model for Natural Images. In 5th International Conference on Learning Representations.
Abstract: Natural image modeling is a landmark challenge of unsupervised learning. Variational Autoencoders (VAEs) learn a useful latent representation and generate samples that preserve global structure but tend to suffer from image blurriness. PixelCNNs model sharp contours and details very well, but lack an explicit latent representation and have difficulty modeling large-scale structure in a computationally efficient way. In this paper, we present PixelVAE, a VAE model with an autoregressive decoder based on PixelCNN. The resulting architecture achieves state-of-the-art log-likelihood on binarized MNIST. We extend PixelVAE to a hierarchy of multiple latent variables at different scales; this hierarchical model achieves competitive likelihood on 64x64 ImageNet and generates high-quality samples on LSUN bedrooms.
Keywords: Deep Learning; Unsupervised Learning
|
|
|
Pau Rodriguez, Jordi Gonzalez, Jordi Cucurull, Josep M. Gonfaus, & Xavier Roca. (2017). Regularizing CNNs with Locally Constrained Decorrelations. In 5th International Conference on Learning Representations.
|
|
|
Quentin Angermann, Jorge Bernal, Cristina Sanchez Montes, Gloria Fernandez Esparrach, Xavier Gray, Olivier Romain, et al. (2017). Towards Real-Time Polyp Detection in Colonoscopy Videos: Adapting Still Frame-Based Methodologies for Video Sequences Analysis. In 4th International Workshop on Computer Assisted and Robotic Endoscopy (pp. 29–41).
Abstract: Colorectal cancer is the second cause of cancer death in United States: precursor lesions (polyps) detection is key for patient survival. Though colonoscopy is the gold standard screening tool, some polyps are still missed. Several computational systems have been proposed but none of them are used in the clinical room mainly due to computational constraints. Besides, most of them are built over still frame databases, decreasing their performance on video analysis due to the lack of output stability and not coping with associated variability on image quality and polyp appearance. We propose a strategy to adapt these methods to video analysis by adding a spatio-temporal stability module and studying a combination of features to capture polyp appearance variability. We validate our strategy, incorporated on a real-time detection method, on a public video database. Resulting method detects all
polyps under real time constraints, increasing its performance due to our
adaptation strategy.
Keywords: Polyp detection; colonoscopy; real time; spatio temporal coherence
|
|
|
Sergio Alloza, Flavio Escribano, Sergi Delgado, Ciprian Corneanu, & Sergio Escalera. (2017). XBadges. Identifying and training soft skills with commercial video games Improving persistence, risk taking & spatial reasoning with commercial video games and facial and emotional recognition system. In 4th Congreso de la Sociedad Española para las Ciencias del Videojuego (Vol. 1957, pp. 13–28).
Abstract: XBadges is a research project based on the hypothesis that commercial video games (nonserious games) can train soft skills. We measure persistence, patial reasoning and risk taking before and after subjects paticipate in controlled game playing sessions.
In addition, we have developed an automatic facial expression recognition system capable of inferring their emotions while playing, allowing us to study the role of emotions in soft skills acquisition. We have used Flappy Bird, Pacman and Tetris for assessing changes in persistence, risk taking and spatial reasoning respectively.
Results show how playing Tetris significantly improves spatial reasoning and how playing Pacman significantly improves prudence in certain areas of behavior. As for emotions, they reveal that being concentrated helps to improve performance and skills acquisition. Frustration is also shown as a key element. With the results obtained we are able to glimpse multiple applications in areas which need soft skills development.
Keywords: Video Games; Soft Skills; Training; Skilling Development; Emotions; Cognitive Abilities; Flappy Bird; Pacman; Tetris
|
|
|
Arash Akbarinia, C. Alejandro Parraga, Marta Exposito, Bogdan Raducanu, & Xavier Otazu. (2017). Can biological solutions help computers detect symmetry? In 40th European Conference on Visual Perception.
|
|