|
Pau Riba, Josep Llados, & Alicia Fornes. (2017). Error-tolerant coarse-to-fine matching model for hierarchical graphs. In Pasquale Foggia, Cheng-Lin Liu, & Mario Vento (Eds.), 11th IAPR-TC-15 International Workshop on Graph-Based Representations in Pattern Recognition (Vol. 10310, pp. 107–117). Springer International Publishing.
Abstract: Graph-based representations are effective tools to capture structural information from visual elements. However, retrieving a query graph from a large database of graphs implies a high computational complexity. Moreover, these representations are very sensitive to noise or small changes. In this work, a novel hierarchical graph representation is designed. Using graph clustering techniques adapted from graph-based social media analysis, we propose to generate a hierarchy able to deal with different levels of abstraction while keeping information about the topology. For the proposed representations, a coarse-to-fine matching method is defined. These approaches are validated using real scenarios such as classification of colour images and handwritten word spotting.
Keywords: Graph matching; Hierarchical graph; Graph-based representation; Coarse-to-fine matching
|
|
|
Marc Bolaños, Alvaro Peris, Francisco Casacuberta, & Petia Radeva. (2017). VIBIKNet: Visual Bidirectional Kernelized Network for Visual Question Answering. In 8th Iberian Conference on Pattern Recognition and Image Analysis.
Abstract: In this paper, we address the problem of visual question answering by proposing a novel model, called VIBIKNet. Our model is based on integrating Kernelized Convolutional Neural Networks and Long Short-Term Memory units to generate an answer given a question about an image. We prove that VIBIKNet is an optimal trade-off between accuracy and computational load, in terms of memory and time consumption. We validate our method on the VQA challenge dataset and compare it to the top performing methods in order to illustrate its performance and speed.
Keywords: Visual Question Answering; Convolutional Neural Networks; Long short-term memory networks
|
|
|
Veronica Romero, Alicia Fornes, Enrique Vidal, & Joan Andreu Sanchez. (2017). Information Extraction in Handwritten Marriage Licenses Books Using the MGGI Methodology. In L.A. Alexandre, J.Salvador Sanchez, & Joao M. F. Rodriguez (Eds.), 8th Iberian Conference on Pattern Recognition and Image Analysis (Vol. 10255, pp. 287–294). LNCS.
Abstract: Historical records of daily activities provide intriguing insights into the life of our ancestors, useful for demographic and genealogical research. For example, marriage license books have been used for centuries by ecclesiastical and secular institutions to register marriages. These books follow a simple structure of the text in the records with an evolutionary vocabulary, mainly composed of proper names that change over time. This distinct vocabulary makes automatic transcription and semantic information extraction difficult tasks. In previous works we studied the use of category-based language models and how a Grammatical Inference technique known as MGGI could improve the accuracy of these tasks. In this work we analyze the main causes of the semantic errors observed in previous results and apply a better implementation of the MGGI technique to solve these problems. Using the resulting language model, transcription and information extraction experiments have been carried out, and the results support our proposed approach.
Keywords: Handwritten Text Recognition; Information extraction; Language modeling; MGGI; Categories-based language model
|
|
|
Antonio Lopez, Jiaolong Xu, Jose Luis Gomez, David Vazquez, & German Ros. (2017). From Virtual to Real World Visual Perception using Domain Adaptation -- The DPM as Example. In Gabriela Csurka (Ed.), Domain Adaptation in Computer Vision Applications (pp. 243–258). Springer.
Abstract: Supervised learning tends to produce more accurate classifiers than unsupervised learning in general. This implies that training data is preferred with annotations. When addressing visual perception challenges, such as localizing certain object classes within an image, the learning of the involved classifiers turns out to be a practical bottleneck. The reason is that, at least, we have to frame object examples with bounding boxes in thousands of images. A priori, the more complex the model is regarding its number of parameters, the more annotated examples are required. This annotation task is performed by human oracles, which ends up in inaccuracies and errors in the annotations (aka ground truth) since the task is inherently very cumbersome and sometimes ambiguous. As an alternative we have pioneered the use of virtual worlds for collecting such annotations automatically and with high precision. However, since the models learned with virtual data must operate in the real world, we still need to perform domain adaptation (DA). In this chapter we revisit the DA of a deformable part-based model (DPM) as an exemplifying case of virtual-to-real-world DA. As a use case, we address the challenge of vehicle detection for driver assistance, using different publicly available virtual-world data. While doing so, we investigate questions such as: how does the domain gap behave due to virtual-vs-real data with respect to dominant object appearance per domain, as well as the role of photo-realism in the virtual world.
Keywords: Domain Adaptation
|
|
|
Maryam Asadi-Aghbolaghi, Albert Clapes, Marco Bellantonio, Hugo Jair Escalante, Victor Ponce, Xavier Baro, et al. (2017). Deep Learning for Action and Gesture Recognition in Image Sequences: A Survey. In Gesture Recognition (pp. 539–578).
Abstract: Interest in automatic action and gesture recognition has grown considerably in the last few years. This is due in part to the large number of application domains for this type of technology. As in many other computer vision areas, deep learning based methods have quickly become a reference methodology for obtaining state-of-the-art performance in both tasks. This chapter is a survey of current deep learning based methodologies for action and gesture recognition in sequences of images. The survey reviews both fundamental and cutting edge methodologies reported in the last few years. We introduce a taxonomy that summarizes important aspects of deep learning for approaching both tasks. Details of the proposed architectures, fusion strategies, main datasets, and competitions are reviewed. Also, we summarize and discuss the main works proposed so far with particular interest on how they treat the temporal dimension of data, their highlighting features, and opportunities and challenges for future research. To the best of our knowledge this is the first survey on the topic. We foresee this survey will become a reference in this ever dynamic field of research.
Keywords: Action recognition; Gesture recognition; Deep learning architectures; Fusion strategies
|
|
|
Hana Jarraya, Muhammad Muzzamil Luqman, & Jean-Yves Ramel. (2017). Improving Fuzzy Multilevel Graph Embedding Technique by Employing Topological Node Features: An Application to Graphics Recognition. In B. Lamiroy, & R Dueire Lins (Eds.), Graphics Recognition. Current Trends and Challenges (Vol. 9657). LNCS. Springer.
|
|
|
Daniel Hernandez, Antonio Espinosa, David Vazquez, Antonio Lopez, & Juan Carlos Moure. (2017). Embedded Real-time Stixel Computation. In GPU Technology Conference.
Keywords: GPU; CUDA; Stixels; Autonomous Driving
|
|
|
David Vazquez, Jorge Bernal, F. Javier Sanchez, Gloria Fernandez Esparrach, Antonio Lopez, Adriana Romero, et al. (2017). A Benchmark for Endoluminal Scene Segmentation of Colonoscopy Images. In 31st International Congress and Exhibition on Computer Assisted Radiology and Surgery.
Abstract: Colorectal cancer (CRC) is the third cause of cancer death worldwide. Currently, the standard approach to reduce CRC-related mortality is to perform regular screening in search for polyps, and colonoscopy is the screening tool of choice. The main limitations of this screening procedure are polyp miss-rate and inability to perform visual assessment of polyp malignancy. These drawbacks can be reduced by designing Decision Support Systems (DSS) aiming to help clinicians in the different stages of the procedure by providing endoluminal scene segmentation. Thus, in this paper, we introduce an extended benchmark of colonoscopy images, with the hope of establishing a new strong benchmark for colonoscopy image analysis research. We provide new baselines on this dataset by training standard fully convolutional networks (FCN) for semantic segmentation and significantly outperforming, without any further post-processing, prior results in endoluminal scene segmentation.
Keywords: Deep Learning; Medical Imaging
|
|
|
Pau Rodriguez, Jordi Gonzalez, Jordi Cucurull, Josep M. Gonfaus, & Xavier Roca. (2017). Regularizing CNNs with Locally Constrained Decorrelations. In 5th International Conference on Learning Representations.
|
|
|
Mireia Sole, Joan Blanco, Debora Gil, Oliver Valero, G. Fonseka, M. Lawrie, et al. (2017). Chromosome Territories in Mice Spermatogenesis: A new three-dimensional methodology of study. In 11th European CytoGenesis Conference.
|
|
|
Antonio Lopez, Atsushi Imiya, Tomas Pajdla, & Jose Manuel Alvarez. (2017). Computer Vision in Vehicle Technology: Land, Sea & Air. John Wiley & Sons, Ltd.
Abstract: This chapter examines different vision-based commercial solutions for real-life problems related to vehicles. It is worth mentioning the recent astonishing performance of deep convolutional neural networks (DCNNs) in difficult visual tasks such as image classification, object recognition/localization/detection, and semantic segmentation. In fact, different DCNN architectures are already being explored for low-level tasks such as optical flow and disparity computation, and higher level ones such as place recognition.
|
|
|
Quentin Angermann, Jorge Bernal, Cristina Sanchez Montes, Maroua Hammami, Gloria Fernandez Esparrach, Xavier Dray, et al. (2017). Real-Time Polyp Detection in Colonoscopy Videos: A Preliminary Study For Adapting Still Frame-based Methodology To Video Sequences Analysis. In 31st International Congress and Exhibition on Computer Assisted Radiology and Surgery.
|
|
|
Lasse Martensson, Anders Hast, & Alicia Fornes. (2017). Word Spotting as a Tool for Scribal Attribution. In 2nd Conference of the association of Digital Humanities in the Nordic Countries (pp. 87–89).
|
|
|
Mireia Sole, Joan Blanco, Debora Gil, G. Fonseka, Richard Frodsham, Oliver Valero, et al. (2017). Is there a pattern of Chromosome territoriality along mice spermatogenesis? In 3rd Spanish MeioNet Meeting Abstract Book (pp. 55–56).
|
|
|
Mireia Sole, Joan Blanco, Debora Gil, G. Fonseka, Richard Frodsham, Oliver Valero, et al. (2017). Unraveling the enigmas of chromosome territoriality during spermatogenesis. In IX Jornada del Departament de Biologia Cel·lular, Fisiologia i Immunologia.
|
|