|
David Geronimo. (2010). A Global Approach to Vision-Based Pedestrian Detection for Advanced Driver Assistance Systems (Antonio Lopez, Krystian Mikolajczyk, Jaume Amores, Dariu M. Gavrila, Oriol Pujol, & Felipe Lumbreras, Eds.). Ph.D. thesis, Ediciones Graficas Rey.
Abstract: At the beginning of the 21st century, traffic accidents have become a major problem not only for developed countries but also for emerging ones. As in other scientific areas in which Artificial Intelligence is becoming a key actor, advanced driver assistance systems, and specifically pedestrian protection systems based on Computer Vision, are becoming a strong research topic aimed at improving the safety of pedestrians. However, the challenge is of considerable complexity due to the varying appearance of humans (e.g., clothes, size, aspect ratio, shape), the dynamic nature of on-board systems and the unstructured moving environments that urban scenarios represent. In addition, the required performance is demanding both in terms of computational time and detection rates. In this thesis, instead of focusing on improving specific tasks, as is frequent in the literature, we present a global approach to the problem. This global overview starts with the proposal of a generic architecture to be used as a framework both to review the literature and to organize the techniques studied throughout the thesis. We then focus the research on tasks such as foreground segmentation, object classification and refinement, following a general viewpoint and exploring aspects that are not usually analyzed. In order to perform the experiments, we also present a novel pedestrian dataset that consists of three subsets, each one intended for the evaluation of a different specific task in the system. The results presented in this thesis not only culminate in a proposal for a pedestrian detection system but also go one step further by pointing out new insights, formalizing existing and proposed algorithms, introducing new techniques and evaluating their performance, which we hope will provide new foundations for future research in the area.
|
|
|
Carolina Malagelada, F. De Lorio, Fernando Azpiroz, Santiago Segui, Petia Radeva, Anna Accarino, et al. (2010). Intestinal Dysmotility in Patients with Functional Intestinal Disorders Demonstrated by Computer Vision Analysis of Capsule Endoscopy Images. In 18th United European Gastroenterology Week (Vol. 56, pp. A19–20).
|
|
|
Miguel Angel Bautista, Xavier Baro, Oriol Pujol, Petia Radeva, Jordi Vitria, & Sergio Escalera. (2010). Compact Evolutive Design of Error-Correcting Output Codes. In Supervised and Unsupervised Ensemble Methods and their Applications in the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (pp. 119–128).
Abstract: The classification of a large number of object categories is a challenging trend in the Machine Learning field. In the literature, this is often addressed using an ensemble of classifiers. In this scope, the Error-Correcting Output Codes framework has proven to be a powerful tool for the combination of classifiers. However, most state-of-the-art ECOC approaches use a linear or exponential number of classifiers, making the discrimination of a large number of classes infeasible. In this paper, we explore and propose a minimal design of ECOC in terms of the number of classifiers. Evolutionary computation is used for tuning the parameters of the classifiers and looking for the best Minimal ECOC code configuration. The results over several public UCI data sets and a challenging multi-class Computer Vision problem show that the proposed methodology obtains comparable and even better results than state-of-the-art ECOC methodologies with far fewer dichotomizers.
Keywords: Ensemble of Dichotomizers; Error-Correcting Output Codes; Evolutionary optimization
|
|
|
Neus Salvatella, E Fernandez-Nofrerias, Francesco Ciompi, Oriol Rodriguez-Leor, Xavier Carrillo, R. Hemetsberger, et al. (2010). Canvis de volum a la arteria radial despres de la administracio de dos tractaments vasodilatadors. Avaluacio mitjançant ecografia intravascular. In 22nd Congres Societat Catalana de Cardiologia, (179).
|
|
|
Oriol Rodriguez-Leor, R. Hemetsberger, Francesco Ciompi, E Fernandez-Nofrerias, Angel Serrano, M. Bernet, et al. (2010). Caracteritzacio automatica de la placa mitjançant analisis del espectre de radiofreqüencia en estudi de ecografia intracoronaria: resultat de la fusio de dades invivo i exvivo. In 22nd Congres Societat Catalana de Cardiologia, (131).
|
|
|
Pierluigi Casale, Oriol Pujol, & Petia Radeva. (2010). Embedding Random Projections in Regularized Gradient Boosting Machines. In Supervised and Unsupervised Ensemble Methods and their Applications in the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (pp. 44–53).
|
|
|
Sergio Escalera, Petia Radeva, Jordi Vitria, Xavier Baro, & Bogdan Raducanu. (2010). Modelling and Analyzing Multimodal Dyadic Interactions Using Social Networks. In 12th International Conference on Multimodal Interfaces and 7th Workshop on Machine Learning for Multimodal Interaction.
Abstract: Social network analysis has become a common technique used to model and quantify the properties of social interactions. In this paper, we propose an integrated framework to explore the characteristics of a social network extracted from multimodal dyadic interactions. First, speech detection is performed through an audio/visual fusion scheme based on stacked sequential learning. In the audio domain, speech is detected through clustering of audio features. Clusters are modelled by means of a one-state Hidden Markov Model containing a diagonal covariance Gaussian Mixture Model. In the visual domain, speech detection is performed through differential-based feature extraction from the segmented mouth region and a dynamic programming matching procedure. Second, in order to model the dyadic interactions, we employ the Influence Model, whose states encode the previously integrated audio/visual data. Third, the social network is extracted based on the estimated influences. For our study, we used a set of videos belonging to the New York Times' Blogging Heads opinion blog. The results are reported in terms of both the accuracy of the audio/visual data fusion and the centrality measures used to characterize the social network.
Keywords: Social interaction; Multimodal fusion; Influence model; Social network analysis
|
|
|
Monica Piñol. (2010). Adaptative Vocabulary Tree for Image Classification using Reinforcement Learning (Vol. 162). Master's thesis.
|
|
|
David Geronimo, & Antonio Lopez. (2010). Deteccion de Peatones para Sistemas Avanzados de Asistencia al Conductor.
Abstract: Driver assistance systems, and pedestrian protection systems in particular, are one of the most active research fields devoted to improving road safety. The main challenge is the development of reliable on-board pedestrian detection systems. In this review of the state of the art in pedestrian detection, the problem is divided into different stages, each with its own responsibilities within the system. This division facilitates the subsequent analysis and discussion of each of the methods in the literature and makes them easier to compare. Finally, the most important topics in the field are discussed, with special emphasis on current needs and future challenges.
|
|
|
Joan Serrat, & Antonio Lopez. (2010). Deteccion automatica de lineas de carril para la asistencia a la conduccion.
Abstract: Camera-based detection of lane lines on roads can be an affordable solution to the driving risks posed by overtaking or lane departures. This work proposes a system that runs in real time and obtains very good results. The system is designed to identify the lines under unfavorable visibility conditions, such as night driving or the presence of other vehicles that obstruct the view.
|
|
|
David Geronimo, & Antonio Lopez. (2010). Sistema de deteccion de peatones.
Abstract: Over the next decade, pedestrian protection systems will play a fundamental role in the challenge of improving road safety. The main goal of these systems, detecting pedestrians in urban environments, involves processing images of outdoor scenes from a mobile platform to search for objects of variable appearance such as people. Given these difficulties, these systems make use of the latest computer vision techniques. This proposal consists of a three-module system based on both 2D and 3D information. The first module uses 3D information to estimate the road parameters and select regions of interest to be analyzed later. The second module uses a 2D window classifier to label those regions as pedestrian or non-pedestrian. The final module uses the 3D information again to verify the classified regions and, with 2D information, refine the final results. The experimental results are positive in terms of both performance and computation time.
|
|
|
Partha Pratim Roy, Umapada Pal, & Josep Llados. (2010). Query Driven Word Retrieval in Graphical Documents. In 9th IAPR International Workshop on Document Analysis Systems (pp. 191–198).
Abstract: In this paper, we present an approach to the retrieval of words from graphical document images. In graphical documents, word indexing is a challenging task due to the presence of multi-oriented characters in non-structured layouts. The proposed approach uses the recognition results of individual components to form character pairs with neighboring components. An indexing scheme is designed to store the spatial description of components and to access them efficiently. Given a query text word (ASCII/Unicode format), the character pairs present in it are searched for in the document. Next, the retrieved character pairs are linked sequentially to form a character string. Dynamic programming is applied to find different instances of the query word, with a string edit distance to the query word used as the objective function. Multi-scale and multi-oriented character components are recognized using a Support Vector Machine classifier. To handle multi-oriented character strings, the features used in the SVM are invariant to character orientation. Experimental results show that the method is effective at locating a query word in multi-oriented text in graphical documents.
|
|
|
Marçal Rusiñol, & Josep Llados. (2010). Efficient Logo Retrieval Through Hashing Shape Context Descriptors. In 9th IAPR International Workshop on Document Analysis Systems (pp. 215–222).
|
|
|
Sebastien Mace, Herve Locteau, Ernest Valveny, & Salvatore Tabbone. (2010). A system to detect rooms in architectural floor plan images. In 9th IAPR International Workshop on Document Analysis Systems (pp. 167–174).
Abstract: In this article, a system to detect rooms in architectural floor plan images is described. We first present a primitive extraction algorithm for line detection. It is based on an original coupling of the classical Hough transform with image vectorization in order to perform robust and efficient line detection. We show how the lines that satisfy certain graphical arrangements are combined into walls. We also present the way we detect door hypotheses through the extraction of arcs. Walls and door hypotheses are then used by our room segmentation strategy, which consists of recursively decomposing the image until nearly convex regions are obtained. The notion of convexity is difficult to quantify, and the selection of separation lines between regions can also be rough. We take advantage of knowledge associated with architectural floor plans in order to obtain mostly rectangular rooms. Qualitative and quantitative evaluations performed on a corpus of real documents show promising results.
|
|
|
Albert Gordo, Alicia Fornes, Ernest Valveny, & Josep Llados. (2010). A Bag of Notes Approach to Writer Identification in Old Handwritten Music Scores. In 9th IAPR International Workshop on Document Analysis Systems (pp. 247–254).
Abstract: Determining the authorship of a document, namely writer identification, can be an important source of information for document categorization. In contrast to text documents, identifying the writer of graphical documents is still a challenge. In this paper we present a robust approach for writer identification in a particular kind of graphical document, old music scores. This approach adapts the bag of visual terms method to cope with graphical documents. The identification is performed using only the graphical music notation. For this purpose, we generate a graphic vocabulary without recognizing any music symbols, thereby avoiding the difficulties of recognizing hand-drawn symbols in old and degraded documents. The proposed method has been tested on a database of old music scores from the 17th to 19th centuries, achieving very high identification rates.
|
|