|
Muhammad Muzzamil Luqman, Jean-Yves Ramel, & Josep Llados. (2013). Multilevel Analysis of Attributed Graphs for Explicit Graph Embedding in Vector Spaces. In Graph Embedding for Pattern Analysis (pp. 1–26). Springer New York.
Abstract: Ability to recognize patterns is among the most crucial capabilities of human beings for their survival, which enables them to employ their sophisticated neural and cognitive systems [1], for processing complex audio, visual, smell, touch, and taste signals. Man is the most complex and the best existing system of pattern recognition. Without any explicit thinking, we continuously compare, classify, and identify huge amount of signal data everyday [2], starting from the time we get up in the morning till the last second we fall asleep. This includes recognizing the face of a friend in a crowd, a spoken word embedded in noise, the proper key to lock the door, smell of coffee, the voice of a favorite singer, the recognition of alphabetic characters, and millions of more tasks that we perform on regular basis.
|
|
|
Mohammad Ali Bagheri, Qigang Gao, & Sergio Escalera. (2013). Logo recognition Based on the Dempster-Shafer Fusion of Multiple Classifiers. In 26th Canadian Conference on Artificial Intelligence (Vol. 7884, pp. 1–12). Springer Berlin Heidelberg.
Abstract: Best paper award
The performance of different feature extraction and shape description methods in trademark image recognition systems have been studied by several researchers. However, the potential improvement in classification through feature fusion by ensemble-based methods has remained unattended. In this work, we evaluate the performance of an ensemble of three classifiers, each trained on different feature sets. Three promising shape description techniques, including Zernike moments, generic Fourier descriptors, and shape signature are used to extract informative features from logo images, and each set of features is fed into an individual classifier. In order to reduce recognition error, a powerful combination strategy based on the Dempster-Shafer theory is utilized to fuse the three classifiers trained on different sources of information. This combination strategy can effectively make use of diversity of base learners generated with different set of features. The recognition results of the individual classifiers are compared with those obtained from fusing the classifiers’ output, showing significant performance improvements of the proposed methodology.
Keywords: Logo recognition; ensemble classification; Dempster-Shafer fusion; Zernike moments; generic Fourier descriptor; shape signature
|
|
|
Michal Drozdzal, Santiago Segui, Petia Radeva, Carolina Malagelada, Fernando Azpiroz, & Jordi Vitria. (2013). An Application for Efficient Error-Free Labeling of Medical Images. In Multimodal Interaction in Image and Video Applications (Vol. 48, pp. 1–16). Springer Berlin Heidelberg.
Abstract: In this chapter we describe an application for efficient error-free labeling of medical images. In this scenario, the compilation of a complete training set for building a realistic model of a given class of samples is not an easy task, making the process tedious and time consuming. For this reason, there is a need for interactive labeling applications that minimize the effort of the user while providing error-free labeling. We propose a new algorithm that is based on data similarity in feature space. This method actively explores data in order to find the best label-aligned clustering and exploits it to reduce the labeler effort, that is measured by the number of “clicks. Moreover, error-free labeling is guaranteed by the fact that all data and their labels proposals are visually revised by en expert.
|
|
|
Alex Pardo, Albert Clapes, Sergio Escalera, & Oriol Pujol. (2013). Actions in Context: System for people with Dementia. In 2nd International Workshop on Citizen Sensor Networks (Citisen2013) at the European Conference on Complex Systems (pp. 3–14). Springer International Publishing.
Abstract: In the next forty years, the number of people living with dementia is expected to triple. In the last stages, people affected by this disease become dependent. This hinders the autonomy of the patient and has a huge social impact in time, money and effort. Given this scenario, we propose an ubiquitous system capable of recognizing daily specific actions. The system fuses and synchronizes data obtained from two complementary modalities – ambient and egocentric. The ambient approach consists in a fixed RGB-Depth camera for user and object recognition and user-object interaction, whereas the egocentric point of view is given by a personal area network (PAN) formed by a few wearable sensors and a smartphone, used for gesture recognition. The system processes multi-modal data in real-time, performing paralleled task recognition and modality synchronization, showing high performance recognizing subjects, objects, and interactions, showing its reliability to be applied in real case scenarios.
Keywords: Multi-modal data Fusion; Computer vision; Wearable sensors; Gesture recognition; Dementia
|
|
|
Ernest Valveny, Oriol Ramos Terrades, Joan Mas, & Marçal Rusiñol. (2013). Interactive Document Retrieval and Classification. In Angel Sappa, & Jordi Vitria (Eds.), Multimodal Interaction in Image and Video Applications (Vol. 48, pp. 17–30). Springer Berlin Heidelberg.
Abstract: In this chapter we describe a system for document retrieval and classification following the interactive-predictive framework. In particular, the system addresses two different scenarios of document analysis: document classification based on visual appearance and logo detection. These two classical problems of document analysis are formulated following the interactive-predictive model, taking the user interaction into account to make easier the process of annotating and labelling the documents. A system implementing this model in a real scenario is presented and analyzed. This system also takes advantage of active learning techniques to speed up the task of labelling the documents.
|
|
|
Juan Ramon Terven Salinas, Joaquin Salas, & Bogdan Raducanu. (2013). Estado del Arte en Sistemas de Vision Artificial para Personas Invidentes. KS - Komputer Sapiens, 20–25.
|
|
|
Ariel Amato, Angel Sappa, Alicia Fornes, Felipe Lumbreras, & Josep Llados. (2013). Divide and Conquer: Atomizing and Parallelizing A Task in A Mobile Crowdsourcing Platform. In 2nd International ACM Workshop on Crowdsourcing for Multimedia (pp. 21–22).
Abstract: In this paper we present some conclusions about the advantages of having an efficient task formulation when a crowdsourcing platform is used. In particular we show how the task atomization and distribution can help to obtain results in an efficient way. Our proposal is based on a recursive splitting of the original task into a set of smaller and simpler tasks. As a result both more accurate and faster solutions are obtained. Our evaluation is performed on a set of ancient documents that need to be digitized.
|
|
|
Sergio Escalera. (2013). Multi-Modal Human Behaviour Analysis from Visual Data Sources. ERCIM - ERCIM News journal, 21–22.
Abstract: The Human Pose Recovery and Behaviour Analysis group (HuPBA), University of Barcelona, is developing a line of research on multi-modal analysis of humans in visual data. The novel technology is being applied in several scenarios with high social impact, including sign language recognition, assisted technology and supported diagnosis for the elderly and people with mental/physical disabilities, fitness conditioning, and Human Computer Interaction.
|
|
|
Kaida Xiao, Chenyang Fu, D.Mylonas, Dimosthenis Karatzas, & S. Wuerger. (2013). Unique Hue Data for Colour Appearance Models. Part ii: Chromatic Adaptation Transform. CRA - Color Research & Application, 38(1), 22–29.
Abstract: Unique hue settings of 185 observers under three room-lighting conditions were used to evaluate the accuracy of full and mixed chromatic adaptation transform models of CIECAM02 in terms of unique hue reproduction. Perceptual hue shifts in CIECAM02 were evaluated for both models with no clear difference using the current Commission Internationale de l'Éclairage (CIE) recommendation for mixed chromatic adaptation ratio. Using our large dataset of unique hue data as a benchmark, an optimised parameter is proposed for chromatic adaptation under mixed illumination conditions that produces more accurate results in unique hue reproduction. © 2011 Wiley Periodicals, Inc. Col Res Appl, 2013
|
|
|
S.Grau, Ana Puig, Sergio Escalera, & Maria Salamo. (2013). Intelligent Interactive Volume Classification. In Pacific Graphics (Vol. 32, pp. 23–28).
Abstract: This paper defines an intelligent and interactive framework to classify multiple regions of interest from the original data on demand, without requiring any preprocessing or previous segmentation. The proposed intelligent and interactive approach is divided in three stages: visualize, training and testing. First, users visualize and label some samples directly on slices of the volume. Training and testing are based on a framework of Error Correcting Output Codes and Adaboost classifiers that learn to classify each region the user has painted. Later, at the testing stage, each classifier is directly applied on the rest of samples and combined to perform multi-class labeling, being used in the final rendering. We also parallelized the training stage using a GPU-based implementation for
obtaining a rapid interaction and classification.
|
|
|
Joost Van de Weijer, & Fahad Shahbaz Khan. (2013). Fusing Color and Shape for Bag-of-Words Based Object Recognition. In 4th Computational Color Imaging Workshop (Vol. 7786, pp. 25–34). Springer Berlin Heidelberg.
Abstract: In this article we provide an analysis of existing methods for the incorporation of color in bag-of-words based image representations. We propose a list of desired properties on which bases fusing methods can be compared. We discuss existing methods and indicate shortcomings of the two well-known fusing methods, namely early and late fusion. Several recent works have addressed these shortcomings by exploiting top-down information in the bag-of-words pipeline: color attention which is motivated from human vision, and Portmanteau vocabularies which are based on information theoretic compression of product vocabularies. We point out several remaining challenges in cue fusion and provide directions for future research.
Keywords: Object Recognition; color features; bag-of-words; image classification
|
|
|
Carles Sanchez, Jorge Bernal, Debora Gil, & F. Javier Sanchez. (2013). On-line lumen centre detection in gastrointestinal and respiratory endoscopy. In Klaus Miguel Angel and Drechsler Stefan and González Ballester Raj and Wesarg Cristina and Shekhar Marius George and Oyarzun Laura M. and L. Erdt (Ed.), Second International Workshop Clinical Image-Based Procedures (Vol. 8361, pp. 31–38). LNCS. Springer International Publishing.
Abstract: We present in this paper a novel lumen centre detection for gastrointestinal and respiratory endoscopic images. The proposed method is based on the appearance and geometry of the lumen, which we defined as the darkest image region which centre is a hub of image gradients. Experimental results validated on the first public annotated gastro-respiratory database prove the reliability of the method for a wide range of images (with precision over 95 %).
Keywords: Lumen centre detection; Bronchoscopy; Colonoscopy
|
|
|
Joost Van de Weijer, Fahad Shahbaz Khan, & Marc Masana. (2013). Interactive Visual and Semantic Image Retrieval. In Angel Sappa, & Jordi Vitria (Eds.), Multimodal Interaction in Image and Video Applications (Vol. 48, pp. 31–35). Springer Berlin Heidelberg.
Abstract: One direct consequence of recent advances in digital visual data generation and the direct availability of this information through the World-Wide Web, is a urgent demand for efficient image retrieval systems. The objective of image retrieval is to allow users to efficiently browse through this abundance of images. Due to the non-expert nature of the majority of the internet users, such systems should be user friendly, and therefore avoid complex user interfaces. In this chapter we investigate how high-level information provided by recently developed object recognition techniques can improve interactive image retrieval. Wel apply a bagof- word based image representation method to automatically classify images in a number of categories. These additional labels are then applied to improve the image retrieval system. Next to these high-level semantic labels, we also apply a low-level image description to describe the composition and color scheme of the scene. Both descriptions are incorporated in a user feedback image retrieval setting. The main objective is to show that automatic labeling of images with semantic labels can improve image retrieval results.
|
|
|
David Fernandez, Simone Marinai, Josep Llados, & Alicia Fornes. (2013). Contextual Word Spotting in Historical Manuscripts using Markov Logic Networks. In 2nd International Workshop on Historical Document Imaging and Processing (pp. 36–43).
Abstract: Natural languages can often be modelled by suitable grammars whose knowledge can improve the word spotting results. The implicit contextual information is even more useful when dealing with information that is intrinsically described as one collection of records. In this paper, we present one approach to word spotting which uses the contextual information of records to improve the results. The method relies on Markov Logic Networks to probabilistically model the relational organization of handwritten records. The performance has been evaluated on the Barcelona Marriages Dataset that contains structured handwritten records that summarize marriage information.
|
|
|
Miquel Ferrer, I. Bardaji, Ernest Valveny, Dimosthenis Karatzas, & Horst Bunke. (2013). Median Graph Computation by Means of Graph Embedding into Vector Spaces. In Yun Fu, & Yungian Ma (Eds.), Graph Embedding for Pattern Analysis (pp. 45–72). Springer New York.
Abstract: In pattern recognition [8, 14], a key issue to be addressed when designing a system is how to represent input patterns. Feature vectors is a common option. That is, a set of numerical features describing relevant properties of the pattern are computed and arranged in a vector form. The main advantages of this kind of representation are computational simplicity and a well sound mathematical foundation. Thus, a large number of operations are available to work with vectors and a large repository of algorithms for pattern analysis and classification exist. However, the simple structure of feature vectors might not be the best option for complex patterns where nonnumerical features or relations between different parts of the pattern become relevant.
|
|