Adriana Romero, Petia Radeva, & Carlo Gatta. (2014). No more meta-parameter tuning in unsupervised sparse feature learning. CoRR abs/1402.5766.
Abstract: We propose a meta-parameter free, off-the-shelf, simple and fast unsupervised feature learning algorithm, which exploits a new way of optimizing for sparsity. Experiments on STL-10 show that the method presents state-of-the-art performance and provides discriminative features that generalize well.
|
Jorge Bernal, Joan M. Nuñez, F. Javier Sanchez, & Fernando Vilariño. (2014). Polyp Segmentation Method in Colonoscopy Videos by means of MSA-DOVA Energy Maps Calculation. In 3rd MICCAI Workshop on Clinical Image-based Procedures: Translational Research in Medical Imaging (Vol. 8680, pp. 41–49).
Abstract: In this paper we present a novel polyp region segmentation method for colonoscopy videos. Our method uses valley information associated with polyp boundaries to provide an initial segmentation. This first segmentation is refined to eliminate boundary discontinuities caused by image artifacts or other elements of the scene. Experimental results on a publicly annotated database show that our method outperforms both general and specific segmentation methods by providing more accurate regions rich in polyp content. We also show that image preprocessing is needed to improve the final polyp region segmentation.
Keywords: Image segmentation; Polyps; Colonoscopy; Valley information; Energy maps
|
P. Ricaurte, C. Chilan, Cristhian A. Aguilera-Carrasco, Boris X. Vintimilla, & Angel Sappa. (2014). Performance Evaluation of Feature Point Descriptors in the Infrared Domain. In 9th International Conference on Computer Vision Theory and Applications (Vol. 1, pp. 545–550).
Abstract: This paper presents a comparative evaluation of classical feature point descriptors when they are used in the long-wave infrared spectral band. Robustness to changes in rotation, scaling, blur, and additive noise is evaluated using a state-of-the-art framework. Statistical results on an outdoor image data set are presented, together with a discussion of how they differ from the results obtained with images from the visible spectrum.
Keywords: Infrared Imaging; Feature Point Descriptors
|
Naveen Onkarappa, Cristhian A. Aguilera-Carrasco, Boris X. Vintimilla, & Angel Sappa. (2014). Cross-spectral Stereo Correspondence using Dense Flow Fields. In 9th International Conference on Computer Vision Theory and Applications (Vol. 3, pp. 613–617).
Abstract: This manuscript addresses the cross-spectral stereo correspondence problem. It proposes the usage of a dense flow field based representation instead of the original cross-spectral images, which have a low correlation. In this way, working in the flow field space, classical cost functions can be used as similarity measures. Preliminary experimental results on urban environments have been obtained showing the validity of the proposed approach.
Keywords: Cross-spectral Stereo Correspondence; Dense Optical Flow; Infrared and Visible Spectrum
|
Ariel Amato, Felipe Lumbreras, & Angel Sappa. (2014). A General-purpose Crowdsourcing Platform for Mobile Devices. In 9th International Conference on Computer Vision Theory and Applications (Vol. 3, pp. 211–215).
Abstract: This paper presents details of a general-purpose, on-demand micro-task platform based on the crowdsourcing philosophy. This platform was specifically developed for mobile devices in order to exploit the strengths of such devices, namely: i) massivity, ii) ubiquity and iii) embedded sensors. The combined use of mobile platforms and the crowdsourcing model makes it possible to tackle tasks ranging from the simplest to the most complex. User experience is the highlighted feature of this platform, for both the task-proposer and the task-solver. Tools suited to the specific task are provided to the task-solver so that the job can be performed in a simpler, faster and more appealing way. Moreover, a task can be easily submitted by just selecting predefined templates, which cover a wide range of possible applications. Examples of its usage in computer vision and computer games are provided, illustrating the potential of the platform.
Keywords: Crowdsourcing Platform; Mobile Crowdsourcing
|
Christophe Rigaud, Dimosthenis Karatzas, Jean-Christophe Burie, & Jean-Marc Ogier. (2014). Color descriptor for content-based drawing retrieval. In 11th IAPR International Workshop on Document Analysis and Systems (pp. 267–271).
Abstract: Human detection is an active field of research in computer vision. Extending it to human-like drawings, such as the main characters in comic book stories, is not trivial. Comics analysis is a very recent field of research at the intersection of graphics, text, object and people recognition. The detection of the main comic characters is an essential step towards fully automatic comic book understanding. This paper presents a color-based approach for comics character retrieval using content-based drawing retrieval and a color palette.
|
Clement Guerin, Christophe Rigaud, Karell Bertet, Jean-Christophe Burie, Arnaud Revel, & Jean-Marc Ogier. (2014). Réduction de l’espace de recherche pour les personnages de bandes dessinées. In 19th National Congress Reconnaissance de Formes et l'Intelligence Artificielle.
Abstract: Comics are an important cultural heritage in many countries, and their mass digitization makes it possible to search the content of the images. To date, mainly the page structure and the textual content have been studied; few works address the graphical content. We propose to rely on elements that have already been studied, such as the positions of panels and speech balloons, to reduce the search space and localize characters according to the balloon tails. The evaluation of our contributions on the eBDtheque dataset shows a balloon tail detection rate of 81.2%, a character localization rate of up to 85% and a search space reduction of more than 50%.
Keywords: contextual search; document analysis; comics characters
|
Christophe Rigaud, & Clement Guerin. (2014). Localisation contextuelle des personnages de bandes dessinées. In Colloque International Francophone sur l'Écrit et le Document.
Abstract: The authors propose a method for localizing characters in comic book panels that relies on the characteristics of the speech balloons. The evaluation shows a character localization rate of up to 65%.
|
Alicia Fornes, & Gemma Sanchez. (2014). Analysis and Recognition of Music Scores. In D. Doermann, & K. Tombre (Eds.), Handbook of Document Image Processing and Recognition (Vol. E, pp. 749–774). Springer London.
Abstract: The analysis and recognition of music scores has attracted the interest of researchers for decades. Optical Music Recognition (OMR) is a classical research field of Document Image Analysis and Recognition (DIAR), whose aim is to extract information from music scores. Music scores contain both graphical and textual information, and for this reason, techniques are closely related to graphics recognition and text recognition. Since music scores use a particular diagrammatic notation that follows the rules of music theory, many approaches make use of context information to guide the recognition and resolve ambiguities. This chapter gives an overview of the main OMR approaches. Firstly, the different methods are grouped according to the OMR stages, namely, staff removal, music symbol recognition, and syntactical analysis. Secondly, specific approaches for old and handwritten music scores are reviewed. Finally, online approaches and commercial systems are also discussed.
|
Michal Drozdzal. (2014). Sequential image analysis for computer-aided wireless endoscopy (Petia Radeva, Ed.). Ph.D. thesis, Ediciones Graficas Rey.
Abstract: Wireless Capsule Endoscopy (WCE) is a technique for inner-visualization of the entire small intestine and, thus, offers an interesting perspective on intestinal motility. The two major drawbacks of this technique are: 1) the huge amount of data acquired by WCE makes the motility analysis tedious and 2) since the capsule is the first tool that offers complete inner-visualization of the small intestine, the exact importance of the observed events is still an open issue. Therefore, in this thesis, a novel computer-aided system for intestinal motility analysis is presented. The goal of the system is to provide a physician with an easily comprehensible visual description of motility-related intestinal events. In order to do so, several tools based either on computer vision concepts or on machine learning techniques are presented. A method for transforming the 3D video signal into a holistic image of intestinal motility, called the motility bar, is proposed. The method calculates the optimal mapping from video into image from the intestinal motility point of view.
To characterize intestinal motility, methods for automatic extraction of motility information from WCE are presented. Two of them are based on the motility bar and two of them are based on frame-per-frame analysis. In particular, four algorithms dealing with the problems of intestinal contraction detection, lumen size estimation, intestinal content characterization and wrinkle frame detection are proposed and validated. The results of the algorithms are converted into sequential features using an online statistical test. This test is designed to work with multivariate data streams. To this end, we propose a novel formulation of concentration inequality that is introduced into a robust adaptive windowing algorithm for multivariate data streams. The algorithm is used to obtain robust representation of segments with constant intestinal motility activity. The obtained sequential features are shown to be discriminative in the problem of abnormal motility characterization.
Finally, we tackle the problem of efficient labeling. To this end, we incorporate active learning concepts into the problems present in WCE data and propose two approaches. The first one is based on the concepts of sequential learning and the second one adapts partition-based active learning to an error-free labeling scheme. All these steps are sufficient to provide an extensive visual description of intestinal motility that can be used by an expert as a decision support system.
|
Carlo Gatta, Adriana Romero, & Joost Van de Weijer. (2014). Unrolling loopy top-down semantic feedback in convolutional deep networks. In Workshop on Deep Vision: Deep Learning for Computer Vision (pp. 498–505).
Abstract: In this paper, we propose a novel way to perform top-down semantic feedback in convolutional deep networks for efficient and accurate image parsing. We also show how to add global appearance/semantic features, which have been shown to improve image parsing performance in state-of-the-art methods and were not present in previous convolutional approaches. The proposed method is characterised by efficient training and sufficiently fast testing. We use the well-known SIFTflow dataset to show numerically the advantages provided by our contributions, and to compare with state-of-the-art convolutional-based image parsing approaches.
|
Dimosthenis Karatzas, Sergi Robles, & Lluis Gomez. (2014). An on-line platform for ground truthing and performance evaluation of text extraction systems. In 11th IAPR International Workshop on Document Analysis and Systems (pp. 242–246).
Abstract: This paper presents a set of on-line software tools for creating ground truth and calculating performance evaluation metrics for text extraction tasks such as localization, segmentation and recognition. The platform supports the definition of comprehensive ground truth information at different text representation levels while it offers centralised management and quality control of the ground truthing effort. It implements a range of state-of-the-art performance evaluation algorithms and offers functionality for the definition of evaluation scenarios, on-line calculation of various performance metrics and visualisation of the results. The presented platform, which comprises the backbone of the ICDAR 2011 (challenge 1) and 2013 (challenges 1 and 2) Robust Reading competitions, is now made available for public use.
|
Lluis Gomez, & Dimosthenis Karatzas. (2014). MSER-based Real-Time Text Detection and Tracking. In 22nd International Conference on Pattern Recognition (pp. 3110–3115).
Abstract: We present a hybrid algorithm for detection and tracking of text in natural scenes that goes beyond full-detection approaches in terms of time performance optimization. A state-of-the-art scene text detection module based on Maximally Stable Extremal Regions (MSER) is used to detect text asynchronously, while on a separate thread detected text objects are tracked by MSER propagation. The cooperation of these two modules yields real-time video processing at high frame rates even on low-resource devices.
|
Alejandro Tabas, Emili Balaguer-Ballester, & Laura Igual. (2014). Spatial Discriminant ICA for RS-fMRI characterisation. In 4th International Workshop on Pattern Recognition in Neuroimaging (pp. 1–4).
Abstract: Resting-State fMRI (RS-fMRI) is a brain imaging technique useful for exploring functional connectivity. A major point of interest in RS-fMRI analysis is to isolate connectivity patterns characterising disorders such as ADHD. Such characterisation is usually performed in two steps: first, all connectivity patterns in the data are extracted by means of Independent Component Analysis (ICA); second, standard statistical tests are performed over the extracted patterns to find differences between control and clinical groups. In this work we introduce a novel, single-step approach to this problem termed Spatial Discriminant ICA. The algorithm can efficiently isolate networks of functional connectivity characterising a clinical group by combining ICA and a new variant of Fisher's Linear Discriminant, also introduced in this work. As the characterisation is carried out in a single step, it potentially provides a richer characterisation of inter-class differences. The algorithm is tested using synthetic and real fMRI data, showing promising results in both experiments.
|
Oualid M. Benkarim, Petia Radeva, & Laura Igual. (2014). Label Consistent Multiclass Discriminative Dictionary Learning for MRI Segmentation. In 8th Conference on Articulated Motion and Deformable Objects (Vol. 8563, pp. 138–147). LNCS. Springer International Publishing.
Abstract: The automatic segmentation of multiple subcortical structures in brain Magnetic Resonance Images (MRI) still remains a challenging task. In this paper, we address this problem using sparse representation and discriminative dictionary learning, which have shown promising results in compression, image denoising and recently in MRI segmentation. In particular, we use multiclass dictionaries learned from a set of brain atlases to simultaneously segment multiple subcortical structures. We also constrain dictionary atoms to be specialized in one given class using label consistent K-SVD, which can alleviate the bias produced by unbalanced libraries, present when dealing with small structures. The proposed method is compared with other state-of-the-art approaches for the segmentation of the Basal Ganglia of 35 subjects of a public dataset. The promising results of the segmentation method show the efficiency of the multiclass discriminative dictionary learning algorithms in MRI segmentation problems.
Keywords: MRI segmentation; sparse representation; discriminative dictionary learning; multiclass classification
|