David Fernandez, Jon Almazan, Nuria Cirera, Alicia Fornes, & Josep Llados. (2014). BH2M: the Barcelona Historical Handwritten Marriages database. In 22nd International Conference on Pattern Recognition (pp. 256–261).
Abstract: This paper presents an image database of historical handwritten marriage records stored in the archives of Barcelona cathedral, together with the corresponding metadata intended for evaluating the performance of document analysis algorithms. The contribution of this paper is twofold. First, it presents a complete ground truth which covers the whole pipeline of handwriting recognition research, from layout analysis to recognition and understanding. Second, it is the first dataset in the emerging area of genealogical document analysis, where documents are pseudo-structured manuscripts with specific lexicons and the interest goes beyond pure transcription to context-dependent understanding.
|
Pau Riba, Jon Almazan, Alicia Fornes, David Fernandez, Ernest Valveny, & Josep Llados. (2014). e-Crowds: a mobile platform for browsing and searching in historical demography-related manuscripts. In 14th International Conference on Frontiers in Handwriting Recognition (pp. 228–233).
Abstract: This paper presents a prototype system running on portable devices for browsing and word searching through historical handwritten document collections. The platform adapts the paradigm of eBook reading, where the narrative is not necessarily sequential, but centered on the user's actions. The novelty is to replace digitally born books with digitized historical manuscripts of marriage licenses, so document analysis tasks are required in the browser. With an active reading paradigm, the user can issue queries for people's names and thus implicitly follow genealogical links. In addition, the system allows combined searches: the user can refine a search by adding more query words. As a second contribution, the core technology of the retrieval functionality is a word spotting module with a unified approach, which allows combined query searches and two input modalities: query-by-example and query-by-string.
|
Marc Serra, Olivier Penacchio, Robert Benavente, Maria Vanrell, & Dimitris Samaras. (2014). The Photometry of Intrinsic Images. In 27th IEEE Conference on Computer Vision and Pattern Recognition (pp. 1494–1501).
Abstract: Intrinsic characterization of scenes is often the best way to overcome the illumination variability artifacts that complicate most computer vision problems, from 3D reconstruction to object or material recognition. This paper examines the failure of existing intrinsic image models to accurately account for the effects of illuminant color and sensor characteristics in the estimation of intrinsic images, and presents a generic framework which incorporates insights from color constancy research into the intrinsic image decomposition problem. The proposed mathematical formulation includes information about the color of the illuminant and the effects of the camera sensors, both of which modify the observed color of the reflectance of the objects in the scene during the acquisition process. By modeling these effects, we get a “truly intrinsic” reflectance image, which we call absolute reflectance, which is invariant to changes of illuminant or camera sensors. This model allows us to represent a wide range of intrinsic image decompositions depending on the specific assumptions on the geometric properties of the scene configuration and the spectral properties of the light source and the acquisition system, thus unifying previous models in a single general framework. We demonstrate that even partial information about sensors significantly improves the estimated reflectance images, thus making our method applicable to a wide range of sensors. We validate our general intrinsic image framework experimentally with both synthetic data and natural images.
|
Carlo Gatta, Adriana Romero, & Joost Van de Weijer. (2014). Unrolling loopy top-down semantic feedback in convolutional deep networks. In Workshop on Deep Vision: Deep Learning for Computer Vision (pp. 498–505).
Abstract: In this paper, we propose a novel way to perform top-down semantic feedback in convolutional deep networks for efficient and accurate image parsing. We also show how to add global appearance/semantic features, which have been shown to improve image parsing performance in state-of-the-art methods and were not present in previous convolutional approaches. The proposed method is characterised by efficient training and sufficiently fast testing. We use the well-known SIFTflow dataset to show numerically the advantages provided by our contributions, and to compare with state-of-the-art convolution-based image parsing approaches.
|
Jorge Bernal, Joan M. Nuñez, F. Javier Sanchez, & Fernando Vilariño. (2014). Polyp Segmentation Method in Colonoscopy Videos by means of MSA-DOVA Energy Maps Calculation. In 3rd MICCAI Workshop on Clinical Image-based Procedures: Translational Research in Medical Imaging (Vol. 8680, pp. 41–49).
Abstract: In this paper we present a novel polyp region segmentation method for colonoscopy videos. Our method uses valley information associated with polyp boundaries in order to provide an initial segmentation. This first segmentation is refined to eliminate boundary discontinuities caused by image artifacts or other elements of the scene. Experimental results on a publicly available annotated database show that our method outperforms both general and specific segmentation methods by providing more accurate regions rich in polyp content. We also show that image preprocessing is needed to improve the final polyp region segmentation.
Keywords: Image segmentation; Polyps; Colonoscopy; Valley information; Energy maps
|
Patricia Marquez, H. Kause, A. Fuster, Aura Hernandez-Sabate, L. Florack, Debora Gil, et al. (2014). Factors Affecting Optical Flow Performance in Tagging Magnetic Resonance Imaging. In 17th International Conference on Medical Image Computing and Computer Assisted Intervention (Vol. 8896, pp. 231–238). LNCS. Springer International Publishing.
Abstract: Changes in cardiac deformation patterns are correlated with cardiac pathologies. Deformation can be extracted from tagging Magnetic Resonance Imaging (tMRI) using Optical Flow (OF) techniques. For applications of OF in a clinical setting it is important to assess to what extent the performance of a particular OF method is stable across different clinical acquisition artifacts. This paper presents a statistical validation framework, based on ANOVA, to assess the motion and appearance factors that have the largest influence on OF accuracy drop. In order to validate this framework, we created a database of simulated tMRI data including the most common artifacts of MRI and tested three different OF methods, including HARP.
Keywords: Optical flow; Performance Evaluation; Synthetic Database; ANOVA; Tagging Magnetic Resonance Imaging
|
Jorge Bernal, Debora Gil, Carles Sanchez, & F. Javier Sanchez. (2014). Discarding Non Informative Regions for Efficient Colonoscopy Image Analysis. In 1st MICCAI Workshop on Computer-Assisted and Robotic Endoscopy (Vol. 8899, pp. 1–10). LNCS. Springer International Publishing.
Abstract: In this paper we present a novel polyp region segmentation method for colonoscopy videos. Our method uses valley information associated with polyp boundaries in order to provide an initial segmentation. This first segmentation is refined to eliminate boundary discontinuities caused by image artifacts or other elements of the scene. Experimental results on a publicly available annotated database show that our method outperforms both general and specific segmentation methods by providing more accurate regions rich in polyp content. We also show that image preprocessing is needed to improve the final polyp region segmentation.
Keywords: Image Segmentation; Polyps; Colonoscopy; Valley Information; Energy Maps
|
Joan M. Nuñez, Jorge Bernal, Miquel Ferrer, & Fernando Vilariño. (2014). Impact of Keypoint Detection on Graph-based Characterization of Blood Vessels in Colonoscopy Videos. In CARE workshop.
Abstract: We explore the potential of using blood vessels as anatomical landmarks for developing image registration methods in colonoscopy images. An unequivocal representation of blood vessels could be used to guide follow-up methods to track lesions over different interventions. We propose a graph-based representation to characterize network structures, such as blood vessels, based on the use of intersections and endpoints. We present a study assessing the minimal performance a keypoint detector should achieve so that the structure can still be recognized. Experimental results show that, even with a loss of 35% of the keypoints, the descriptive power of the graphs associated with the vessel pattern is still high enough to recognize blood vessels.
Keywords: Colonoscopy; Graph Matching; Biometrics; Vessel; Intersection
|
Sergio Vera, Debora Gil, & Miguel Angel Gonzalez Ballester. (2014). Anatomical parameterization for volumetric meshing of the liver. In SPIE – Medical Imaging (Vol. 9036).
Abstract: A coordinate system describing the interior of organs is a powerful tool for the systematic localization of injured tissue. If the same coordinate values are assigned to specific anatomical landmarks, the coordinate system allows integration of data across different medical image modalities. Harmonic mappings have been used to produce parametric coordinate systems over the surface of anatomical shapes, given their flexibility to set values at specific locations through boundary conditions. However, most of the existing implementations in medical imaging are restricted either to anatomical surfaces or to a depth coordinate whose boundary conditions are given at sites of limited geometric diversity. In this paper we present a method for anatomical volumetric parameterization that extends current harmonic parameterizations to the interior anatomy using information provided by the volume medial surface. We have applied the methodology to define a common reference system for the liver shape and functional anatomy. This reference system sets a solid base for creating anatomical models of the patient’s liver, and allows comparing livers from several patients in a common framework of reference.
Keywords: Coordinate System; Anatomy Modeling; Parameterization
|
Hongxing Gao, Marçal Rusiñol, Dimosthenis Karatzas, & Josep Llados. (2014). Fast Structural Matching for Document Image Retrieval through Spatial Databases. In Document Recognition and Retrieval XXI (Vol. 9021).
Abstract: The structure of document images plays a significant role in document analysis; thus considerable efforts have been made towards extracting and understanding document structure, usually in the form of layout analysis approaches. In this paper, we first employ Distance Transform based MSER (DTMSER) to efficiently extract stable document structural elements in terms of a dendrogram of key-regions. Then a fast structural matching method is proposed to query the structure of the document (dendrogram) based on a spatial database which facilitates the formulation of advanced spatial queries. The experiments demonstrate a significant improvement in a document retrieval scenario when compared to the use of typical Bag of Words (BoW) and pyramidal BoW descriptors.
Keywords: Document image retrieval; distance transform; MSER; spatial database
|
Francesco Brughi, Debora Gil, Llorenç Badiella, Eva Jove Casabella, & Oriol Ramos Terrades. (2014). Exploring the impact of inter-query variability on the performance of retrieval systems. In 11th International Conference on Image Analysis and Recognition (Vol. 8814, pp. 413–420). LNCS. Springer International Publishing.
Abstract: This paper introduces a framework for evaluating the performance of information retrieval systems. Current evaluation metrics provide an average score that does not consider performance variability across the query set. As a result, conclusions lack statistical significance, yielding poor inference to cases outside the query set and possibly unfair comparisons. We propose to apply statistical methods in order to obtain a more informative measure for problems in which different query classes can be identified. In this context, we assess the performance variability on two levels: overall variability across the whole query set and specific query class-related variability. To this end, we estimate confidence bands for precision-recall curves, and we apply ANOVA in order to assess the significance of the performance differences across query classes.
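Note: the ANOVA step mentioned in the abstract can be illustrated with a minimal sketch of a one-way F statistic computed over per-query scores grouped by query class. The function name, the grouping, and the scores below are hypothetical illustrations, not data or code from the paper, and the paper's full framework (including the confidence bands) is not reproduced here.

```python
from statistics import fmean

def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of score groups.

    Each group holds per-query scores (e.g. average precision) for one
    query class. All names and numbers here are illustrative assumptions.
    """
    k = len(groups)                          # number of query classes
    n = sum(len(g) for g in groups)          # total number of queries
    grand_mean = sum(sum(g) for g in groups) / n
    # Variability explained by class membership.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Residual variability inside each class.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical scores for three query classes.
scores_by_class = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat = one_way_anova_f(scores_by_class)
```

A large F value suggests that mean performance differs between query classes by more than the within-class spread would explain; turning it into a p-value requires the F distribution with (k - 1, n - k) degrees of freedom.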
|
Albert Gordo, Florent Perronnin, Yunchao Gong, & Svetlana Lazebnik. (2014). Asymmetric Distances for Binary Embeddings. TPAMI - IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(1), 33–47.
Abstract: In large-scale query-by-example retrieval, embedding image signatures in a binary space offers two benefits: data compression and search efficiency. While most embedding algorithms binarize both query and database signatures, it has been noted that this is not strictly a requirement. Indeed, asymmetric schemes which binarize the database signatures but not the query still enjoy the same two benefits but may provide superior accuracy. In this work, we propose two general asymmetric distances which are applicable to a wide variety of embedding techniques including Locality Sensitive Hashing (LSH), Locality Sensitive Binary Codes (LSBC), Spectral Hashing (SH), PCA Embedding (PCAE), PCA Embedding with random rotations (PCAE-RR), and PCA Embedding with iterative quantization (PCAE-ITQ). We experiment on four public benchmarks containing up to 1M images and show that the proposed asymmetric distances consistently lead to large improvements over the symmetric Hamming distance for all binary embedding techniques.
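Note: the difference between a symmetric and an asymmetric comparison can be sketched as follows. This is a simplified illustration assuming sign-binarized linear projections; the penalty used in `asymmetric` is a hypothetical weighting chosen for clarity and is not claimed to be either of the two distances proposed in the paper.

```python
def project(vec, planes):
    """Real-valued projections of vec onto each hyperplane normal."""
    return [sum(w * x for w, x in zip(plane, vec)) for plane in planes]

def binarize(projections):
    """Sign binarization: 1 if the projection is non-negative, else 0."""
    return [1 if p >= 0 else 0 for p in projections]

def hamming(code_a, code_b):
    """Symmetric distance: both query and database entry are binary codes."""
    return sum(a != b for a, b in zip(code_a, code_b))

def asymmetric(query_proj, db_code):
    """Asymmetric distance: the query stays real-valued, the database is binary.

    A bit that disagrees with the sign of the query projection is penalised
    by the magnitude of that projection, so disagreements on projections
    close to zero cost little (illustrative assumption).
    """
    return sum(abs(p) for p, b in zip(query_proj, db_code)
               if (p >= 0) != (b == 1))

# Hypothetical 2-bit example: the query is projected but not binarized.
planes = [[1.0, 0.0], [0.0, 1.0]]
query_proj = project([0.1, -2.0], planes)      # [0.1, -2.0]
db_near, db_far = [0, 0], [1, 1]

ham_near = hamming(binarize(query_proj), db_near)
ham_far = hamming(binarize(query_proj), db_far)
asym_near = asymmetric(query_proj, db_near)
asym_far = asymmetric(query_proj, db_far)
```

In this toy case both database codes are at Hamming distance 1 from the binarized query, while the asymmetric score separates them because it keeps the information that the first projection was barely positive and the second strongly negative.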
|
Jiaolong Xu, Sebastian Ramos, David Vazquez, & Antonio Lopez. (2014). Domain Adaptation of Deformable Part-Based Models. TPAMI - IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(12), 2367–2380.
Abstract: The accuracy of object classifiers can significantly drop when the training data (source domain) and the application scenario (target domain) have inherent differences. Therefore, adapting the classifiers to the scenario in which they must operate is of paramount importance. We present novel domain adaptation (DA) methods for object detection. As proof of concept, we focus on adapting the state-of-the-art deformable part-based model (DPM) for pedestrian detection. We introduce an adaptive structural SVM (A-SSVM) that adapts a pre-learned classifier between different domains. By taking into account the inherent structure in feature space (e.g., the parts in a DPM), we propose a structure-aware A-SSVM (SA-SSVM). Neither A-SSVM nor SA-SSVM needs to revisit the source-domain training data to perform the adaptation. Rather, a low number of target-domain training examples (e.g., pedestrians) are used. To address the scenario where there are no target-domain annotated samples, we propose a self-adaptive DPM based on a self-paced learning (SPL) strategy and a Gaussian Process Regression (GPR). Two types of adaptation tasks are assessed: from both synthetic pedestrians and general persons (PASCAL VOC) to pedestrians imaged from an on-board camera. Results show that our proposals avoid accuracy drops as high as 15 points when comparing adapted and non-adapted detectors.
Keywords: Domain Adaptation; Pedestrian Detection
|
Santiago Segui, Michal Drozdzal, Ekaterina Zaytseva, Fernando Azpiroz, Petia Radeva, & Jordi Vitria. (2014). Detection of wrinkle frames in endoluminal videos using betweenness centrality measures for images. TITB - IEEE Transactions on Information Technology in Biomedicine, 18(6), 1831–1838.
Abstract: Intestinal contractions are among the most important events for diagnosing motility pathologies of the small intestine. When visualized by wireless capsule endoscopy (WCE), the sequence of frames that represents a contraction is characterized by a clear wrinkle structure in the central frames that corresponds to the folding of the intestinal wall. In this paper we present a new method to robustly detect wrinkle frames in full WCE videos by using a new mid-level image descriptor that is based on a centrality measure proposed for graphs. We present an extended validation, carried out on a very large database, which shows that the proposed method achieves state-of-the-art performance for this task.
Keywords: Wireless Capsule Endoscopy; Small Bowel Motility Dysfunction; Contraction Detection; Structured Prediction; Betweenness Centrality
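Note: betweenness centrality, the graph measure named in the title and keywords, can be sketched for an unweighted, undirected graph using Brandes' algorithm. How the paper builds a graph from an image and turns the centrality values into a descriptor is not stated in the abstract, so the adjacency dictionary below is a purely hypothetical toy graph.

```python
from collections import deque

def betweenness(adj):
    """Betweenness centrality of every node in an unweighted undirected graph.

    adj maps each node to a list of its neighbours (both directions listed).
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                                # nodes in BFS visiting order
        preds = {v: [] for v in adj}              # shortest-path predecessors
        sigma = {v: 0 for v in adj}               # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:                   # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:        # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):                 # accumulate dependencies backwards
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in bc.items()}

# Hypothetical star graph: every leaf-to-leaf shortest path crosses "c".
star = {"c": ["l1", "l2", "l3"], "l1": ["c"], "l2": ["c"], "l3": ["c"]}
centrality = betweenness(star)
```

In the star graph the centre lies on all three leaf-to-leaf shortest paths, so it receives the highest score while the leaves receive zero.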
|
Josep Llados, & Marçal Rusiñol. (2014). Graphics Recognition Techniques. In D. Doermann, & K. Tombre (Eds.), Handbook of Document Image Processing and Recognition (Vol. D, pp. 489–521). Springer London.
Abstract: This chapter describes the most relevant approaches for the analysis of graphical documents. The graphics recognition pipeline can be split into three tasks. The low level or lexical task extracts the basic units composing the document. The syntactic level is focused on the structure, i.e., how graphical entities are constructed, and involves the location and classification of the symbols present in the document. The third level is a functional or semantic level, i.e., it models what the graphical symbols do and what they mean in the context where they appear. This chapter covers the lexical level, while the next two chapters are devoted to the syntactic and semantic levels, respectively. The main problems reviewed in this chapter are raster-to-vector conversion (vectorization algorithms) and the separation of text and graphics components. The research and industrial communities have provided standard methods achieving reasonable performance levels. Hence, graphics recognition techniques can be considered to be in a mature state from a scientific point of view. Additionally, this chapter provides insights on some related problems, namely, the extraction and recognition of dimensions in engineering drawings, and the recognition of hatched and tiled patterns. Both problems are usually associated with, and even integrated in, the vectorization process.
Keywords: Dimension recognition; Graphics recognition; Graphic-rich documents; Polygonal approximation; Raster-to-vector conversion; Texture-based primitive extraction; Text-graphics separation
|