Partha Pratim Roy, Umapada Pal, Josep Llados, & Mathieu Nicolas Delalandre. (2012). Multi-oriented touching text character segmentation in graphical documents using dynamic programming. PR - Pattern Recognition, 45(5), 1972–1983.
Abstract: 2,292 JCR
The touching character segmentation problem becomes complex when touching strings are multi-oriented. Moreover in graphical documents sometimes characters in a single-touching string have different orientations. Segmentation of such complex touching is more challenging. In this paper, we present a scheme towards the segmentation of English multi-oriented touching strings into individual characters. When two or more characters touch, they generate a big cavity region in the background portion. Based on the convex hull information, at first, we use this background information to find some initial points for segmentation of a touching string into possible primitives (a primitive consists of a single character or part of a character). Next, the primitives are merged to get optimum segmentation. A dynamic programming algorithm is applied for this purpose using the total likelihood of characters as the objective function. A SVM classifier is used to find the likelihood of a character. To consider multi-oriented touching strings the features used in the SVM are invariant to character orientation. Experiments were performed in different databases of real and synthetic touching characters and the results show that the method is efficient in segmenting touching characters of arbitrary orientations and sizes.
|
Partha Pratim Roy, Umapada Pal, & Josep Llados. (2012). Text line extraction in graphical documents using background and foreground. IJDAR - International Journal on Document Analysis and Recognition, 15(3), 227–241.
Abstract: 0,405 JCR
In graphical documents (e.g., maps, engineering drawings), artistic documents etc., the text lines are annotated in multiple orientations or curvilinear way to illustrate different locations or symbols. For the optical character recognition of such documents, individual text lines from the documents need to be extracted. In this paper, we propose a novel method to segment such text lines and the method is based on the foreground and background information of the text components. To effectively utilize the background information, a water reservoir concept is used here. In the proposed scheme, at first, individual components are detected and grouped into character clusters in a hierarchical way using size and positional information. Next, the clusters are extended in two extreme sides to determine potential candidate regions. Finally, with the help of these candidate regions,
individual lines are extracted. The experimental results are presented on different datasets of graphical documents, camera-based warped documents, noisy images containing seals, etc. The results demonstrate that our approach is robust and invariant to size and orientation of the text lines present in
the document.
|
Thanh Ha Do, Salvatore Tabbone, & Oriol Ramos Terrades. (2012). Text/graphic separation using a sparse representation with multi-learned dictionaries. In 21st International Conference on Pattern Recognition.
Abstract: In this paper, we propose a new approach to extract text regions from graphical documents. In our method, we first empirically construct two sequences of learned dictionaries for the text and graphical parts respectively. Then, we compute the sparse representations of all different sizes and non-overlapped document patches in these learned dictionaries. Based on these representations, each patch can be classified into the text or graphic category by comparing its reconstruction errors. Same-sized patches in one category are then merged together to define the corresponding text or graphic layers which are combined to createfinal text/graphic layer. Finally, in a post-processing step, text regions are further filtered out by using some learned thresholds.
Keywords: Graphics Recognition; Layout Analysis; Document Understandin
|
Thanh Ha Do, Salvatore Tabbone, & Oriol Ramos Terrades. (2012). Noise suppression over bi-level graphical documents using a sparse representation. In Colloque International Francophone sur l'Écrit et le Document.
|
Adriana Romero, Simeon Petkov, Carlo Gatta, M.Sabate, & Petia Radeva. (2012). Efficient automatic segmentation of vessels. In 16th Conference on Medical Image Understanding and Analysis.
|
Pedro Martins, Carlo Gatta, & Paulo Carvalho. (2012). Feature-driven Maximally Stable Extremal Regions. In 7th International Conference on Computer Vision Theory and Applications (pp. 490–497).
|
Pedro Martins, Paulo Carvalho, & Carlo Gatta. (2012). Context Aware Keypoint Extraction for Robust Image Representation. In 23rd British Machine Vision Conference (100.pp. 1–100.12).
|
Antonio Hernandez, Carlo Gatta, Sergio Escalera, Laura Igual, Victoria Martin-Yuste, Manel Sabate, et al. (2012). Accurate coronary centerline extraction, caliber estimation and catheter detection in angiographies. TITB - IEEE Transactions on Information Technology in Biomedicine, 16(6), 1332–1340.
Abstract: Segmentation of coronary arteries in X-Ray angiography is a fundamental tool to evaluate arterial diseases and choose proper coronary treatment. The accurate segmentation of coronary arteries has become an important topic for the registration of different modalities which allows physicians rapid access to different medical imaging information from Computed Tomography (CT) scans or Magnetic Resonance Imaging (MRI). In this paper, we propose an accurate fully automatic algorithm based on Graph-cuts for vessel centerline extraction, caliber estimation, and catheter detection. Vesselness, geodesic paths, and a new multi-scale edgeness map are combined to customize the Graph-cuts approach to the segmentation of tubular structures, by means of a global optimization of the Graph-cuts energy function. Moreover, a novel supervised learning methodology that integrates local and contextual information is proposed for automatic catheter detection. We evaluate the method performance on three datasets coming from different imaging systems. The method performs as good as the expert observer w.r.t. centerline detection and caliber estimation. Moreover, the method discriminates between arteries and catheter with an accuracy of 96.5%, sensitivity of 72%, and precision of 97.4%.
|
Josep M. Gonfaus, Theo Gevers, Arjan Gijsenij, Xavier Roca, & Jordi Gonzalez. (2012). Edge Classification using Photo-Geo metric features. In 21st International Conference on Pattern Recognition (pp. 1497–1500).
Abstract: Edges are caused by several imaging cues such as shadow, material and illumination transitions. Classification methods have been proposed which are solely based on photometric information, ignoring geometry to classify the physical nature of edges in images. In this paper, the aim is to present a novel strategy to handle both photometric and geometric information for edge classification. Photometric information is obtained through the use of quasi-invariants while geometric information is derived from the orientation and contrast of edges. Different combination frameworks are compared with a new principled approach that captures both information into the same descriptor. From large scale experiments on different datasets, it is shown that, in addition to photometric information, the geometry of edges is an important visual cue to distinguish between different edge types. It is concluded that by combining both cues the performance improves by more than 7% for shadows and highlights.
|
Laura Igual, Joan Carles Soliva, Sergio Escalera, Roger Gimeno, Oscar Vilarroya, & Petia Radeva. (2012). Automatic Brain Caudate Nuclei Segmentation and Classification in Diagnostic of Attention-Deficit/Hyperactivity Disorder. CMIG - Computerized Medical Imaging and Graphics, 36(8), 591–600.
Abstract: We present a fully automatic diagnostic imaging test for Attention-Deficit/Hyperactivity Disorder diagnosis assistance based on previously found evidences of caudate nucleus volumetric abnormalities. The proposed method consists of different steps: a new automatic method for external and internal segmentation of caudate based on Machine Learning methodologies; the definition of a set of new volume relation features, 3D Dissociated Dipoles, used for caudate representation and classification. We separately validate the contributions using real data from a pediatric population and show precise internal caudate segmentation and discrimination power of the diagnostic test, showing significant performance improvements in comparison to other state-of-the-art methods.
Keywords: Automatic caudate segmentation; Attention-Deficit/Hyperactivity Disorder; Diagnostic test; Machine learning; Decision stumps; Dissociated dipoles
|
Francesco Ciompi. (2012). Multi-Class Learning for Vessel Characterization in Intravascular Ultrasound (Petia Radeva, & Oriol Pujol, Eds.). Ph.D. thesis, Ediciones Graficas Rey, .
Abstract: In this thesis we tackle the problem of automatic characterization of human coronary vessel in Intravascular Ultrasound (IVUS) image modality. The basis for the whole characterization process is machine learning applied to multi-class problems. In all the presented approaches, the Error-Correcting Output Codes (ECOC) framework is used as central element for the design of multi-class classifiers.
Two main topics are tackled in this thesis. First, the automatic detection of the vessel borders is presented. For this purpose, a novel context-aware classifier for multi-class classification of the vessel morphology is presented, namely ECOC-DRF. Based on ECOC-DRF, the lumen border and the media-adventitia border in IVUS are robustly detected by means of a novel holistic approach, achieving an error comparable with inter-observer variability and with state of the art methods.
The two vessel borders define the atheroma area of the vessel. In this area, tissue characterization is required. For this purpose, we present a framework for automatic plaque characterization by processing both texture in IVUS images and spectral information in raw Radio Frequency data. Furthermore, a novel method for fusing in-vivo and in-vitro IVUS data for plaque characterization is presented, namely pSFFS. The method demonstrates to effectively fuse data generating a classifier that improves the tissue characterization in both in-vitro and in-vivo datasets.
A novel method for automatic video summarization in IVUS sequences is also presented. The method aims to detect the key frames of the sequence, i.e., the frames representative of morphological changes. This novel method represents the basis for video summarization in IVUS as well as the markers for the partition of the vessel into morphological and clinically interesting events.
Finally, multi-class learning based on ECOC is applied to lung tissue characterization in Computed Tomography. The novel proposed approach, based on supervised and unsupervised learning, achieves accurate tissue classification on a large and heterogeneous dataset.
|
Antonio Hernandez, Miguel Reyes, Victor Ponce, & Sergio Escalera. (2012). GrabCut-Based Human Segmentation in Video Sequences. SENS - Sensors, 12(11), 15376–15393.
Abstract: In this paper, we present a fully-automatic Spatio-Temporal GrabCut human segmentation methodology that combines tracking and segmentation. GrabCut initialization is performed by a HOG-based subject detection, face detection, and skin color model. Spatial information is included by Mean Shift clustering whereas temporal coherence is considered by the historical of Gaussian Mixture Models. Moreover, full face and pose recovery is obtained by combining human segmentation with Active Appearance Models and Conditional Random Fields. Results over public datasets and in a new Human Limb dataset show a robust segmentation and recovery of both face and pose using the presented methodology.
Keywords: segmentation; human pose recovery; GrabCut; GraphCut; Active Appearance Models; Conditional Random Field
|
Arnau Ramisa, David Aldavert, Shrihari Vasudevan, Ricardo Toledo, & Ramon Lopez de Mantaras. (2012). Evaluation of Three Vision Based Object Perception Methods for a Mobile Robot. JIRC - Journal of Intelligent and Robotic Systems, 68(2), 185–208.
Abstract: This paper addresses visual object perception applied to mobile robotics. Being able to perceive household objects in unstructured environments is a key capability in order to make robots suitable to perform complex tasks in home environments. However, finding a solution for this task is daunting: it requires the ability to handle the variability in image formation in a moving camera with tight time constraints. The paper brings to attention some of the issues with applying three state of the art object recognition and detection methods in a mobile robotics scenario, and proposes methods to deal with windowing/segmentation. Thus, this work aims at evaluating the state-of-the-art in object perception in an attempt to develop a lightweight solution for mobile robotics use/research in typical indoor settings.
|
Petia Radeva, Michal Drozdzal, Santiago Segui, Laura Igual, Carolina Malagelada, Fernando Azpiroz, et al. (2012). Active labeling: Application to wireless endoscopy analysis. In High Performance Computing and Simulation, International Conference on (pp. 174–181).
Abstract: Today, robust learners trained in a real supervised machine learning application should count with a rich collection of positive and negative examples. Although in many applications, it is not difficult to obtain huge amount of data, labeling those data can be a very expensive process, especially when dealing with data of high variability and complexity. A good example of such cases are data from medical imaging applications where annotating anomalies like tumors, polyps, atherosclerotic plaque or informative frames in wireless endoscopy need highly trained experts. Building a representative set of training data from medical videos (e.g. Wireless Capsule Endoscopy) means that thousands of frames to be labeled by an expert. It is quite normal that data in new videos come different and thus are not represented by the training set. In this paper, we review the main approaches on active learning and illustrate how active learning can help to reduce expert effort in constructing the training sets. We show that applying active learning criteria, the number of human interventions can be significantly reduced. The proposed system allows the annotation of informative/non-informative frames of Wireless Capsule Endoscopy video containing more than 30000 frames each one with less than 100 expert ”clicks”.
|
Cristhian Aguilera, Fernando Barrera, Felipe Lumbreras, Angel Sappa, & Ricardo Toledo. (2012). Multispectral Image Feature Points. SENS - Sensors, 12(9), 12661–12672.
Abstract: Far-Infrared and Visible Spectrum images. It allows matching interest points on images of the same scene but acquired in different spectral bands. Initially, points of interest are detected on both images through a SIFT-like based scale space representation. Then, these points are characterized using an Edge Oriented Histogram (EOH) descriptor. Finally, points of interest from multispectral images are matched by finding nearest couples using the information from the descriptor. The provided experimental results and comparisons with similar methods show both the validity of the proposed approach as well as the improvements it offers with respect to the current state-of-the-art.
Keywords: multispectral image descriptor; color and infrared images; feature point descriptor
|