Records | |||||
Author | Simone Balocco; Carlo Gatta; Francesco Ciompi; A. Wahle; Petia Radeva; S. Carlier; G. Unal; E. Sanidas; J. Mauri; X. Carillo; T. Kovarnik; C. Wang; H. Chen; T. P. Exarchos; D. I. Fotiadis; F. Destrempes; G. Cloutier; Oriol Pujol; Marina Alberti; E. G. Mendizabal-Ruiz; M. Rivera; T. Aksoy; R. W. Downe; I. A. Kakadiaris | ||||
Title | Standardized evaluation methodology and reference database for evaluating IVUS image segmentation | Type | Journal Article | ||
Year | 2014 | Publication | Computerized Medical Imaging and Graphics | Abbreviated Journal | CMIG |
Volume | 38 | Issue | 2 | Pages | 70-90 |
Keywords | IVUS (intravascular ultrasound); Evaluation framework; Algorithm comparison; Image segmentation | ||||
Abstract | This paper describes an evaluation framework that allows a standardized and quantitative comparison of IVUS lumen and media segmentation algorithms. The framework was introduced at the MICCAI 2011 Computing and Visualization for (Intra)Vascular Imaging (CVII) workshop, comparing the results of the eight participating teams. We describe the available database, comprising multi-center, multi-vendor and multi-frequency IVUS datasets, their acquisition, the creation of the reference standard, and the evaluation measures. The approaches address segmentation of the lumen, the media, or both borders; semi- or fully-automatic operation; and 2-D vs. 3-D methodology. Three performance measures are proposed for quantitative analysis. The results of the evaluation indicate that semi-automatic methods segment the vessel lumen and media with an accuracy comparable to manual annotation, and that encouraging results can also be obtained with fully-automatic segmentation. The analysis performed in this paper also highlights the challenges in IVUS segmentation that remain to be solved. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | |||||
Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; LAMP; HuPBA; 600.046; 600.063; 600.079 | Approved | no | ||
Call Number | Admin @ si @ BGC2013 | Serial | 2314 | ||
Author | Santiago Segui; Michal Drozdzal; Fernando Vilariño; Carolina Malagelada; Fernando Azpiroz; Petia Radeva; Jordi Vitria | ||||
Title | Categorization and Segmentation of Intestinal Content Frames for Wireless Capsule Endoscopy | Type | Journal Article | ||
Year | 2012 | Publication | IEEE Transactions on Information Technology in Biomedicine | Abbreviated Journal | TITB |
Volume | 16 | Issue | 6 | Pages | 1341-1352 |
Keywords | |||||
Abstract | Wireless capsule endoscopy (WCE) is a device that allows the direct visualization of the gastrointestinal tract with minimal discomfort for the patient, but at the price of a large amount of screening time. To reduce this time, several works have proposed to automatically remove all frames showing intestinal content. These methods label frames as {intestinal content – clear} without discriminating between types of content (which have different physiological meanings) or the portion of the image covered. In addition, since the presence of intestinal content has been identified as an indicator of intestinal motility, its accurate quantification is of potential clinical relevance. In this paper, we present a method for the robust detection and segmentation of intestinal content in WCE images, together with its further discrimination between turbid liquid and bubbles. Our proposal is a twofold system. First, frames presenting intestinal content are detected by a support vector machine classifier using color and textural information. Second, intestinal content frames are segmented into {turbid, bubbles, and clear} regions. We show a detailed validation using a large dataset. Our system outperforms previous methods and, for the first time, discriminates turbid liquid from bubbles. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | |||||
Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1089-7771 | ISBN | Medium | ||
Area | 800 | Expedition | Conference | ||
Notes | MILAB; MV; OR; SIAI | Approved | no | ||
Call Number | Admin @ si @ SDV2012 | Serial | 2124 | ||
Author | Ferran Diego; Joan Serrat; Antonio Lopez | ||||
Title | Joint spatio-temporal alignment of sequences | Type | Journal Article | ||
Year | 2013 | Publication | IEEE Transactions on Multimedia | Abbreviated Journal | TMM |
Volume | 15 | Issue | 6 | Pages | 1377-1387 |
Keywords | video alignment | ||||
Abstract | Video alignment is important in different areas of computer vision such as wide baseline matching, action recognition, change detection, video copy detection and frame dropping prevention. Current video alignment methods usually deal with the relatively simple case of fixed or rigidly attached cameras or simultaneous acquisition. Therefore, in this paper we propose a joint video alignment that brings two video sequences into spatio-temporal alignment. Specifically, the novelty of the paper is to fold the spatial and temporal alignment into a single alignment framework, which simultaneously satisfies frame-correspondence and frame-alignment similarity, exploiting the knowledge among neighboring frames through a standard pairwise Markov random field (MRF). This new formulation is able to handle the alignment of sequences recorded at different times by independently moving cameras that follow a similar trajectory, and also generalizes the particular cases of a fixed geometric transformation and/or a linear temporal mapping. We conduct experiments on different scenarios, such as sequences recorded simultaneously or by moving cameras, to validate the robustness of the proposed approach. The proposed method provides the highest video alignment accuracy compared to state-of-the-art methods on sequences recorded from vehicles driving along the same track at different times. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | |||||
Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1520-9210 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS | Approved | no | ||
Call Number | Admin @ si @ DSL2013; ADAS @ adas @ | Serial | 2228 | ||
Author | German Ros; J. Guerrero; Angel Sappa; Antonio Lopez | ||||
Title | VSLAM pose initialization via Lie groups and Lie algebras optimization | Type | Conference Article | ||
Year | 2013 | Publication | Proceedings of IEEE International Conference on Robotics and Automation | Abbreviated Journal | |
Volume | Issue | Pages | 5740 - 5747 | ||
Keywords | SLAM | ||||
Abstract | We present a novel technique for estimating initial 3D poses in the context of localization and Visual SLAM problems. The presented approach can deal with noise, outliers and a large amount of input data and still runs in real time on a standard CPU. Our method produces solutions with an accuracy comparable to those produced by RANSAC but can be much faster when the percentage of outliers is high or the amount of input data is large. In the current work we propose to formulate pose estimation as an optimization problem on Lie groups, considering their manifold structure as well as their associated Lie algebras. This allows us to perform a fast and simple optimization while preserving all the constraints imposed by the Lie group SE(3). Additionally, we present several key design concepts related to the cost function and its Jacobian, aspects that are critical for the good performance of the algorithm. | ||||
Address | Karlsruhe; Germany; May 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | |||||
Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1050-4729 | ISBN | 978-1-4673-5641-1 | Medium | |
Area | Expedition | Conference | ICRA | ||
Notes | ADAS; 600.054; 600.055; 600.057 | Approved | no | ||
Call Number | Admin @ si @ RGS2013a; ADAS @ adas @ | Serial | 2225 | ||
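The abstract above formulates pose estimation as optimization on the Lie group SE(3) via its Lie algebra. As an illustrative sketch only (not the authors' implementation), the exponential map that takes a twist in se(3) to a rigid transformation in SE(3) can be written with Rodrigues' formula:

```python
import numpy as np

def hat(w):
    """Skew-symmetric (hat) matrix of a 3-vector."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Rodrigues' formula: exponential map from so(3) to a rotation matrix."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    K = hat(np.asarray(w) / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def exp_se3(xi):
    """Exponential map from a twist xi = (w, v) in se(3) to a 4x4 pose in SE(3)."""
    xi = np.asarray(xi, dtype=float)
    w, v = xi[:3], xi[3:]
    theta = np.linalg.norm(w)
    R = exp_so3(w)
    if theta < 1e-12:
        V = np.eye(3)
    else:
        K = hat(w / theta)
        # Left Jacobian of SO(3), coupling the rotation and translation parts
        V = (np.eye(3) + ((1.0 - np.cos(theta)) / theta) * K
             + ((theta - np.sin(theta)) / theta) * (K @ K))
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = V @ v
    return T
```

An iterative optimizer in this style repeatedly composes a small update `exp_se3(delta)` with the current pose estimate, which keeps every iterate exactly on SE(3), matching the constraint-preservation property claimed in the abstract.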
Author | David Aldavert; Marçal Rusiñol; Ricardo Toledo; Josep Llados | ||||
Title | Integrating Visual and Textual Cues for Query-by-String Word Spotting | Type | Conference Article | ||
Year | 2013 | Publication | 12th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 511 - 515 | ||
Keywords | |||||
Abstract | In this paper, we present a word spotting framework that follows the query-by-string paradigm, where word images are represented by both textual and visual representations. The textual representation is formulated in terms of character n-grams, while the visual one is based on the bag-of-visual-words scheme. These two representations are merged together and projected into a sub-vector space. This transform allows us, given a textual query, to retrieve word instances that were only represented by the visual modality. Moreover, this statistical representation can be used together with state-of-the-art indexing structures to deal with large-scale scenarios. The proposed method is evaluated on a collection of historical documents, outperforming state-of-the-art performance. | ||||
Address | Washington; USA; August 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | |||||
Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1520-5363 | ISBN | Medium | ||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; ADAS; 600.045; 600.055; 600.061 | Approved | no | ||
Call Number | Admin @ si @ ART2013 | Serial | 2224 | ||
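The character n-gram textual representation described in the abstract above can be illustrated with a small sketch (hypothetical helper names, not the paper's code): pad each word with boundary markers, count overlapping n-grams, and compare a textual query to an indexed word by cosine similarity.

```python
from collections import Counter
from math import sqrt

def char_ngrams(word, n=2):
    """Counts of overlapping character n-grams; '#' marks word boundaries."""
    padded = f"#{word}#"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(c * b[g] for g, c in a.items())  # Counter returns 0 for missing keys
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

A query such as "cart" would then score higher against word images whose merged textual/visual representation shares bigrams like "ca" and "ar" than against unrelated words.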
Author | Antonio Hernandez; Miguel Angel Bautista; Xavier Perez Sala; Victor Ponce; Xavier Baro; Oriol Pujol; Cecilio Angulo; Sergio Escalera | ||||
Title | BoVDW: Bag-of-Visual-and-Depth-Words for Gesture Recognition | Type | Conference Article | ||
Year | 2012 | Publication | 21st International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | We present a Bag-of-Visual-and-Depth-Words (BoVDW) model for gesture recognition, an extension of the Bag-of-Visual-Words (BoVW) model that benefits from the multimodal fusion of visual and depth features. State-of-the-art RGB and depth features, including a newly proposed depth descriptor, are analysed and combined in a late-fusion fashion. The method is integrated in a continuous gesture recognition pipeline, where the Dynamic Time Warping (DTW) algorithm is used to perform prior segmentation of gestures. Results on public datasets, within our gesture recognition pipeline, show better performance than a standard BoVW model. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | |||||
Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1051-4651 | ISBN | 978-1-4673-2216-4 | Medium | |
Area | Expedition | Conference | ICPR | ||
Notes | HuPBA;MV | Approved | no | ||
Call Number | Admin @ si @ HBP2012 | Serial | 2122 | ||
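The DTW step used for gesture segmentation in the pipeline above follows the classic dynamic time warping recurrence; a minimal sketch for 1-D sequences (illustrative only, not the BoVDW implementation):

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences.

    D[i, j] is the cheapest cost of aligning a[:i] with b[:j]; each cell
    extends the best of a match, an insertion, or a deletion.
    """
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])
```

In a continuous recognition setting, a gesture model is matched against a sliding window of the input stream, and low-distance alignments mark candidate gesture boundaries.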
Author | Anjan Dutta; Jaume Gibert; Josep Llados; Horst Bunke; Umapada Pal | ||||
Title | Combination of Product Graph and Random Walk Kernel for Symbol Spotting in Graphical Documents | Type | Conference Article | ||
Year | 2012 | Publication | 21st International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 1663-1666 | ||
Keywords | |||||
Abstract | This paper explores the use of the product graph for spotting symbols in graphical documents. The product graph is intended to find the candidate subgraphs or components in the input graph containing paths similar to the query graph. The acute angle between two edges and their length ratio are considered as the node labels. In a second step, each candidate subgraph in the input graph is assigned a distance measure computed by a random walk kernel, namely the minimum of the distances of the component to all components of the model graph. This distance measure is then used to eliminate dissimilar components. The remaining neighboring components are grouped, and the grouped zone is considered a retrieval zone of a symbol similar to the queried one. The entire method works online, i.e., it does not need any preprocessing step. The present paper reports the initial results of the method, which are very encouraging. | ||||
Address | Tsukuba, Japan | ||||
Corporate Author | Thesis | ||||
Publisher | |||||
Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1051-4651 | ISBN | 978-1-4673-2216-4 | Medium | |
Area | Expedition | Conference | ICPR | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ DGL2012 | Serial | 2125 | ||
Author | Josep Llados; Marçal Rusiñol; Alicia Fornes; David Fernandez; Anjan Dutta | ||||
Title | On the Influence of Word Representations for Handwritten Word Spotting in Historical Documents | Type | Journal Article | ||
Year | 2012 | Publication | International Journal of Pattern Recognition and Artificial Intelligence | Abbreviated Journal | IJPRAI |
Volume | 26 | Issue | 5 | Pages | 1263002-126027 |
Keywords | Handwriting recognition; word spotting; historical documents; feature representation; shape descriptors | ||||
Abstract | Impact factor (JCR): 0.624. Word spotting is the process of retrieving all instances of a queried keyword from a digital library of document images. In this paper we evaluate the performance of different word descriptors to assess the advantages and disadvantages of statistical and structural models in a framework of query-by-example word spotting in historical documents. We compare four word representation models: sequence alignment using DTW as a baseline reference, a bag-of-visual-words approach as a statistical model, a pseudo-structural model based on a Loci feature representation, and a structural approach where words are represented by graphs. The four approaches have been tested on two collections of historical data: the George Washington database and the marriage records from the Barcelona Cathedral. We experimentally demonstrate that statistical representations generally give better performance; however, it cannot be neglected that large descriptors are difficult to implement in a retrieval scenario where word spotting requires the indexing of data with millions of word images. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | |||||
Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ LRF2012 | Serial | 2128 | ||
Author | Alicia Fornes; Anjan Dutta; Albert Gordo; Josep Llados | ||||
Title | CVC-MUSCIMA: A Ground-Truth of Handwritten Music Score Images for Writer Identification and Staff Removal | Type | Journal Article | ||
Year | 2012 | Publication | International Journal on Document Analysis and Recognition | Abbreviated Journal | IJDAR |
Volume | 15 | Issue | 3 | Pages | 243-251 |
Keywords | Music scores; Handwritten documents; Writer identification; Staff removal; Performance evaluation; Graphics recognition; Ground truths | ||||
Abstract | Impact factor (JCR): 0.405. The analysis of music scores has been an active research field in recent decades. However, there are no publicly available databases of handwritten music scores for the research community. In this paper we present the CVC-MUSCIMA database and ground truth of handwritten music score images. The dataset consists of 1,000 music sheets written by 50 different musicians. It has been especially designed for writer identification and staff removal tasks. In addition to the description of the dataset, ground truth, partitioning and evaluation metrics, we also provide some baseline results to ease the comparison between different approaches. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | |||||
Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1433-2833 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ FDG2012 | Serial | 2129 | ||
Author | Susana Alvarez; Maria Vanrell | ||||
Title | Texton theory revisited: a bag-of-words approach to combine textons | Type | Journal Article | ||
Year | 2012 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 45 | Issue | 12 | Pages | 4312-4325 |
Keywords | |||||
Abstract | The aim of this paper is to revisit an old theory of texture perception and update its computational implementation by extending it to colour. With this in mind we try to capture the optimality of perceptual systems. The proposed approach achieves this by sharing well-known early stages of the visual processes and extracting low-dimensional features that perfectly encode adequate properties for a large variety of textures without needing further learning stages. We propose several descriptors in a bag-of-words framework that are derived from different quantisation models on the feature spaces. Our perceptual features are directly given by the shape and colour attributes of image blobs, which are the textons. In this way we avoid learning visual words and directly build the vocabularies on these low-dimensional texton spaces. The main differences between the proposed descriptors lie in how the co-occurrence of blob attributes is represented in the vocabularies. Our approach outperforms the current state of the art in colour texture description, which is proved in several experiments on large texture datasets. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | |||||
Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0031-3203 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | CIC | Approved | no | ||
Call Number | Admin @ si @ AlV2012a | Serial | 2130 | ||
Author | Javier Vazquez; Robert Benavente; Maria Vanrell | ||||
Title | Naming constraints constancy | Type | Conference Article | ||
Year | 2012 | Publication | 2nd Joint AVA / BMVA Meeting on Biological and Machine Vision | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Different studies have shown that languages from industrialized cultures share a set of 11 basic colour terms: red, green, blue, yellow, pink, purple, brown, orange, black, white, and grey (Berlin & Kay, 1969, Basic Color Terms, University of California Press) (Kay & Regier, 2003, PNAS, 100, 9085-9089). Some of these studies have also reported the best representatives or focal values of each colour (Boynton and Olson, 1990, Vision Res., 30, 1311-1317), (Sturges and Whitfield, 1995, CRA, 20:6, 364-376). Further studies have provided fuzzy datasets for colour naming by asking human observers to rate colours in terms of membership values (Benavente et al., 2006, CRA, 31:1, 48-56). Recently, a computational model based on these human ratings has been developed (Benavente et al., 2008, JOSA-A, 25:10, 2582-2593). This computational model follows a fuzzy approach to assign a colour name to a particular RGB value. For example, a pixel with a value of (255,0,0) will be named 'red' with membership 1, while a cyan pixel with an RGB value of (0, 200, 200) will be considered 0.5 green and 0.5 blue. In this work, we show how this colour naming paradigm can be applied to different computer vision tasks. In particular, we report results in colour constancy (Vazquez-Corral et al., 2012, IEEE TIP, in press) showing that the classical constraints on either illumination or surface reflectance can be substituted by the statistical properties encoded in the colour names. [Supported by projects TIN2010-21771-C02-1, CSD2007-00018.] | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | |||||
Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | AVA | |
Notes | CIC | Approved | no | ||
Call Number | Admin @ si @ VBV2012 | Serial | 2131 | ||
Author | Xavier Otazu; Olivier Penacchio; Laura Dempere-Marco | ||||
Title | An investigation into plausible neural mechanisms related to the CIWaM computational model for brightness induction | Type | Conference Article | |
Year | 2012 | Publication | 2nd Joint AVA / BMVA Meeting on Biological and Machine Vision | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Brightness induction is the modulation of the perceived intensity of an area by the luminance of surrounding areas. From a purely computational perspective, we built a low-level computational model (CIWaM) of early sensory processing based on multi-resolution wavelets, with the aim of replicating brightness and colour induction effects (Otazu et al., 2010, Journal of Vision, 10(12):5). Furthermore, we successfully used the CIWaM architecture to define a computational saliency model (Murray et al., 2011, CVPR, 433-440; Vanrell et al., submitted to AVA/BMVA'12). From a biological perspective, neurophysiological evidence suggests that perceived brightness information may be explicitly represented in V1. In this work we investigate possible neural mechanisms that offer a plausible explanation for such effects. To this end, we consider the model by Z. Li (Li, 1999, Network: Comput. Neural Syst., 10, 187-212), which is based on biological data and focuses on the part of V1 responsible for contextual influences, namely layer 2-3 pyramidal cells, interneurons, and horizontal intracortical connections. This model has been proven to account for phenomena such as visual saliency, which shares with brightness induction the relevant effect of contextual influences (the ones modelled by CIWaM). In the proposed model, the input to the network is derived from a complete multiscale and multiorientation wavelet decomposition taken from the computational model (CIWaM). This model successfully accounts for well-known psychophysical effects (among them the White's and modified White's effects, the Todorovic, Chevreul, achromatic ring patterns, and grating induction effects) for static contexts, and also for brightness induction in dynamic contexts defined by modulating the luminance of surrounding areas. From a methodological point of view, we conclude that the results obtained by the computational model (CIWaM) are compatible with those obtained by the neurodynamical model proposed here. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | |||||
Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | AVA | |
Notes | CIC | Approved | no | ||
Call Number | Admin @ si @ OPD2012a | Serial | 2132 | ||
Author | Partha Pratim Roy; Umapada Pal; Josep Llados; Mathieu Nicolas Delalandre | ||||
Title | Multi-oriented touching text character segmentation in graphical documents using dynamic programming | Type | Journal Article | ||
Year | 2012 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 45 | Issue | 5 | Pages | 1972-1983 |
Keywords | |||||
Abstract | Impact factor (JCR): 2.292. The touching character segmentation problem becomes complex when touching strings are multi-oriented. Moreover, in graphical documents, characters in a single touching string sometimes have different orientations. Segmentation of such complex touching is more challenging. In this paper, we present a scheme for the segmentation of English multi-oriented touching strings into individual characters. When two or more characters touch, they generate a big cavity region in the background portion. Based on convex hull information, we first use this background information to find some initial points for segmenting a touching string into possible primitives (a primitive consists of a single character or part of a character). Next, the primitives are merged to obtain the optimum segmentation. A dynamic programming algorithm is applied for this purpose, using the total likelihood of characters as the objective function. An SVM classifier is used to find the likelihood of a character. To handle multi-oriented touching strings, the features used in the SVM are invariant to character orientation. Experiments were performed on different databases of real and synthetic touching characters, and the results show that the method is efficient in segmenting touching characters of arbitrary orientations and sizes. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | |||||
Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0031-3203 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ RPL2012a | Serial | 2133 | ||
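The dynamic programming step described in the abstract above — merging primitives so that the total character likelihood is maximal — can be sketched as follows. Here `scores` is a hypothetical stand-in for the SVM character likelihoods used in the paper:

```python
def best_segmentation(scores, n):
    """Segment n primitives into characters maximizing total likelihood.

    scores[(i, j)] is the likelihood that primitives i..j-1 form one
    character; absent spans are treated as impossible.
    """
    NEG = float("-inf")
    best = [NEG] * (n + 1)   # best[j]: max likelihood of segmenting the first j primitives
    back = [0] * (n + 1)     # back[j]: start index of the last character in that optimum
    best[0] = 0.0
    for j in range(1, n + 1):
        for i in range(j):
            s = scores.get((i, j), NEG)
            if best[i] != NEG and s != NEG and best[i] + s > best[j]:
                best[j] = best[i] + s
                back[j] = i
    # Recover the chosen character spans by backtracking
    spans, j = [], n
    while j > 0:
        spans.append((back[j], j))
        j = back[j]
    return best[n], spans[::-1]
```

For instance, with three primitives where primitives 1-2 likely form a single character, the optimum groups them: `best_segmentation({(0, 1): 0.9, (1, 3): 0.8, (1, 2): 0.2, (2, 3): 0.3, (0, 2): 0.1, (0, 3): 0.3}, 3)` returns a total of 1.7 with spans `[(0, 1), (1, 3)]`.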
Author | Partha Pratim Roy; Umapada Pal; Josep Llados | ||||
Title | Text line extraction in graphical documents using background and foreground | Type | Journal Article | ||
Year | 2012 | Publication | International Journal on Document Analysis and Recognition | Abbreviated Journal | IJDAR |
Volume | 15 | Issue | 3 | Pages | 227-241 |
Keywords | |||||
Abstract | Impact factor (JCR): 0.405. In graphical documents (e.g., maps, engineering drawings), artistic documents, etc., text lines are annotated in multiple orientations or in a curvilinear way to illustrate different locations or symbols. For the optical character recognition of such documents, individual text lines need to be extracted from the documents. In this paper, we propose a novel method to segment such text lines, based on the foreground and background information of the text components. To effectively utilize the background information, a water reservoir concept is used. In the proposed scheme, individual components are first detected and grouped into character clusters in a hierarchical way using size and positional information. Next, the clusters are extended on two extreme sides to determine potential candidate regions. Finally, with the help of these candidate regions, individual lines are extracted. Experimental results are presented on different datasets of graphical documents, camera-based warped documents, noisy images containing seals, etc. The results demonstrate that our approach is robust and invariant to the size and orientation of the text lines present in the document. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | |||||
Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1433-2833 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ RPL2012b | Serial | 2134 | ||
Author | Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades | ||||
Title | Text/graphic separation using a sparse representation with multi-learned dictionaries | Type | Conference Article | ||
Year | 2012 | Publication | 21st International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | Graphics Recognition; Layout Analysis; Document Understanding | ||||
Abstract | In this paper, we propose a new approach to extract text regions from graphical documents. In our method, we first empirically construct two sequences of learned dictionaries for the text and graphical parts, respectively. Then, we compute the sparse representations of all different-sized, non-overlapping document patches in these learned dictionaries. Based on these representations, each patch can be classified into the text or graphic category by comparing its reconstruction errors. Same-sized patches in one category are then merged together to define the corresponding text or graphic layers, which are combined to create the final text/graphic layer. Finally, in a post-processing step, text regions are further filtered using some learned thresholds. | ||||
Address | Tsukuba | ||||
Corporate Author | Thesis | ||||
Publisher | |||||
Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICPR | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ DTR2012a | Serial | 2135 | ||