toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author Partha Pratim Roy; Umapada Pal; Josep Llados edit  doi
isbn  openurl
  Title Touching Text Character Localization in Graphical Documents using SIFT Type Book Chapter
  Year 2010 Publication Graphics Recognition. Achievements, Challenges, and Evolution. 8th International Workshop, GREC 2009. Selected Papers Abbreviated Journal  
  Volume 6020 Issue Pages 199-211  
  Keywords Support Vector Machine; Text Component; Graphical Line; Document Image; Scale Invariant Feature Transform  
  Abstract Interpretation of graphical document images is a challenging task as it requires proper understanding of text/graphics symbols present in such documents. Difficulties arise in graphical document recognition when text and symbol overlapped/touched. Intersection of text and symbols with graphical lines and curves occur frequently in graphical documents and hence separation of such symbols is very difficult.
Several pattern recognition and classification techniques exist to recognize isolated text/symbol. But, the touching/overlapping text and symbol recognition has not yet been dealt successfully. An interesting technique, Scale Invariant Feature Transform (SIFT), originally devised for object recognition can take care of overlapping problems. Even if SIFT features have emerged as a very powerful object descriptors, their employment in graphical documents context has not been investigated much. In this paper we present the adaptation of the SIFT approach in the context of text character localization (spotting) in graphical documents. We evaluate the applicability of this technique in such documents and discuss the scope of improvement by combining some state-of-the-art approaches.
 
  Address  
  Corporate Author Thesis  
  Publisher Springer Berlin Heidelberg Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN 0302-9743 ISBN 978-3-642-13727-3 Medium  
  Area Expedition Conference  
  Notes (up) DAG Approved no  
  Call Number Admin @ si @ RPL2010c Serial 2408  
Permanent link to this record
 

 
Author Nuria Cirera edit  openurl
  Title Recognition of Handwritten Historical Documents Type Report
  Year 2012 Publication CVC Technical Report Abbreviated Journal  
  Volume 174 Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis Master's thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes (up) DAG Approved no  
  Call Number Admin @ si @ Cir2012 Serial 2416  
Permanent link to this record
 

 
Author Miquel Ferrer; I. Bardaji; Ernest Valveny; Dimosthenis Karatzas; Horst Bunke edit  doi
isbn  openurl
  Title Median Graph Computation by Means of Graph Embedding into Vector Spaces Type Book Chapter
  Year 2013 Publication Graph Embedding for Pattern Analysis Abbreviated Journal  
  Volume Issue Pages 45-72  
  Keywords  
  Abstract In pattern recognition [8, 14], a key issue to be addressed when designing a system is how to represent input patterns. Feature vectors is a common option. That is, a set of numerical features describing relevant properties of the pattern are computed and arranged in a vector form. The main advantages of this kind of representation are computational simplicity and a well sound mathematical foundation. Thus, a large number of operations are available to work with vectors and a large repository of algorithms for pattern analysis and classification exist. However, the simple structure of feature vectors might not be the best option for complex patterns where nonnumerical features or relations between different parts of the pattern become relevant.  
  Address  
  Corporate Author Thesis  
  Publisher Springer New York Place of Publication Editor Yun Fu; Yungian Ma  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-1-4614-4456-5 Medium  
  Area Expedition Conference  
  Notes (up) DAG Approved no  
  Call Number Admin @ si @ FBV2013 Serial 2421  
Permanent link to this record
 

 
Author Lluis Pere de las Heras; Ernest Valveny; Gemma Sanchez edit  openurl
  Title Unsupervised and Notation-Independent Wall Segmentation in Floor Plans Using a Combination of Statistical and Structural Strategies Type Conference Article
  Year 2013 Publication 10th IAPR International Workshop on Graphics Recognition Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address Bethlehem; PA; USA; August 2013  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference GREC  
  Notes (up) DAG Approved no  
  Call Number Admin @ si @ HVS2013b Serial 2696  
Permanent link to this record
 

 
Author Francisco Cruz edit  isbn
openurl 
  Title Probabilistic Graphical Models for Document Analysis Type Book Whole
  Year 2016 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Latest advances in digitization techniques have fostered the interest in creating digital copies of collections of documents. Digitized documents permit an easy maintenance, loss-less storage, and efficient ways for transmission and to perform information retrieval processes. This situation has opened a new market niche to develop systems able to automatically extract and analyze information contained in these collections, specially in the ambit of the business activity.

Due to the great variety of types of documents this is not a trivial task. For instance, the automatic extraction of numerical data from invoices differs substantially from a task of text recognition in historical documents. However, in order to extract the information of interest, is always necessary to identify the area of the document where it is located. In the area of Document Analysis we refer to this process as layout analysis, which aims at identifying and categorizing the different entities that compose the document, such as text regions, pictures, text lines, or tables, among others. To perform this task it is usually necessary to incorporate a prior knowledge about the task into the analysis process, which can be modeled by defining a set of contextual relations between the different entities of the document. The use of context has proven to be useful to reinforce the recognition process and improve the results on many computer vision tasks. It presents two fundamental questions: What kind of contextual information is appropriate for a given task, and how to incorporate this information into the models.

In this thesis we study several ways to incorporate contextual information to the task of document layout analysis, and to the particular case of handwritten text line segmentation. We focus on the study of Probabilistic Graphical Models and other mechanisms for this purpose, and propose several solutions to these problems. First, we present a method for layout analysis based on Conditional Random Fields. With this model we encode local contextual relations between variables, such as pair-wise constraints. Besides, we encode a set of structural relations between different classes of regions at feature level. Second, we present a method based on 2D-Probabilistic Context-free Grammars to encode structural and hierarchical relations. We perform a comparative study between Probabilistic Graphical Models and this syntactic approach. Third, we propose a method for structured documents based on Bayesian Networks to represent the document structure, and an algorithm based in the Expectation-Maximization to find the best configuration of the page. We perform a thorough evaluation of the proposed methods on two particular collections of documents: a historical collection composed of ancient structured documents, and a collection of contemporary documents. In addition, we present a general method for the task of handwritten text line segmentation. We define a probabilistic framework where we combine the EM algorithm with variational approaches for computing inference and parameter learning on a Markov Random Field. We evaluate our method on several collections of documents, including a general dataset of annotated administrative documents. Results demonstrate the applicability of our method to real problems, and the contribution of the use of contextual information to this kind of problems.
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Oriol Ramos Terrades  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-945373-2-5 Medium  
  Area Expedition Conference  
  Notes (up) DAG Approved no  
  Call Number Admin @ si @ Cru2016 Serial 2861  
Permanent link to this record
 

 
Author Pau Riba; Alicia Fornes; Josep Llados edit  isbn
openurl 
  Title Towards the Alignment of Handwritten Music Scores Type Conference Article
  Year 2015 Publication 11th IAPR International Workshop on Graphics Recognition Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract It is very common to find different versions of the same music work in archives of Opera Theaters. These differences correspond to modifications and annotations from the musicians. From the musicologist point of view, these variations are very interesting and deserve study. This paper explores the alignment of music scores as a tool for automatically detecting the passages that contain such differences. Given the difficulties in the recognition of handwritten music scores, our goal is to align the music scores and at the same time, avoid the recognition of music elements as much as possible. After removing the staff lines, braces and ties, the bar lines are detected. Then, the bar units are described as a whole using the Blurred Shape Model. The bar units alignment is performed by using Dynamic Time Warping. The analysis of the alignment path is used to detect the variations in the music scores. The method has been evaluated on a subset of the CVC-MUSCIMA dataset, showing encouraging results.  
  Address Nancy; France; August 2015  
  Corporate Author Thesis  
  Publisher Springer International Publishing Place of Publication Editor Bart Lamiroy; Rafael Dueire Lins  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-319-52158-9 Medium  
  Area Expedition Conference GREC  
  Notes (up) DAG Approved no  
  Call Number Admin @ si @ Serial 2874  
Permanent link to this record
 

 
Author Lluis Gomez edit  openurl
  Title Exploiting Similarity Hierarchies for Multi-script Scene Text Understanding Type Book Whole
  Year 2016 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract This thesis addresses the problem of automatic scene text understanding in unconstrained conditions. In particular, we tackle the tasks of multi-language and arbitrary-oriented text detection, tracking, and script identification in natural scenes.
For this we have developed a set of generic methods that build on top of the basic observation that text has always certain key visual and structural characteristics that are independent of the language or script in which it is written. Text instances in any
language or script are always formed as groups of similar atomic parts, being them either individual characters, small stroke parts, or even whole words in the case of cursive text. This holistic (sumof-parts) and recursive perspective has lead us to explore different variants of the “segmentation and grouping” paradigm of computer vision.
Scene text detection methodologies are usually based in classification of individual regions or patches, using a priory knowledge for a given script or language. Human perception of text, on the other hand, is based on perceptual organization through which
text emerges as a perceptually significant group of atomic objects.
In this thesis, we argue that the text detection problem must be posed as the detection of meaningful groups of regions. We address the problem of text detection in natural scenes from a hierarchical perspective, making explicit use of the recursive nature of text, aiming directly to the detection of region groupings corresponding to text within a hierarchy produced by an agglomerative similarity clustering process over individual regions. We propose an optimal way to construct such an hierarchy introducing a feature space designed to produce text group hypothese with high recall and a novel stopping rule combining a discriminative classifier and a probabilistic measure of group meaningfulness based in perceptual organization. Within this generic framework, we design a text-specific object proposals algorithm that, contrary to existing generic object proposals methods, aims directly to the detection of text regions groupings. For this, we abandon the rigid definition of “what is text” of traditional specialized text detectors, and move towards more fuzzy perspective of grouping-based object proposals methods.
Then, we present a hybrid algorithm for detection and tracking of scene text where the notion of region groupings plays also a central role. By leveraging the structural arrangement of text group components between consecutive frames we can improve
the overall tracking performance of the system.
Finally, since our generic detection framework is inherently designed for multi-language environments, we focus on the problem of script identification in order to build a multi-language end-toend reading system. Facing this problem with state of the art CNN classifiers is not straightforward, as they fail to address a key
characteristic of scene text instances: their extremely variable aspect ratio. Instead of resizing input images to a fixed size as in the typical use of holistic CNN classifiers, we propose a patch-based classification framework in order to preserve discriminative parts of the image that are characteristic of its class. We describe a novel method based on the use of ensembles of conjoined networks to jointly learn discriminative stroke-parts representations and their relative importance in a patch-based classification scheme.
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Place of Publication Editor Dimosthenis Karatzas  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes (up) DAG Approved no  
  Call Number Admin @ si @ Gom2016 Serial 2891  
Permanent link to this record
 

 
Author Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades edit  openurl
  Title Spotting Symbol over Graphical Documents Via Sparsity in Visual Vocabulary Type Book Chapter
  Year 2016 Publication Recent Trends in Image Processing and Pattern Recognition Abbreviated Journal  
  Volume 709 Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference RTIP2R  
  Notes (up) DAG Approved no  
  Call Number Admin @ si @ HTR2016 Serial 2956  
Permanent link to this record
 

 
Author Antonio Lopez; Atsushi Imiya; Tomas Pajdla; Jose Manuel Alvarez edit  isbn
openurl 
  Title Computer Vision in Vehicle Technology: Land, Sea & Air Type Book Whole
  Year Publication Computer Vision in Vehicle Technology: Land, Sea & Air Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract A unified view of the use of computer vision technology for different types of vehicles

Computer Vision in Vehicle Technology focuses on computer vision as on-board technology, bringing together fields of research where computer vision is progressively penetrating: the automotive sector, unmanned aerial and underwater vehicles. It also serves as a reference for researchers of current developments and challenges in areas of the application of computer vision, involving vehicles such as advanced driver assistance (pedestrian detection, lane departure warning, traffic sign recognition), autonomous driving and robot navigation (with visual simultaneous localization and mapping) or unmanned aerial vehicles (obstacle avoidance, landscape classification and mapping, fire risk assessment).

The overall role of computer vision for the navigation of different vehicles, as well as technology to address on-board applications, is analysed.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-1-118-86807-2 Medium  
  Area Expedition Conference  
  Notes (up) DAG Approved no  
  Call Number Admin @ si @ LIP2017b Serial 3049  
Permanent link to this record
 

 
Author Josep Llados; Daniel Lopresti; Seiichi Uchida (eds) edit  doi
isbn  openurl
  Title 16th International Conference, 2021, Proceedings, Part III Type Book Whole
  Year 2021 Publication Document Analysis and Recognition – ICDAR 2021 Abbreviated Journal  
  Volume 12823 Issue Pages  
  Keywords  
  Abstract This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.

The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding.
 
  Address Lausanne, Switzerland, September 5-10, 2021  
  Corporate Author Thesis  
  Publisher Springer Cham Place of Publication Editor Josep Llados; Daniel Lopresti; Seiichi Uchida  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-030-86333-3 Medium  
  Area Expedition Conference ICDAR  
  Notes (up) DAG Approved no  
  Call Number Admin @ si @ Serial 3727  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: