toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links (up)
Author Dena Bazazian edit  isbn
openurl 
  Title Fully Convolutional Networks for Text Understanding in Scene Images Type Book Whole
  Year 2018 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Text understanding in scene images has gained plenty of attention in the computer vision community and it is an important task in many applications as text carries semantically rich information about scene content and context. For instance, reading text in a scene can be applied to autonomous driving, scene understanding or assisting visually impaired people. The general aim of scene text understanding is to localize and recognize text in scene images. Text regions are first localized in the original image by a trained detector model and afterwards fed into a recognition module. The tasks of localization and recognition are highly correlated since an inaccurate localization can affect the recognition task.
The main purpose of this thesis is to devise efficient methods for scene text understanding. We investigate how the latest results on deep learning can advance text understanding pipelines. Recently, Fully Convolutional Networks (FCNs) and derived methods have achieved a significant performance on semantic segmentation and pixel level classification tasks. Therefore, we took benefit of the strengths of FCN approaches in order to detect text in natural scenes. In this thesis we have focused on two challenging tasks of scene text understanding which are Text Detection and Word Spotting. For the task of text detection, we have proposed an efficient text proposal technique in scene images. We have considered the Text Proposals method as the baseline which is an approach to reduce the search space of possible text regions in an image. In order to improve the Text Proposals method we combined it with Fully Convolutional Networks to efficiently reduce the number of proposals while maintaining the same level of accuracy and thus gaining a significant speed up. Our experiments demonstrate that this text proposal approach yields significantly higher recall rates than the line based text localization techniques, while also producing better-quality localization. We have also applied this technique on compressed images such as videos from wearable egocentric cameras. For the task of word spotting, we have introduced a novel mid-level word representation method. We have proposed a technique to create and exploit an intermediate representation of images based on text attributes which roughly correspond to character probability maps. Our representation extends the concept of Pyramidal Histogram Of Characters (PHOC) by exploiting Fully Convolutional Networks to derive a pixel-wise mapping of the character distribution within candidate word regions. We call this representation the Soft-PHOC. Furthermore, we show how to use Soft-PHOC descriptors for word spotting tasks through an efficient text line proposal algorithm. To evaluate the detected text, we propose a novel line based evaluation along with the classic bounding box based approach. We test our method on incidental scene text images which comprises real-life scenarios such as urban scenes. The importance of incidental scene text images is due to the complexity of backgrounds, perspective, variety of script and language, short text and little linguistic context. All of these factors together makes the incidental scene text images challenging.
 
  Address November 2018  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Dimosthenis Karatzas;Andrew Bagdanov  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-948531-1-1 Medium  
  Area Expedition Conference  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ Baz2018 Serial 3220  
Permanent link to this record
 

 
Author Marçal Rusiñol edit  openurl
  Title Geometric and Structural-based Symbol Spotting. Application to Focused Retrieval in Graphic Document Collections Type Book Whole
  Year 2009 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Usually, pattern recognition systems consist of two main parts. On the one hand, the data acquisition and, on the other hand, the classification of this data on a certain category. In order to recognize which category a certain query element belongs to, a set of pattern models must be provided beforehand. An off-line learning stage is needed to train the classifier and to offer a robust classification of the patterns. Within the pattern recognition field, we are interested in the recognition of graphics and, in particular, on the analysis of documents rich in graphical information. In this context, one of the main concerns is to see if the proposed systems remain scalable with respect to the data volume so as it can handle growing amounts of symbol models. In order to avoid to work with a database of reference symbols, symbol spotting and on-the-fly symbol recognition methods have been introduced in the past years.

Generally speaking, the symbol spotting problem can be defined as the identification of a set of regions of interest from a document image which are likely to contain an instance of a certain queriedn symbol without explicitly applying the whole pattern recognition scheme. Our application framework consists on indexing a collection of graphic-rich document images. This collection is
queried by example with a single instance of the symbol to look for and, by means of symbol spotting methods we retrieve the regions of interest where the symbol is likely to appear within the documents. This kind of applications are known as focused retrieval methods.

In order that the focused retrieval application can handle large collections of documents there is a need to provide an efficient access to the large volume of information that might be stored. We use indexing strategies in order to efficiently retrieve by similarity the locations where a certain part of the symbol appears. In that scenario, graphical patterns should be used as indices for accessing and navigating the collection of documents.
These indexing mechanism allow the user to search for similar elements using graphical information rather than textual queries.

Along this thesis we present a spotting architecture and different methods aiming to build a complete focused retrieval application dealing with a graphic-rich document collections. In addition, a protocol to evaluate the performance of symbol
spotting systems in terms of recognition abilities, location accuracy and scalability is proposed.
 
  Address Barcelona (Spain)  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Josep Llados  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number DAG @ dag @ Rus2009 Serial 1264  
Permanent link to this record
 

 
Author Agnes Borras edit   pdf
openurl 
  Title Contributions to the Content-Based Image Retrieval Using Pictorial Queries Type Book Whole
  Year 2009 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The broad access to digital cameras, personal computers and Internet, has lead to the generation of large volumes of data in digital form. If we want an effective usage of this huge amount of data, we need automatic tools to allow the retrieval of relevant information. Image data is a particular type of information that requires specific techniques of description and indexing. The computer vision field that studies these kind of techniques is called Content-Based Image Retrieval (CBIR). Instead of using text-based descriptions, a system of CBIR deals on properties that are inherent in the images themselves. Hence, the feature-based description provides a universal via of image expression in contrast with the more than 6000 languages spoken in the world.
Nowadays, the CBIR is a dynamic focus of research that has derived in important applications for many professional groups. The potential fields of application can be such diverse as: the medical domain, the crime prevention, the protection of the intel- lectual property, the journalism, the graphic design, the web search, the preservation of cultural heritage, etc.
The definition on the role of the user is a key point in the development of a CBIR application. The user is in charge to formulate the queries from which the images are retrieved. We have centered our attention on the image retrieval techniques that use queries based on pictorial information. We have identified a taxonomy composed by four main query paradigms: query-by-selection, query-by-iconic-composition, query- by-sketch and query-by-paint. Each one of these paradigms allows a different degree of user expressivity. From a simple image selection, to a complete painting of the query, the user takes control of the input in the CBIR system.
Along the chapters of this thesis we have analyzed the influence that each query paradigm imposes in the internal operations of a CBIR system. Moreover, we have proposed a set of contributions that we have exemplified in the context of a final application.
 
  Address Barcelona (Spain)  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Bellaterra Editor Josep Llados  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; Approved no  
  Call Number DAG @ dag @ Bor2009; IAM @ iam @ Bor2009 Serial 1269  
Permanent link to this record
 

 
Author Marçal Rusiñol; Josep Llados edit  isbn
openurl 
  Title Symbol Spotting in Digital Libraries:Focused Retrieval over Graphic-rich Document Collections Type Book Whole
  Year 2010 Publication Symbol Spotting in Digital Libraries:Focused Retrieval over Graphic-rich Document Collections Abbreviated Journal  
  Volume Issue Pages  
  Keywords Focused Retrieval , Graphical Pattern Indexation,Graphics Recognition ,Pattern Recognition , Performance Evaluation , Symbol Description ,Symbol Spotting  
  Abstract The specific problem of symbol recognition in graphical documents requires additional techniques to those developed for character recognition. The most well-known obstacle is the so-called Sayre paradox: Correct recognition requires good segmentation, yet improvement in segmentation is achieved using information provided by the recognition process. This dilemma can be avoided by techniques that identify sets of regions containing useful information. Such symbol-spotting methods allow the detection of symbols in maps or technical drawings without having to fully segment or fully recognize the entire content.

This unique text/reference provides a complete, integrated and large-scale solution to the challenge of designing a robust symbol-spotting method for collections of graphic-rich documents. The book examines a number of features and descriptors, from basic photometric descriptors commonly used in computer vision techniques to those specific to graphical shapes, presenting a methodology which can be used in a wide variety of applications. Additionally, readers are supplied with an insight into the problem of performance evaluation of spotting methods. Some very basic knowledge of pattern recognition, document image analysis and graphics recognition is assumed.
 
  Address  
  Corporate Author Thesis  
  Publisher Springer Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-1-84996-208-7 Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number DAG @ dag @ RuL2010a Serial 1292  
Permanent link to this record
 

 
Author Muhammad Muzzamil Luqman; Thierry Brouard; Jean-Yves Ramel; Josep Llados edit  openurl
  Title Vers une approche foue of encapsulation de graphes: application a la reconnaissance de symboles Type Conference Article
  Year 2010 Publication Colloque International Francophone sur l'Écrit et le Document Abbreviated Journal  
  Volume Issue Pages 169-184  
  Keywords Fuzzy interval; Graph embedding; Bayesian network; Symbol recognition  
  Abstract We present a new methodology for symbol recognition, by employing a structural approach for representing visual associations in symbols and a statistical classifier for recognition. A graphic symbol is vectorized, its topological and geometrical details are encoded by an attributed relational graph and a signature is computed for it. Data adapted fuzzy intervals have been introduced for addressing the sensitivity of structural representations to noise. The joint probability distribution of signatures is encoded by a Bayesian network, which serves as a mechanism for pruning irrelevant features and choosing a subset of interesting features from structural signatures of underlying symbol set, and is deployed in a supervised learning scenario for recognizing query symbols. Experimental results on pre-segmented 2D linear architectural and electronic symbols from GREC databases are presented.  
  Address Sousse, Tunisia  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CIFED  
  Notes DAG Approved no  
  Call Number DAG @ dag @ LBR2010a Serial 1293  
Permanent link to this record
 

 
Author Joan Mas edit  isbn
openurl 
  Title A Syntactic Pattern Recognition Approach based on a Distribution Tolerant Adjacency Grammar and a Spatial Indexed Parser. Application to Sketched Document Recognition Type Book Whole
  Year 2010 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Sketch recognition is a discipline which has gained an increasing interest in the last
20 years. This is due to the appearance of new devices such as PDA, Tablet PC’s
or digital pen & paper protocols. From the wide range of sketched documents we
focus on those that represent structured documents such as: architectural floor-plans,
engineering drawing, UML diagrams, etc. To recognize and understand these kinds
of documents, first we have to recognize the different compounding symbols and then
we have to identify the relations between these elements. From the way that a sketch
is captured, there are two categories: on-line and off-line. On-line input modes refer
to draw directly on a PDA or a Tablet PC’s while off-line input modes refer to scan
a previously drawn sketch.
This thesis is an overlapping of three different areas on Computer Science: Pattern
Recognition, Document Analysis and Human-Computer Interaction. The aim of this
thesis is to interpret sketched documents independently on whether they are captured
on-line or off-line. For this reason, the proposed approach should contain the following
features. First, as we are working with sketches the elements present in our input
contain distortions. Second, as we would work in on-line or off-line input modes, the
order in the input of the primitives is indifferent. Finally, the proposed method should
be applied in real scenarios, its response time must be slow.
To interpret a sketched document we propose a syntactic approach. A syntactic
approach is composed of two correlated components: a grammar and a parser. The
grammar allows describing the different elements on the document as well as their
relations. The parser, given a document checks whether it belongs to the language
generated by the grammar or not. Thus, the grammar should be able to cope with
the distortions appearing on the instances of the elements. Moreover, it would be
necessary to define a symbol independently of the order of their primitives. Concerning to the parser when analyzing 2D sentences, it does not assume an order in the
primitives. Then, at each new primitive in the input, the parser searches among the
previous analyzed symbols candidates to produce a valid reduction.
Taking into account these features, we have proposed a grammar based on Adjacency Grammars. This kind of grammars defines their productions as a multiset
of symbols rather than a list. This allows describing a symbol without an order in
their components. To cope with distortion we have proposed a distortion model.
This distortion model is an attributed estimated over the constraints of the grammar and passed through the productions. This measure gives an idea on how far is the
symbol from its ideal model. In addition to the distortion on the constraints other
distortions appear when working with sketches. These distortions are: overtracing,
overlapping, gaps or spurious strokes. Some grammatical productions have been defined to cope with these errors. Concerning the recognition, we have proposed an
incremental parser with an indexation mechanism. Incremental parsers analyze the
input symbol by symbol given a response to the user when a primitive is analyzed.
This makes incremental parser suitable to work in on-line as well as off-line input
modes. The parser has been adapted with an indexation mechanism based on a spatial division. This indexation mechanism allows setting the primitives in the space
and reducing the search to a neighbourhood.
A third contribution is a grammatical inference algorithm. This method given a
set of symbols captures the production describing it. In the field of formal languages,
different approaches has been proposed but in the graphical domain not so much work
is done in this field. The proposed method is able to capture the production from
a set of symbol although they are drawn in different order. A matching step based
on the Haussdorff distance and the Hungarian method has been proposed to match
the primitives of the different symbols. In addition the proposed approach is able to
capture the variability in the parameters of the constraints.
From the experimental results, we may conclude that we have proposed a robust
approach to describe and recognize sketches. Moreover, the addition of new symbols
to the alphabet is not restricted to an expert. Finally, the proposed approach has
been used in two real scenarios obtaining a good performance.
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Gemma Sanchez;Josep Llados  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-937261-4-0 Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number DAG @ dag @ Mas2010 Serial 1334  
Permanent link to this record
 

 
Author Anjan Dutta edit  openurl
  Title Symbol Spotting in Graphical Documents by Serialized Subgraph Matching Type Report
  Year 2010 Publication CVC Technical Report Abbreviated Journal  
  Volume 159 Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis Master's thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number Admin @ si @ Dut2010 Serial 1351  
Permanent link to this record
 

 
Author David Fernandez edit  openurl
  Title Handwritten Word Spotting in Old Manuscript Images using Shape Descriptors Type Report
  Year 2010 Publication CVC Technical Report Abbreviated Journal  
  Volume 161 Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis Master's thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number Admin @ si @ Fer2010b Serial 1353  
Permanent link to this record
 

 
Author Marçal Rusiñol; Josep Llados edit  openurl
  Title Efficient Logo Retrieval Through Hashing Shape Context Descriptors Type Conference Article
  Year 2010 Publication 9th IAPR International Workshop on Document Analysis Systems Abbreviated Journal  
  Volume Issue Pages 215–222  
  Keywords  
  Abstract In this paper, we present an approach towards the retrieval of words from graphical document images. In graphical documents, due to presence of multi-oriented characters in non-structured layout, word indexing is a challenging task. The proposed approach uses recognition results of individual components to form character pairs with the neighboring components. An indexing scheme is designed to store the spatial description of components and to access them efficiently. Given a query text word (ascii/unicode format), the character pairs present in it are searched in the document. Next the retrieved character pairs are linked sequentially to form character string. Dynamic programming is applied to find different instances of query words. A string edit distance is used here to match the query word as the objective function. Recognition of multi-scale and multi-oriented character component is done using Support Vector Machine classifier. To consider multi-oriented character strings the features used in the SVM are invariant to character orientation. Experimental results show that the method is efficient to locate a query word from multi-oriented text in graphical documents.  
  Address Boston; USA  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference DAS  
  Notes DAG Approved no  
  Call Number DAG @ dag @ RuL2010b Serial 1434  
Permanent link to this record
 

 
Author Herve Locteau; Sebastien Mace; Ernest Valveny; Salvatore Tabbone edit  openurl
  Title Extraction des pieces de un plan de habitation Type Conference Article
  Year 2010 Publication Colloque Internacional Francophone de l´Ecrit et le Document Abbreviated Journal  
  Volume Issue Pages 1–12  
  Keywords  
  Abstract In this article, a method to extract the rooms of an architectural floor plan image is described. We first present a line detection algorithm to extract long lines in the image. Those lines are analyzed to identify the existing walls. From this point, room extraction can be seen as a classical segmentation task for which each region corresponds to a room. The chosen resolution strategy consists in recursively decomposing the image until getting nearly convex regions. The notion of convexity is difficult to quantify, and the selection of separation lines can also be rough. Thus, we take advantage of knowledge associated to architectural floor plans in order to obtain mainly rectangular rooms. Preliminary tests on a set of real documents show promising results.  
  Address Sousse, Tunisia  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CIFED  
  Notes DAG Approved no  
  Call Number DAG @ dag @ LMV2010 Serial 1440  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: