Author Oriol Pujol; David Masip
  Title Geometry-Based Ensembles: Toward a Structural Characterization of the Classification Boundary Type Journal Article
  Year 2009 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI  
  Volume 31 Issue 6 Pages 1140–1146  
  Keywords  
  Abstract This article introduces a novel binary discriminative learning technique based on the approximation of the non-linear decision boundary by a piece-wise linear smooth additive model. The decision border is geometrically defined by means of the characterizing boundary points – points that belong to the optimal boundary under a certain notion of robustness. Based on these points, a set of locally robust linear classifiers is defined and assembled by means of a Tikhonov regularized optimization procedure in an additive model to create a final lambda-smooth decision rule. As a result, a very simple and robust classifier with a strong geometrical meaning and non-linear behavior is obtained. The simplicity of the method allows its extension to cope with some of today's machine learning challenges, such as online learning, large scale learning or parallelization, with linear computational complexity. We validate our approach on the UCI database. Finally, we apply our technique in online and large scale scenarios, and in six real life computer vision and pattern recognition problems: gender recognition, intravascular ultrasound tissue classification, speed traffic sign detection, Chagas' disease severity detection, clef classification and action recognition using 3D accelerometer data. The results are promising and this paper opens a line of research that deserves further attention.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium (up)  
  Area Expedition Conference  
  Notes OR;HuPBA;MV Approved no  
  Call Number BCNPCL @ bcnpcl @ PuM2009 Serial 1252  
Permanent link to this record
 

 
Author J. Oliver; Ricardo Toledo; J. Pujol; J. Sorribes; E. Valderrama
  Title Un ABP basado en la robotica para las ingenierias informaticas Type Miscellaneous
  Year 2009 Publication 15th Jornadas de Enseñanza Universitaria de la Informatica Abbreviated Journal  
  Volume Issue Pages 331–338  
  Keywords  
  Abstract  
  Address Barcelona, Spain  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-692-2758-9 Medium (up)  
  Area Expedition Conference JENUI  
  Notes ADAS Approved no  
  Call Number Admin @ si @ OTP2009 Serial 1253  
Permanent link to this record
 

 
Author Eduard Vazquez
  Title Distribution Characterization using Topological Features. Application to Colour Image Processing Type Report
  Year 2007 Publication CVC Technical Report # 107 Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis Master's thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium (up)  
  Area Expedition Conference  
  Notes Approved no  
  Call Number Admin @ si @ Vaz2009 Serial 1254  
Permanent link to this record
 

 
Author Oscar Camara; Estanislao Oubel; Gemma Piella; Simone Balocco; Mathieu De Craene; Alejandro F. Frangi
  Title Multi-sequence Registration of Cine, Tagged and Delay-Enhancement MRI with Shift Correction and Steerable Pyramid-Based Detagging Type Conference Article
  Year 2009 Publication 5th International Conference on Functional Imaging and Modeling of the Heart Abbreviated Journal  
  Volume 5528 Issue Pages 330–338  
  Keywords  
  Abstract In this work, we present a registration framework for cardiac cine MRI (cMRI), tagged (tMRI) and delay-enhancement MRI (deMRI), in which the two main issues in finding an accurate alignment between these images have been taken into account: the presence of tags in tMRI and respiration artifacts in all sequences. A steerable pyramid image decomposition has been used for detagging purposes since it is suitable to extract high-order oriented structures by directional adaptive filtering. Shift correction of cMRI is achieved by first maximizing the similarity between the Long Axis and Short Axis cMRI. Subsequently, these shift-corrected images are used as target images in a rigid registration procedure with their corresponding tMRI/deMRI in order to correct their shift. The proposed registration framework has been evaluated by 840 registration tests, considerably improving the alignment of the MR images (mean RMS error of 2.04 mm vs. 5.44 mm).  
  Address Nice, France  
  Corporate Author Thesis  
  Publisher Springer Berlin Heidelberg Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN 0302-9743 ISBN 978-3-642-01931-9 Medium (up)  
  Area Expedition Conference FIMH  
  Notes MILAB Approved no  
  Call Number BCNPCL @ bcnpcl @ COP2009 Serial 1255  
Permanent link to this record
 

 
Author Fadi Dornaika; Bogdan Raducanu
  Title Simultaneous 3D face pose and person-specific shape estimation from a single image using a holistic approach Type Conference Article
  Year 2009 Publication IEEE Workshop on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract This paper presents a new approach for the simultaneous estimation of the 3D pose and specific shape of a previously unseen face from a single image. The face pose is not limited to a frontal view. We describe a holistic approach based on a deformable 3D model and a learned statistical facial texture model. Rather than obtaining a person-specific facial surface, the goal of this work is to compute person-specific 3D face shape in terms of a few control parameters that are used by many applications. The proposed holistic approach estimates the 3D pose parameters as well as the face shape control parameters by registering the warped texture to a statistical face texture, which is carried out by a stochastic and genetic optimizer. The proposed approach has several features that make it very attractive: (i) it uses a single grey-scale image, (ii) it is person-independent, (iii) it is featureless (no facial feature extraction is required), and (iv) its learning stage is easy. The proposed approach lends itself nicely to 3D face tracking and face gesture recognition in monocular videos. We describe extensive experiments that show the feasibility and robustness of the proposed approach.  
  Address Utah, USA  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1550-5790 ISBN 978-1-4244-5497-6 Medium (up)  
  Area Expedition Conference WACV  
  Notes OR;MV Approved no  
  Call Number BCNPCL @ bcnpcl @ DoR2009b Serial 1256  
Permanent link to this record
 

 
Author Bogdan Raducanu; Fadi Dornaika
  Title Natural Facial Expression Recognition Using Dynamic and Static Schemes Type Conference Article
  Year 2009 Publication 5th International Symposium on Visual Computing Abbreviated Journal  
  Volume 5875 Issue Pages 730–739  
  Keywords  
  Abstract Affective computing is at the core of a new paradigm in HCI and AI represented by human-centered computing. Within this paradigm, it is expected that machines will be endowed with perceptual capabilities, making them aware of users' affective state. The current paper addresses the problem of facial expression recognition from monocular video sequences. We propose a dynamic facial expression recognition scheme, which is proven to be very efficient. Furthermore, it is compared with several static-based systems adopting different magnitudes of facial expression. We provide evaluations of performance using Linear Discriminant Analysis (LDA), Non-parametric Discriminant Analysis (NDA), and Support Vector Machines (SVM). We also provide performance evaluations using arbitrary test video sequences.  
  Address Las Vegas, USA  
  Corporate Author Thesis  
  Publisher Springer Berlin Heidelberg Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN 0302-9743 ISBN 978-3-642-10330-8 Medium (up)  
  Area Expedition Conference ISVC  
  Notes OR;MV Approved no  
  Call Number BCNPCL @ bcnpcl @ RaD2009 Serial 1257  
Permanent link to this record
 

 
Author Sergio Escalera; Oriol Pujol; J. Mauri; Petia Radeva
  Title Intravascular Ultrasound Tissue Characterization with Sub-class Error-Correcting Output Codes Type Journal Article
  Year 2009 Publication Journal of Signal Processing Systems Abbreviated Journal  
  Volume 55 Issue 1-3 Pages 35–47  
  Keywords  
  Abstract Intravascular ultrasound (IVUS) represents a powerful imaging technique to explore coronary vessels and to study their morphology and histologic properties. In this paper, we characterize different tissues based on radial frequency, texture-based, and combined features. To deal with the classification of multiple tissues, we require the use of robust multi-class learning techniques. In this sense, error-correcting output codes (ECOC) have been shown to robustly combine binary classifiers to solve multi-class problems. In this context, we propose a strategy to model multi-class classification tasks using sub-class information in the ECOC framework. The new strategy splits the classes into different sub-sets according to the applied base classifier. Complex IVUS data sets containing overlapping data are learnt by splitting the original set of classes into sub-classes, and embedding the binary problems in a problem-dependent ECOC design. The method automatically characterizes different tissues, showing performance improvements over the state-of-the-art ECOC techniques for different base classifiers. Furthermore, the combination of RF and texture-based features also shows improvements over the state-of-the-art approaches.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1939-8018 ISBN Medium (up)  
  Area Expedition Conference  
  Notes MILAB;HuPBA Approved no  
  Call Number BCNPCL @ bcnpcl @ EPM2009 Serial 1258  
Permanent link to this record
 

 
Author Anjan Dutta; Zeynep Akata
  Title Semantically Tied Paired Cycle Consistency for Zero-Shot Sketch-based Image Retrieval Type Conference Article
  Year 2019 Publication 32nd IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 5089-5098  
  Keywords  
  Abstract Zero-shot sketch-based image retrieval (SBIR) is an emerging task in computer vision that allows retrieving natural images relevant to sketch queries that might not have been seen in the training phase. Existing works require either aligned sketch-image pairs or an inefficient memory fusion layer for mapping the visual information to a semantic space. In this work, we propose a semantically aligned paired cycle-consistent generative (SEM-PCYC) model for zero-shot SBIR, where each branch maps the visual information to a common semantic space via adversarial training. Each of these branches maintains a cycle consistency that only requires supervision at the category level, and avoids the need for costly aligned sketch-image pairs. A classification criterion on the generators' outputs ensures that the visual to semantic space mapping is discriminating. Furthermore, we propose to combine textual and hierarchical side information via a feature selection auto-encoder that selects discriminating side information within the same end-to-end model. Our results demonstrate a significant boost in zero-shot SBIR performance over the state-of-the-art on the challenging Sketchy and TU-Berlin datasets.  
  Address Long Beach, California, USA; June 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium (up)  
  Area Expedition Conference CVPR  
  Notes DAG; 600.141; 600.121 Approved no  
  Call Number Admin @ si @ DuA2019 Serial 3268  
Permanent link to this record
 

 
Author Oriol Pujol; Eloi Puertas; Carlo Gatta
  Title Multi-scale Stacked Sequential Learning Type Conference Article
  Year 2009 Publication 8th International Workshop of Multiple Classifier Systems Abbreviated Journal  
  Volume 5519 Issue Pages 262–271  
  Keywords  
  Abstract One of the most widely used assumptions in supervised learning is that data is independent and identically distributed. This assumption does not hold true in many real cases. Sequential learning is the discipline of machine learning that deals with dependent data such that neighboring examples exhibit some kind of relationship. In the literature, there are different approaches that try to capture and exploit this correlation by means of different methodologies. In this paper we focus on meta-learning strategies and, in particular, the stacked sequential learning approach. The main contribution of this work is two-fold: first, we generalize the stacked sequential learning. This generalization reflects the key role of modeling neighboring interactions. Second, we propose an effective and efficient way of capturing and exploiting sequential correlations that takes into account long-range interactions by means of a multi-scale pyramidal decomposition of the predicted labels. Additionally, this new method subsumes the standard stacked sequential learning approach. We tested the proposed method on two different classification tasks: text line classification in a FAQ data set and image classification. Results on these tasks clearly show that our approach outperforms the standard stacked sequential learning. Moreover, we show that the proposed method allows control of the trade-off between the detail and the desired range of the interactions.  
  Address Reykjavik, Iceland  
  Corporate Author Thesis  
  Publisher Springer Berlin Heidelberg Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 0302-9743 ISBN 978-3-642-02325-5 Medium (up)  
  Area Expedition Conference MCS  
  Notes MILAB;HuPBA Approved no  
  Call Number BCNPCL @ bcnpcl @ PPG2009 Serial 1260  
Permanent link to this record
 

 
Author David Rotger
  Title Analysis and Multi-Modal Fusion of coronary Images Type Book Whole
  Year 2009 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract This thesis studies in detail different techniques and tools for medical image registration in order to ease the daily work of clinical experts in cardiology. The first aim of this thesis is to provide computer tools for fusing IVUS and angiogram data, which is of high clinical interest because it helps physicians orient themselves in IVUS data and decide which lesion is observed, how long it is, how far it lies from a bifurcation or another lesion, etc. This thesis proves and validates that we can segment the catheter path in angiographies using geodesic snakes (based on the fast marching algorithm), a three-dimensional reconstruction of the catheter inspired by stereo vision, and a new technique to fuse IVUS and angiograms that establishes exact correspondences between them. We have developed a new workstation called iFusion that has four strong advantages: it registers IVUS and angiographic images with sub-pixel precision, it works on- and off-line, it is independent of the X-ray system, and it needs no daily calibration. The second aim of the thesis is to develop a computer-aided analysis of IVUS for image-guided intervention. We have designed, implemented and validated a robust algorithm for stent extraction and reconstruction from IVUS videos. We consider a very special and recent kind of stent, bioabsorbable stents, which represent a great clinical challenge due to their property of being absorbed over time, thus avoiding the "danger" of neostenosis, one of the main problems of metallic stents. We present a new and very promising algorithm based on an optimized cascade of multiple classifiers to automatically detect individual stent struts of a very novel bioabsorbable drug eluting coronary stent. This problem represents a very challenging target given the variability in contrast, shape and grey levels of the regions to be detected, as reflected by the high variability between the specialists (inter-observer variability of 0.14 ± 0.12). The obtained results of the automatic strut detection are within the inter-observer variability.
 
  Address Barcelona (Spain)  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Petia Radeva  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium (up)  
  Area Expedition Conference  
  Notes Approved no  
  Call Number Admin @ si @ Rot2009 Serial 1261  
Permanent link to this record
 

 
Author Xavier Baro
  Title Probabilistic Darwin Machines: A New Approach to Develop Evolutionary Object Detection Type Book Whole
  Year 2009 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Ever since computers were invented, we have wondered whether they might perform some everyday human tasks. One of the most studied, yet still least understood, problems is the capacity to learn from our experiences and to generalize the knowledge that we acquire. One task that people perform unconsciously, and that has attracted interest in different scientific areas from the beginning, is pattern recognition. Creating models that represent the world that surrounds us helps us to recognize objects in our environment, to predict situations, and to identify behaviors. All this information allows us to adapt ourselves and to interact with our environment. The capacity of individuals to adapt to their environment has been related to the number of patterns that they are capable of identifying.

This thesis approaches the pattern recognition problem from a Computer Vision point of view, taking one of the most paradigmatic and widespread approaches to object detection as its starting point. After studying this approach, two weak points are identified: the first concerns the description of the objects, and the second is a limitation of the learning algorithm, which hampers the use of better descriptors.

In order to address the learning limitations, we introduce evolutionary computation techniques to the classical object detection approach.

After testing the classical evolutionary approaches, such as genetic algorithms, we develop a new learning algorithm based on Probabilistic Darwin Machines, which better adapts to the learning problem. Once the learning limitation is avoided, we introduce a new feature set, which maintains the benefits of the classical feature set, adding the ability to describe non localities. This combination of evolutionary learning algorithm and features is tested on different public data sets, outperforming the results obtained by the classical approach.
 
  Address Barcelona (Spain)  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Jordi Vitria  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium (up)  
  Area Expedition Conference  
  Notes OR;HuPBA;MV Approved no  
  Call Number BCNPCL @ bcnpcl @ Bar2009 Serial 1262  
Permanent link to this record
 

 
Author Agata Lapedriza
  Title Multitask Learning Techniques for Automatic Face Classification Type Book Whole
  Year 2009 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Automatic face classification is currently a popular research area in Computer Vision. It involves several subproblems, such as subject recognition, gender classification or subject verification.

Current automatic face classification systems need a large amount of training data to robustly learn a task. However, collecting labeled data is usually difficult. For this reason, research on methods that are able to learn from a small training set is essential.

The dependency on the abundance of training data is not so evident in human learning processes. We are able to learn from a very small number of examples, given that we use, additionally, some prior knowledge to learn a new task. For example, we frequently find patterns and analogies from other domains to reuse them in new situations, or exploit training data from other experiences.

In computer science, Multitask Learning is a new Machine Learning approach that studies this idea of knowledge transfer among different tasks, to overcome the effects of the small sample size problem.

This thesis explores, proposes and tests some Multitask Learning methods specially developed for face classification purposes. Moreover, it presents two more contributions dealing with the small sample size problem, outside the Multitask Learning context. The first one is a method to extract external face features, to be used as an additional information source in automatic face classification problems. The second one is an empirical study on the most suitable face image resolution to perform automatic subject recognition.
 
  Address Barcelona (Spain)  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Jordi Vitria;David Masip  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium (up)  
  Area Expedition Conference  
  Notes OR;MV Approved no  
  Call Number BCNPCL @ bcnpcl @ Lap2009 Serial 1263  
Permanent link to this record
 

 
Author Marçal Rusiñol
  Title Geometric and Structural-based Symbol Spotting. Application to Focused Retrieval in Graphic Document Collections Type Book Whole
  Year 2009 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Usually, pattern recognition systems consist of two main parts: on the one hand, data acquisition and, on the other hand, the classification of this data into a certain category. In order to recognize which category a certain query element belongs to, a set of pattern models must be provided beforehand. An off-line learning stage is needed to train the classifier and to offer a robust classification of the patterns. Within the pattern recognition field, we are interested in the recognition of graphics and, in particular, in the analysis of documents rich in graphical information. In this context, one of the main concerns is whether the proposed systems remain scalable with respect to the data volume, so that they can handle growing numbers of symbol models. To avoid working with a database of reference symbols, symbol spotting and on-the-fly symbol recognition methods have been introduced in recent years.

Generally speaking, the symbol spotting problem can be defined as the identification of a set of regions of interest from a document image which are likely to contain an instance of a certain queried symbol without explicitly applying the whole pattern recognition scheme. Our application framework consists of indexing a collection of graphic-rich document images. This collection is queried by example with a single instance of the symbol to look for and, by means of symbol spotting methods, we retrieve the regions of interest where the symbol is likely to appear within the documents. Applications of this kind are known as focused retrieval methods.

For the focused retrieval application to handle large collections of documents, it must provide efficient access to the large volume of information that might be stored. We use indexing strategies in order to efficiently retrieve by similarity the locations where a certain part of the symbol appears. In that scenario, graphical patterns should be used as indices for accessing and navigating the collection of documents. These indexing mechanisms allow the user to search for similar elements using graphical information rather than textual queries.

Throughout this thesis we present a spotting architecture and different methods aiming to build a complete focused retrieval application dealing with graphic-rich document collections. In addition, a protocol to evaluate the performance of symbol spotting systems in terms of recognition abilities, location accuracy and scalability is proposed.
 
  Address Barcelona (Spain)  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Josep Llados  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium (up)  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number DAG @ dag @ Rus2009 Serial 1264  
Permanent link to this record
 

 
Author Alicia Fornes
  Title Writer Identification by a Combination of Graphical Features in the Framework of Old Handwritten Music Scores Type Book Whole
  Year 2009 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The analysis and recognition of historical document images has attracted growing interest in recent years. Mass digitization and document image understanding allow the preservation, access and indexing of this artistic, cultural and technical heritage. The analysis of handwritten documents is an outstanding subfield. The main interest is not only the transcription of the document to a standard format, but also the identification of the author of a document from a set of writers (namely writer identification).

Writer identification in handwritten text documents is an active area of study; however, the identification of the writer of graphical documents is still a challenge. The main objective of this thesis is the identification of the writer in old music scores, as an example of graphic documents. Concerning old music scores, many historical archives contain a huge number of sheets of musical compositions without information about the composer, and research in this field could be helpful for musicologists.

The writer identification framework proposed in this thesis combines three different writer identification approaches, which are the main scientific contributions. The first one is based on symbol recognition methods. For this purpose, two novel symbol recognition methods are proposed for coping with the typical distortions in hand-drawn symbols. The second approach preprocesses the music score for obtaining music lines, and extracts information about the slant, width of the writing, connected components, contours and fractals. Finally, the third approach extracts global information by generating texture images from the music scores and extracting textural features (such as Gabor filters and co-occurrence matrices).

The high identification rates obtained in the experimental results demonstrate the suitability of the proposed ensemble architecture. To the best of our knowledge, this work is the first contribution on writer identification from images containing graphical languages.
 
  Address Barcelona (Spain)  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Josep Llados;Gemma Sanchez  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium (up)  
  Area Expedition Conference  
  Notes Approved no  
  Call Number DAG @ dag @ For2009 Serial 1265  
Permanent link to this record
 

 
Author Jose Antonio Rodriguez
  Title Statistical frameworks and prior information modeling in handwritten word-spotting Type Book Whole
  Year 2009 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Handwritten word-spotting (HWS) is the pattern analysis task that consists in finding keywords in handwritten document images. So far, HWS has been applied mostly to historical documents in order to build search engines for such image collections. This thesis addresses the problem of word-spotting for detecting important keywords in business documents. This is a first step towards the process of automatic routing of correspondence based on content.

However, the application of traditional HWS techniques fails for this type of document. As opposed to historical documents, real business documents present a very high variability in terms of writing styles, spontaneous writing, crossed-out words, spelling mistakes, etc. The main goal of this thesis is the development of pattern recognition techniques that lead to a high-performance HWS system for this challenging type of data.

We develop a statistical framework in which word models are expressed in terms of hidden Markov models and the a priori information is encoded in a universal vocabulary of Gaussian codewords. This system leads to very robust performance in the word-spotting task. We also find that by constraining the word models to the universal vocabulary, the a priori information of the problem of interest can be exploited for developing new contributions. These include a novel writer adaptation method, a system for searching handwritten words by generating typed text images, and a novel model-based similarity between feature vector sequences.
 
  Address Barcelona (Spain)  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Gemma Sanchez;Josep Llados;Florent Perronnin  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium (up)  
  Area Expedition Conference  
  Notes Approved no  
  Call Number Admin @ si @ Rod2009 Serial 1266  
Permanent link to this record