toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author Volkmar Frinken; Andreas Fischer; Carlos David Martinez Hinarejos edit   pdf
doi  isbn
openurl 
  Title Handwriting Recognition in Historical Documents using Very Large Vocabularies Type Conference Article
  Year 2013 Publication 2nd International Workshop on Historical Document Imaging and Processing Abbreviated Journal  
  Volume (down) Issue Pages 67-72  
  Keywords  
  Abstract Language models are used in automatic transcription system to resolve ambiguities. This is done by limiting the vocabulary of words that can be recognized as well as estimating the n-gram probability of the words in the given text. In the context of historical documents, a non-unified spelling and the limited amount of written text pose a substantial problem for the selection of the recognizable vocabulary as well as the computation of the word probabilities. In this paper we propose for the transcription of historical Spanish text to keep the corpus for the n-gram limited to a sample of the target text, but expand the vocabulary with words gathered from external resources. We analyze the performance of such a transcription system with different sizes of external vocabularies and demonstrate the applicability and the significant increase in recognition accuracy of using up to 300 thousand external words.  
  Address Washington; USA; August 2013  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-1-4503-2115-0 Medium  
  Area Expedition Conference HIP  
  Notes DAG; 600.056; 600.045; 600.061; 602.006; 602.101 Approved no  
  Call Number Admin @ si @ FFM2013 Serial 2296  
Permanent link to this record
 

 
Author Hongxing Gao; Marçal Rusiñol; Dimosthenis Karatzas; Apostolos Antonacopoulos; Josep Llados edit   pdf
openurl 
  Title An interactive appearance-based document retrieval system for historical newspapers Type Conference Article
  Year 2013 Publication Proceedings of the International Conference on Computer Vision Theory and Applications Abbreviated Journal  
  Volume (down) Issue Pages 84-87  
  Keywords  
  Abstract In this paper we present a retrieval-based application aimed at assisting a user to semi-automatically segment an incoming flow of historical newspaper images by automatically detecting a particular type of pages based on their appearance. A visual descriptor is used to assess page similarity while a relevance feedback process allow refining the results iteratively. The application is tested on a large dataset of digitised historic newspapers.  
  Address Barcelona; February 2013  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference VISAPP  
  Notes DAG; 600.056; 600.045; 605.203 Approved no  
  Call Number Admin @ si @ GRK2013a Serial 2290  
Permanent link to this record
 

 
Author Albert Gordo edit  openurl
  Title Document Image Representation, Classification and Retrieval in Large-Scale Domains Type Book Whole
  Year 2013 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume (down) Issue Pages  
  Keywords  
  Abstract Despite the “paperless office” ideal that started in the decade of the seventies, businesses still strive against an increasing amount of paper documentation. Companies still receive huge amounts of paper documentation that need to be analyzed and processed, mostly in a manual way. A solution for this task consists in, first, automatically scanning the incoming documents. Then, document images can be analyzed and information can be extracted from the data. Documents can also be automatically dispatched to the appropriate workflows, used to retrieve similar documents in the dataset to transfer information, etc.

Due to the nature of this “digital mailroom”, we need document representation methods to be general, i.e., able to cope with very different types of documents. We need the methods to be sound, i.e., able to cope with unexpected types of documents, noise, etc. And, we need to methods to be scalable, i.e., able to cope with thousands or millions of documents that need to be processed, stored, and consulted. Unfortunately, current techniques of document representation, classification and retrieval are not apt for this digital mailroom framework, since they do not fulfill some or all of these requirements.

Through this thesis we focus on the problem of document representation aimed at classification and retrieval tasks under this digital mailroom framework. We first propose a novel document representation based on runlength histograms, and extend it to cope with more complex documents such as multiple-page documents, or documents that contain more sources of information such as extracted OCR text. Then we focus on the scalability requirements and propose a novel binarization method which we dubbed PCAE, as well as two general asymmetric distances between binary embeddings that can significantly improve the retrieval results at a minimal extra computational cost. Finally, we note the importance of supervised learning when performing large-scale retrieval, and study several approaches that can significantly boost the results at no extra cost at query time.
 
  Address Barcelona  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Ernest Valveny;Florent Perronnin  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number Admin @ si @ Gor2013 Serial 2277  
Permanent link to this record
 

 
Author Muhammad Muzzamil Luqman; Jean-Yves Ramel; Josep Llados edit  url
doi  isbn
openurl 
  Title Multilevel Analysis of Attributed Graphs for Explicit Graph Embedding in Vector Spaces Type Book Chapter
  Year 2013 Publication Graph Embedding for Pattern Analysis Abbreviated Journal  
  Volume (down) Issue Pages 1-26  
  Keywords  
  Abstract Ability to recognize patterns is among the most crucial capabilities of human beings for their survival, which enables them to employ their sophisticated neural and cognitive systems [1], for processing complex audio, visual, smell, touch, and taste signals. Man is the most complex and the best existing system of pattern recognition. Without any explicit thinking, we continuously compare, classify, and identify huge amount of signal data everyday [2], starting from the time we get up in the morning till the last second we fall asleep. This includes recognizing the face of a friend in a crowd, a spoken word embedded in noise, the proper key to lock the door, smell of coffee, the voice of a favorite singer, the recognition of alphabetic characters, and millions of more tasks that we perform on regular basis.  
  Address  
  Corporate Author Thesis  
  Publisher Springer New York Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-1-4614-4456-5 Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number Admin @ si @ LRL2013b Serial 2271  
Permanent link to this record
 

 
Author Marçal Rusiñol; R.Roset; Josep Llados; C.Montaner edit  openurl
  Title Automatic Index Generation of Digitized Map Series by Coordinate Extraction and Interpretation Type Conference Article
  Year 2011 Publication In Proceedings of the Sixth International Workshop on Digital Technologies in Cartographic Heritage Abbreviated Journal  
  Volume (down) Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CartoHerit  
  Notes DAG Approved no  
  Call Number Admin @ si @ RRL2011b Serial 1978  
Permanent link to this record
 

 
Author Jon Almazan; David Fernandez; Alicia Fornes; Josep Llados; Ernest Valveny edit   pdf
doi  isbn
openurl 
  Title A Coarse-to-Fine Approach for Handwritten Word Spotting in Large Scale Historical Documents Collection Type Conference Article
  Year 2012 Publication 13th International Conference on Frontiers in Handwriting Recognition Abbreviated Journal  
  Volume (down) Issue Pages 453-458  
  Keywords  
  Abstract In this paper we propose an approach for word spotting in handwritten document images. We state the problem from a focused retrieval perspective, i.e. locating instances of a query word in a large scale dataset of digitized manuscripts. We combine two approaches, namely one based on word segmentation and another one segmentation-free. The first approach uses a hashing strategy to coarsely prune word images that are unlikely to be instances of the query word. This process is fast but has a low precision due to the errors introduced in the segmentation step. The regions containing candidate words are sent to the second process based on a state of the art technique from the visual object detection field. This discriminative model represents the appearance of the query word and computes a similarity score. In this way we propose a coarse-to-fine approach achieving a compromise between efficiency and accuracy. The validation of the model is shown using a collection of old handwritten manuscripts. We appreciate a substantial improvement in terms of precision regarding the previous proposed method with a low computational cost increase.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-1-4673-2262-1 Medium  
  Area Expedition Conference ICFHR  
  Notes DAG Approved no  
  Call Number DAG @ dag @ AFF2012 Serial 1983  
Permanent link to this record
 

 
Author Jon Almazan; Albert Gordo; Alicia Fornes; Ernest Valveny edit   pdf
url  isbn
openurl 
  Title Efficient Exemplar Word Spotting Type Conference Article
  Year 2012 Publication 23rd British Machine Vision Conference Abbreviated Journal  
  Volume (down) Issue Pages 67.1- 67.11  
  Keywords  
  Abstract In this paper we propose an unsupervised segmentation-free method for word spotting in document images.
Documents are represented with a grid of HOG descriptors, and a sliding window approach is used to locate the document regions that are most similar to the query. We use the exemplar SVM framework to produce a better representation of the query in an unsupervised way. Finally, the document descriptors are precomputed and compressed with Product Quantization. This offers two advantages: first, a large number of documents can be kept in RAM memory at the same time. Second, the sliding window becomes significantly faster since distances between quantized HOG descriptors can be precomputed. Our results significantly outperform other segmentation-free methods in the literature, both in accuracy and in speed and memory usage.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 1-901725-46-4 Medium  
  Area Expedition Conference BMVC  
  Notes DAG Approved no  
  Call Number DAG @ dag @ AGF2012 Serial 1984  
Permanent link to this record
 

 
Author Dimosthenis Karatzas;Ch. Lioutas edit  openurl
  Title Software Package Development for Electron Diffraction Image Analysis Type Conference Article
  Year 1998 Publication Proceedings of the XIV Solid State Physics National Conference Abbreviated Journal  
  Volume (down) Issue Pages  
  Keywords  
  Abstract  
  Address Ioannina, Greece  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number IAM @ iam @ KaL1998 Serial 2045  
Permanent link to this record
 

 
Author Albert Gordo; Florent Perronnin; Ernest Valveny edit   pdf
doi  isbn
openurl 
  Title Document classification using multiple views Type Conference Article
  Year 2012 Publication 10th IAPR International Workshop on Document Analysis Systems Abbreviated Journal  
  Volume (down) Issue Pages 33-37  
  Keywords  
  Abstract The combination of multiple features or views when representing documents or other kinds of objects usually leads to improved results in classification (and retrieval) tasks. Most systems assume that those views will be available both at training and test time. However, some views may be too `expensive' to be available at test time. In this paper, we consider the use of Canonical Correlation Analysis to leverage `expensive' views that are available only at training time. Experimental results show that this information may significantly improve the results in a classification task.  
  Address Australia  
  Corporate Author Thesis  
  Publisher IEEE Computer Society Washington Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-0-7695-4661-2 Medium  
  Area Expedition Conference DAS  
  Notes DAG Approved no  
  Call Number Admin @ si @ GPV2012 Serial 2049  
Permanent link to this record
 

 
Author Albert Gordo; Jose Antonio Rodriguez; Florent Perronnin; Ernest Valveny edit   pdf
doi  isbn
openurl 
  Title Leveraging category-level labels for instance-level image retrieval Type Conference Article
  Year 2012 Publication 25th IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal  
  Volume (down) Issue Pages 3045-3052  
  Keywords  
  Abstract In this article, we focus on the problem of large-scale instance-level image retrieval. For efficiency reasons, it is common to represent an image by a fixed-length descriptor which is subsequently encoded into a small number of bits. We note that most encoding techniques include an unsupervised dimensionality reduction step. Our goal in this work is to learn a better subspace in a supervised manner. We especially raise the following question: “can category-level labels be used to learn such a subspace?” To answer this question, we experiment with four learning techniques: the first one is based on a metric learning framework, the second one on attribute representations, the third one on Canonical Correlation Analysis (CCA) and the fourth one on Joint Subspace and Classifier Learning (JSCL). While the first three approaches have been applied in the past to the image retrieval problem, we believe we are the first to show the usefulness of JSCL in this context. In our experiments, we use ImageNet as a source of category-level labels and report retrieval results on two standard dataseis: INRIA Holidays and the University of Kentucky benchmark. Our experimental study shows that metric learning and attributes do not lead to any significant improvement in retrieval accuracy, as opposed to CCA and JSCL. As an example, we report on Holidays an increase in accuracy from 39.3% to 48.6% with 32-dimensional representations. Overall JSCL is shown to yield the best results.  
  Address Providence, Rhode Island  
  Corporate Author Thesis  
  Publisher IEEE Xplore Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1063-6919 ISBN 978-1-4673-1226-4 Medium  
  Area Expedition Conference CVPR  
  Notes DAG Approved no  
  Call Number Admin @ si @ GRP2012 Serial 2050  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: