|
|
Author |
Albert Gordo |

|
|
Title |
Document Image Representation, Classification and Retrieval in Large-Scale Domains |
Type |
Book Whole |
|
Year |
2013 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Despite the “paperless office” ideal that emerged in the 1970s, businesses still struggle with a growing volume of paper documentation. Companies receive huge amounts of paper documents that need to be analyzed and processed, mostly by hand. One solution is, first, to scan the incoming documents automatically. The document images can then be analyzed and information extracted from them. Documents can also be automatically dispatched to the appropriate workflows, used to retrieve similar documents in the dataset in order to transfer information, etc.
Due to the nature of this “digital mailroom”, we need document representation methods to be general, i.e., able to cope with very different types of documents. We need the methods to be sound, i.e., able to cope with unexpected types of documents, noise, etc. And we need the methods to be scalable, i.e., able to cope with thousands or millions of documents that need to be processed, stored, and consulted. Unfortunately, current techniques for document representation, classification and retrieval are not suited to this digital mailroom framework, since they fail to fulfill some or all of these requirements.
In this thesis we focus on the problem of document representation aimed at classification and retrieval tasks under this digital mailroom framework. We first propose a novel document representation based on runlength histograms, and extend it to cope with more complex documents, such as multiple-page documents or documents that contain additional sources of information such as extracted OCR text. We then focus on the scalability requirements and propose a novel binarization method, which we dub PCAE, as well as two general asymmetric distances between binary embeddings that can significantly improve the retrieval results at minimal extra computational cost. Finally, we note the importance of supervised learning when performing large-scale retrieval, and study several approaches that can significantly boost the results at no extra cost at query time. |
|
|
Address |
Barcelona |
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
Ernest Valveny; Florent Perronnin |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN  |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ Gor2013 |
Serial |
2277 |
|
Permanent link to this record |
|
|
|
|
Author |
Muhammad Muzzamil Luqman; Jean-Yves Ramel; Josep Llados |


|
|
Title |
Multilevel Analysis of Attributed Graphs for Explicit Graph Embedding in Vector Spaces |
Type |
Book Chapter |
|
Year |
2013 |
Publication |
Graph Embedding for Pattern Analysis |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1-26 |
|
|
Keywords |
|
|
|
Abstract |
The ability to recognize patterns is among the most crucial capabilities of human beings for their survival. It enables them to employ their sophisticated neural and cognitive systems [1] for processing complex audio, visual, smell, touch, and taste signals. Man is the most complex and the best existing system of pattern recognition. Without any explicit thinking, we continuously compare, classify, and identify huge amounts of signal data every day [2], from the time we get up in the morning until the last second before we fall asleep. This includes recognizing the face of a friend in a crowd, a spoken word embedded in noise, the proper key to lock the door, the smell of coffee, the voice of a favorite singer, alphabetic characters, and millions of other tasks that we perform on a regular basis. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer New York |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN  |
|
ISBN |
978-1-4614-4456-5 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ LRL2013b |
Serial |
2271 |
|
Permanent link to this record |
|
|
|
|
Author |
Jean-Marc Ogier; Wenyin Liu; Josep Llados (eds) |

|
|
Title |
Graphics Recognition: Achievements, Challenges, and Evolution |
Type |
Book Whole |
|
Year |
2010 |
Publication |
8th International Workshop GREC 2009. |
Abbreviated Journal |
|
|
|
Volume |
6020 |
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
La Rochelle |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Link |
Place of Publication |
|
Editor |
Jean-Marc Ogier; Wenyin Liu; Josep Llados |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
Lecture Notes in Computer Science |
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN  |
|
ISBN |
978-3-642-13727-3 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
GREC |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ OLL2010 |
Serial |
1976 |
|
Permanent link to this record |
|
|
|
|
Author |
Marçal Rusiñol; R.Roset; Josep Llados; C.Montaner |

|
|
Title |
Automatic Index Generation of Digitized Map Series by Coordinate Extraction and Interpretation |
Type |
Conference Article |
|
Year |
2011 |
Publication |
In Proceedings of the Sixth International Workshop on Digital Technologies in Cartographic Heritage |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN  |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CartoHerit |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ RRL2011b |
Serial |
1978 |
|
Permanent link to this record |
|
|
|
|
Author |
Jon Almazan; David Fernandez; Alicia Fornes; Josep Llados; Ernest Valveny |


|
|
Title |
A Coarse-to-Fine Approach for Handwritten Word Spotting in Large Scale Historical Documents Collection |
Type |
Conference Article |
|
Year |
2012 |
Publication |
13th International Conference on Frontiers in Handwriting Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
453-458 |
|
|
Keywords |
|
|
|
Abstract |
In this paper we propose an approach for word spotting in handwritten document images. We state the problem from a focused retrieval perspective, i.e., locating instances of a query word in a large-scale dataset of digitized manuscripts. We combine two approaches, one based on word segmentation and another that is segmentation-free. The first approach uses a hashing strategy to coarsely prune word images that are unlikely to be instances of the query word. This process is fast but has low precision due to the errors introduced in the segmentation step. The regions containing candidate words are sent to the second process, which is based on a state-of-the-art technique from the visual object detection field. This discriminative model represents the appearance of the query word and computes a similarity score. In this way we propose a coarse-to-fine approach that achieves a compromise between efficiency and accuracy. The model is validated on a collection of old handwritten manuscripts. We observe a substantial improvement in precision over the previously proposed method, with only a small increase in computational cost. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN  |
|
ISBN |
978-1-4673-2262-1 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICFHR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ AFF2012 |
Serial |
1983 |
|
Permanent link to this record |
|
|
|
|
Author |
Jon Almazan; Albert Gordo; Alicia Fornes; Ernest Valveny |


|
|
Title |
Efficient Exemplar Word Spotting |
Type |
Conference Article |
|
Year |
2012 |
Publication |
23rd British Machine Vision Conference |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
67.1-67.11 |
|
|
Keywords |
|
|
|
Abstract |
In this paper we propose an unsupervised segmentation-free method for word spotting in document images. Documents are represented with a grid of HOG descriptors, and a sliding window approach is used to locate the document regions that are most similar to the query. We use the exemplar SVM framework to produce a better representation of the query in an unsupervised way. Finally, the document descriptors are precomputed and compressed with Product Quantization. This offers two advantages: first, a large number of documents can be kept in RAM at the same time; second, the sliding window becomes significantly faster, since distances between quantized HOG descriptors can be precomputed. Our results significantly outperform other segmentation-free methods in the literature in accuracy, speed, and memory usage. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN  |
|
ISBN |
1-901725-46-4 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
BMVC |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ AGF2012 |
Serial |
1984 |
|
Permanent link to this record |
|
|
|
|
Author |
Jaume Gibert; Ernest Valveny; Horst Bunke |


|
|
Title |
Feature Selection on Node Statistics Based Embedding of Graphs |
Type |
Journal Article |
|
Year |
2012 |
Publication |
Pattern Recognition Letters |
Abbreviated Journal |
PRL |
|
|
Volume |
33 |
Issue |
15 |
Pages |
1980–1990 |
|
|
Keywords |
Structural pattern recognition; Graph embedding; Feature ranking; PCA; Graph classification |
|
|
Abstract |
Representing a graph with a feature vector is a common way of making statistical machine learning algorithms applicable to the domain of graphs. Such a transition from graphs to vectors is known as graph embedding. A key issue in graph embedding is to select a proper set of features in order to make the vectorial representation of graphs as strong and discriminative as possible. In this article, we propose features that are constructed out of frequencies of node label representatives. We first build a large set of features and then select the most discriminative ones according to different ranking criteria and feature transformation algorithms. On different classification tasks, we experimentally show that only a small significant subset of these features is needed to achieve the same classification rates as competing state-of-the-art methods. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN  |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ GVB2012b |
Serial |
1993 |
|
Permanent link to this record |
|
|
|
|
Author |
Sophie Wuerger; Kaida Xiao; Dimitris Mylonas; Q. Huang; Dimosthenis Karatzas; Galina Paramei |


|
|
Title |
Blue-green color categorization in Mandarin-English speakers |
Type |
Journal Article |
|
Year |
2012 |
Publication |
Journal of the Optical Society of America A |
Abbreviated Journal |
JOSA A |
|
|
Volume |
29 |
Issue |
2 |
Pages |
A102-A107 |
|
|
Keywords |
|
|
|
Abstract |
Observers are faster to detect a target among a set of distracters if the targets and distracters come from different color categories. This cross-boundary advantage seems to be limited to the right visual field, which is consistent with the dominance of the left hemisphere for language processing [Gilbert et al., Proc. Natl. Acad. Sci. USA 103, 489 (2006)]. Here we study whether a similar visual field advantage is found in the color identification task in speakers of Mandarin, a language that uses a logographic system. Forty late Mandarin-English bilinguals performed a blue-green color categorization task, in a blocked design, in their first language (L1: Mandarin) or second language (L2: English). Eleven color singletons ranging from blue to green were presented for 160 ms, randomly in the left visual field (LVF) or right visual field (RVF). Color boundary and reaction times (RTs) at the color boundary were estimated in L1 and L2, for both visual fields. We found that the color boundary did not differ between the languages; RTs at the color boundary, however, were on average more than 100 ms shorter in the English compared to the Mandarin sessions, but only when the stimuli were presented in the RVF. The finding may be explained by the script nature of the two languages: Mandarin logographic characters are analyzed visuospatially in the right hemisphere, which conceivably facilitates identification of color presented to the LVF. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN  |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ WXM2012 |
Serial |
2007 |
|
Permanent link to this record |
|
|
|
|
Author |
Dimosthenis Karatzas; Ch. Lioutas

|
|
Title |
Software Package Development for Electron Diffraction Image Analysis |
Type |
Conference Article |
|
Year |
1998 |
Publication |
Proceedings of the XIV Solid State Physics National Conference |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Ioannina, Greece |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN  |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
IAM @ iam @ KaL1998 |
Serial |
2045 |
|
Permanent link to this record |
|
|
|
|
Author |
Albert Gordo; Florent Perronnin; Ernest Valveny |


|
|
Title |
Document classification using multiple views |
Type |
Conference Article |
|
Year |
2012 |
Publication |
10th IAPR International Workshop on Document Analysis Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
33-37 |
|
|
Keywords |
|
|
|
Abstract |
The combination of multiple features or views when representing documents or other kinds of objects usually leads to improved results in classification (and retrieval) tasks. Most systems assume that those views will be available both at training and test time. However, some views may be too ‘expensive’ to be available at test time. In this paper, we consider the use of Canonical Correlation Analysis to leverage ‘expensive’ views that are available only at training time. Experimental results show that this information may significantly improve the results in a classification task. |
|
|
Address |
Australia |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
IEEE Computer Society Washington |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN  |
|
ISBN |
978-0-7695-4661-2 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ GPV2012 |
Serial |
2049 |
|
Permanent link to this record |