|
Records |
Links |
|
Author |
Marçal Rusiñol; J. Chazalon; Jean-Marc Ogier; Josep Llados |
![download PDF file pdf](http://refbase.cvc.uab.es/img/file_PDF.gif)
![find record details (via OpenURL) openurl](http://refbase.cvc.uab.es/img/xref.gif)
|
|
Title ![sorted by Title field, descending order (down)](http://refbase.cvc.uab.es/img/sort_desc.gif) |
A Comparative Study of Local Detectors and Descriptors for Mobile Document Classification |
Type |
Conference Article |
|
Year |
2015 |
Publication |
13th International Conference on Document Analysis and Recognition ICDAR2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
596-600 |
|
|
Keywords |
|
|
|
Abstract |
In this paper we conduct a comparative study of local key-point detectors and local descriptors for the specific task of mobile document classification. A classification architecture based on direct matching of local descriptors is used as baseline for the comparative study. A set of four different key-point
detectors and four different local descriptors are tested in all the possible combinations. The experiments are conducted in a database consisting of 30 model documents acquired on 6 different backgrounds, totaling more than 36.000 test images. |
|
|
Address |
Nancy; France; August 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG; 600.084; 600.61; 601.223; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RCO2015 |
Serial |
2684 |
|
Permanent link to this record |
|
|
|
|
Author |
Alicia Fornes; Josep Llados; Gemma Sanchez; Xavier Otazu; Horst Bunke |
![goto web page (via DOI) doi](http://refbase.cvc.uab.es/img/doi.gif)
|
|
Title ![sorted by Title field, descending order (down)](http://refbase.cvc.uab.es/img/sort_desc.gif) |
A Combination of Features for Symbol-Independent Writer Identification in Old Music Scores |
Type |
Journal Article |
|
Year |
2010 |
Publication |
International Journal on Document Analysis and Recognition |
Abbreviated Journal |
IJDAR |
|
|
Volume |
13 |
Issue |
4 |
Pages |
243-259 |
|
|
Keywords |
|
|
|
Abstract |
The aim of writer identification is determining the writer of a piece of handwriting from a set of writers. In this paper, we present an architecture for writer identification in old handwritten music scores. Even though an important amount of music compositions contain handwritten text, the aim of our work is to use only music notation to determine the author. The main contribution is therefore the use of features extracted from graphical alphabets. Our proposal consists in combining the identification results of two different approaches, based on line and textural features. The steps of the ensemble architecture are the following. First of all, the music sheet is preprocessed for removing the staff lines. Then, music lines and texture images are generated for computing line features and textural features. Finally, the classification results are combined for identifying the writer. The proposed method has been tested on a database of old music scores from the seventeenth to nineteenth centuries, achieving a recognition rate of about 92% with 20 writers. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer-Verlag |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1433-2833 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; CAT;CIC |
Approved |
no |
|
|
Call Number |
FLS2010b |
Serial |
1319 |
|
Permanent link to this record |
|
|
|
|
Author |
P. Wang; V. Eglin; C. Garcia; C. Largeron; Josep Llados; Alicia Fornes |
![download PDF file pdf](http://refbase.cvc.uab.es/img/file_PDF.gif)
![find record details (via OpenURL) openurl](http://refbase.cvc.uab.es/img/xref.gif)
|
|
Title ![sorted by Title field, descending order (down)](http://refbase.cvc.uab.es/img/sort_desc.gif) |
A Coarse-to-Fine Word Spotting Approach for Historical Handwritten Documents Based on Graph Embedding and Graph Edit Distance |
Type |
Conference Article |
|
Year |
2014 |
Publication |
22nd International Conference on Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
3074 - 3079 |
|
|
Keywords |
word spotting; coarse-to-fine mechamism; graphbased representation; graph embedding; graph edit distance |
|
|
Abstract |
Effective information retrieval on handwritten document images has always been a challenging task, especially historical ones. In the paper, we propose a coarse-to-fine handwritten word spotting approach based on graph representation. The presented model comprises both the topological and morphological signatures of the handwriting. Skeleton-based graphs with the Shape Context labelled vertexes are established for connected components. Each word image is represented as a sequence of graphs. Aiming at developing a practical and efficient word spotting approach for large-scale historical handwritten documents, a fast and coarse comparison is first applied to prune the regions that are not similar to the query based on the graph embedding methodology. Afterwards, the query and regions of interest are compared by graph edit distance based on the Dynamic Time Warping alignment. The proposed approach is evaluated on a public dataset containing 50 pages of historical marriage license records. The results show that the proposed approach achieves a compromise between efficiency and accuracy. |
|
|
Address |
Stockholm; Sweden; August 2014 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1051-4651 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICPR |
|
|
Notes |
DAG; 600.061; 602.006; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ WEG2014a |
Serial |
2515 |
|
Permanent link to this record |
|
|
|
|
Author |
Jon Almazan; David Fernandez; Alicia Fornes; Josep Llados; Ernest Valveny |
![download PDF file pdf](http://refbase.cvc.uab.es/img/file_PDF.gif)
![find book details (via ISBN) isbn](http://refbase.cvc.uab.es/img/isbn.gif)
|
|
Title ![sorted by Title field, descending order (down)](http://refbase.cvc.uab.es/img/sort_desc.gif) |
A Coarse-to-Fine Approach for Handwritten Word Spotting in Large Scale Historical Documents Collection |
Type |
Conference Article |
|
Year |
2012 |
Publication |
13th International Conference on Frontiers in Handwriting Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
453-458 |
|
|
Keywords |
|
|
|
Abstract |
In this paper we propose an approach for word spotting in handwritten document images. We state the problem from a focused retrieval perspective, i.e. locating instances of a query word in a large scale dataset of digitized manuscripts. We combine two approaches, namely one based on word segmentation and another one segmentation-free. The first approach uses a hashing strategy to coarsely prune word images that are unlikely to be instances of the query word. This process is fast but has a low precision due to the errors introduced in the segmentation step. The regions containing candidate words are sent to the second process based on a state of the art technique from the visual object detection field. This discriminative model represents the appearance of the query word and computes a similarity score. In this way we propose a coarse-to-fine approach achieving a compromise between efficiency and accuracy. The validation of the model is shown using a collection of old handwritten manuscripts. We appreciate a substantial improvement in terms of precision regarding the previous proposed method with a low computational cost increase. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-1-4673-2262-1 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICFHR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ AFF2012 |
Serial |
1983 |
|
Permanent link to this record |
|
|
|
|
Author |
Josep Llados; Ernest Valveny; Gemma Sanchez; Enric Marti |
![goto web page url](http://refbase.cvc.uab.es/img/www.gif)
![find record details (via OpenURL) openurl](http://refbase.cvc.uab.es/img/xref.gif)
|
|
Title ![sorted by Title field, descending order (down)](http://refbase.cvc.uab.es/img/sort_desc.gif) |
A Case Study of Pattern Recognition: Symbol Recognition in Graphic Documentsa |
Type |
Conference Article |
|
Year |
2003 |
Publication |
Proceedings of Pattern Recognition in Information Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1-13 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Angers, France |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
ICEIS Press |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
972-98816-3-4 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
PRIS'03 |
|
|
Notes |
DAG;IAM; |
Approved |
no |
|
|
Call Number |
IAM @ iam @ LVS2003 |
Serial |
1576 |
|
Permanent link to this record |
|
|
|
|
Author |
Ali Furkan Biten |
![find book details (via ISBN) isbn](http://refbase.cvc.uab.es/img/isbn.gif)
|
|
Title ![sorted by Title field, descending order (down)](http://refbase.cvc.uab.es/img/sort_desc.gif) |
A Bitter-Sweet Symphony on Vision and Language: Bias and World Knowledge |
Type |
Book Whole |
|
Year |
2022 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Vision and Language are broadly regarded as cornerstones of intelligence. Even though language and vision have different aims – language having the purpose of communication, transmission of information and vision having the purpose of constructing mental representations around us to navigate and interact with objects – they cooperate and depend on one another in many tasks we perform effortlessly. This reliance is actively being studied in various Computer Vision tasks, e.g. image captioning, visual question answering, image-sentence retrieval, phrase grounding, just to name a few. All of these tasks share the inherent difficulty of the aligning the two modalities, while being robust to language
priors and various biases existing in the datasets. One of the ultimate goal for vision and language research is to be able to inject world knowledge while getting rid of the biases that come with the datasets. In this thesis, we mainly focus on two vision and language tasks, namely Image Captioning and Scene-Text Visual Question Answering (STVQA).
In both domains, we start by defining a new task that requires the utilization of world knowledge and in both tasks, we find that the models commonly employed are prone to biases that exist in the data. Concretely, we introduce new tasks and discover several problems that impede performance at each level and provide remedies or possible solutions in each chapter: i) We define a new task to move beyond Image Captioning to Image Interpretation that can utilize Named Entities in the form of world knowledge. ii) We study the object hallucination problem in classic Image Captioning systems and develop an architecture-agnostic solution. iii) We define a sub-task of Visual Question Answering that requires reading the text in the image (STVQA), where we highlight the limitations of current models. iv) We propose an architecture for the STVQA task that can point to the answer in the image and show how to combine it with classic VQA models. v) We show how far language can get us in STVQA and discover yet another bias which causes the models to disregard the image while doing Visual Question Answering. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
IMPRIMA |
Place of Publication |
|
Editor |
Dimosthenis Karatzas;Lluis Gomez |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-84-124793-5-5 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ Bit2022 |
Serial |
3755 |
|
Permanent link to this record |
|
|
|
|
Author |
Alicia Fornes; Josep Llados; Joan Mas; Joana Maria Pujadas-Mora; Anna Cabre |
![download PDF file pdf](http://refbase.cvc.uab.es/img/file_PDF.gif)
![find book details (via ISBN) isbn](http://refbase.cvc.uab.es/img/isbn.gif)
|
|
Title ![sorted by Title field, descending order (down)](http://refbase.cvc.uab.es/img/sort_desc.gif) |
A Bimodal Crowdsourcing Platform for Demographic Historical Manuscripts |
Type |
Conference Article |
|
Year |
2014 |
Publication |
Digital Access to Textual Cultural Heritage Conference |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
103-108 |
|
|
Keywords |
|
|
|
Abstract |
In this paper we present a crowdsourcing web-based application for extracting information from demographic handwritten document images. The proposed application integrates two points of view: the semantic information for demographic research, and the ground-truthing for document analysis research. Concretely, the application has the contents view, where the information is recorded into forms, and the labeling view, with the word labels for evaluating document analysis techniques. The crowdsourcing architecture allows to accelerate the information extraction (many users can work simultaneously), validate the information, and easily provide feedback to the users. We finally show how the proposed application can be extended to other kind of demographic historical manuscripts. |
|
|
Address |
Madrid; May 2014 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-1-4503-2588-2 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DATeCH |
|
|
Notes |
DAG; 600.061; 602.006; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ FLM2014 |
Serial |
2516 |
|
Permanent link to this record |
|
|
|
|
Author |
Anjan Dutta; Josep Llados; Umapada Pal |
![goto web page (via DOI) doi](http://refbase.cvc.uab.es/img/doi.gif)
![find record details (via OpenURL) openurl](http://refbase.cvc.uab.es/img/xref.gif)
|
|
Title ![sorted by Title field, descending order (down)](http://refbase.cvc.uab.es/img/sort_desc.gif) |
A Bag-of-Paths Based Serialized Subgraph Matching for Symbol Spotting in Line Drawings |
Type |
Conference Article |
|
Year |
2011 |
Publication |
5th Iberian Conference on Pattern Recognition and Image Analysis |
Abbreviated Journal |
|
|
|
Volume |
6669 |
Issue |
|
Pages |
620-627 |
|
|
Keywords |
|
|
|
Abstract |
In this paper we propose an error tolerant subgraph matching algorithm based on bag-of-paths for solving the problem of symbol spotting in line drawings. Bag-of-paths is a factorized representation of graphs where the factorization is done by considering all the acyclic paths between each pair of connected nodes. Similar paths within the whole collection of documents are clustered and organized in a lookup table for efficient indexing. The lookup table contains the index key of each cluster and the corresponding list of locations as a single entry. The mean path of each of the clusters serves as the index key for each table entry. The spotting method is then formulated by a spatial voting scheme to the list of locations of the paths that are decided in terms of search of similar paths that compose the query symbol. Efficient indexing of common substructures helps to reduce the computational burden of usual graph based methods. The proposed method can also be seen as a way to serialize graphs which allows to reduce the complexity of the subgraph isomorphism. We have encoded the paths in terms of both attributed strings and turning functions, and presented a comparative results between them within the symbol spotting framework. Experimentations for matching different shape silhouettes are also reported and the method has been proved to work in noisy environment also. |
|
|
Address |
Las Palmas de Gran Canaria. Spain |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
Berlin |
Editor |
Jordi Vitria; Joao Miguel Raposo; Mario Hernandez |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-642-21256-7 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IbPRIA |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ DLP2011a |
Serial |
1738 |
|
Permanent link to this record |
|
|
|
|
Author |
Albert Gordo; Florent Perronnin |
![goto web page (via DOI) doi](http://refbase.cvc.uab.es/img/doi.gif)
![find record details (via OpenURL) openurl](http://refbase.cvc.uab.es/img/xref.gif)
|
|
Title ![sorted by Title field, descending order (down)](http://refbase.cvc.uab.es/img/sort_desc.gif) |
A Bag-of-Pages Approach to Unordered Multi-Page Document Classification |
Type |
Conference Article |
|
Year |
2010 |
Publication |
20th International Conference on Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1920–1923 |
|
|
Keywords |
|
|
|
Abstract |
We consider the problem of classifying documents containing multiple unordered pages. For this purpose, we propose a novel bag-of-pages document representation. To represent a document, one assigns every page to a prototype in a codebook of pages. This leads to a histogram representation which can then be fed to any discriminative classifier. We also consider several refinements over this initial approach. We show on two challenging datasets that the proposed approach significantly outperforms a baseline system. |
|
|
Address |
Istanbul (Turkey) |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1051-4651 |
ISBN |
978-1-4244-7542-1 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICPR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ GoP2010 |
Serial |
1480 |
|
Permanent link to this record |
|
|
|
|
Author |
Albert Gordo; Alicia Fornes; Ernest Valveny; Josep Llados |
![goto web page (via DOI) doi](http://refbase.cvc.uab.es/img/doi.gif)
![find record details (via OpenURL) openurl](http://refbase.cvc.uab.es/img/xref.gif)
|
|
Title ![sorted by Title field, descending order (down)](http://refbase.cvc.uab.es/img/sort_desc.gif) |
A Bag of Notes Approach to Writer Identification in Old Handwritten Music Scores |
Type |
Conference Article |
|
Year |
2010 |
Publication |
9th IAPR International Workshop on Document Analysis Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
247–254 |
|
|
Keywords |
|
|
|
Abstract |
Determining the authorship of a document, namely writer identification, can be an important source of information for document categorization. Contrary to text documents, the identification of the writer of graphical documents is still a challenge. In this paper we present a robust approach for writer identification in a particular kind of graphical documents, old music scores. This approach adapts the bag of visual terms method for coping with graphic documents. The identification is performed only using the graphical music notation. For this purpose, we generate a graphic vocabulary without recognizing any music symbols, and consequently, avoiding the difficulties in the recognition of hand-drawn symbols in old and degraded documents. The proposed method has been tested on a database of old music scores from the 17th to 19th centuries, achieving very high identification rates. |
|
|
Address |
Boston; USA; |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-1-60558-773-8 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ GFV2010 |
Serial |
1320 |
|
Permanent link to this record |