|
Records |
Links |
|
Author |
Ruben Tito; Dimosthenis Karatzas; Ernest Valveny |
|
|
Title |
Hierarchical multimodal transformers for Multi-Page DocVQA |
Type |
Journal Article |
|
Year |
2023 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
144 |
Issue |
|
Pages |
109834 |
|
|
Keywords |
|
|
|
Abstract |
Document Visual Question Answering (DocVQA) refers to the task of answering questions from document images. Existing work on DocVQA only considers single-page documents. However, in real scenarios documents are mostly composed of multiple pages that should be processed together. In this work we extend DocVQA to the multi-page scenario. To that end, we first create a new dataset, MP-DocVQA, where questions are posed over multi-page documents instead of single pages. Second, we propose a new hierarchical method, Hi-VT5, based on the T5 architecture, that overcomes the limitations of current methods in processing long multi-page documents. The proposed method is based on a hierarchical transformer architecture where the encoder summarizes the most relevant information of every page and the decoder then takes this summarized information to generate the final answer. Through extensive experimentation, we demonstrate that our method is able, in a single stage, to answer the questions and provide the page that contains the relevant information to find the answer, which can be used as a kind of explainability measure. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0031-3203 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.155; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ TKV2023 |
Serial |
3825 |
|
|
|
|
|
Author |
Pau Riba; Josep Llados; Alicia Fornes |
|
|
Title |
Hierarchical graphs for coarse-to-fine error tolerant matching |
Type |
Journal Article |
|
Year |
2020 |
Publication |
Pattern Recognition Letters |
Abbreviated Journal |
PRL |
|
|
Volume |
134 |
Issue |
|
Pages |
116-124 |
|
|
Keywords |
Hierarchical graph representation; Coarse-to-fine graph matching; Graph-based retrieval |
|
|
Abstract |
In recent years, graph-based representations have seen growing use in visual recognition and retrieval due to their ability to capture both structural and appearance-based information. Thus, they provide greater representational power than classical statistical frameworks. However, graph-based representations lead to high computational complexity, usually dealt with through graph embeddings or approximate matching techniques. Despite their representational power, they are very sensitive to noise and small variations of the input image. With the aim of coping with the time complexity and the variability present in the generated graphs, in this paper we propose to construct a novel hierarchical graph representation. Graph clustering techniques adapted from social media analysis have been used to contract a graph at different abstraction levels while keeping information about the topology. Abstract node attributes summarise information about the contracted graph partition. For the proposed representations, a coarse-to-fine matching technique is defined. Hence, small graphs are used as a filter before more accurate matching methods are applied. This approach has been validated in real scenarios such as classification of colour images or retrieval of handwritten words (i.e. word spotting). |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.097; 601.302; 603.057; 600.140; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RLF2020 |
Serial |
3349 |
|
|
|
|
|
Author |
Josep Llados; Enric Marti |
|
|
Title |
Graph-edit algorithms for hand-drawn graphical document recognition and their automatic introduction |
Type |
Journal Article |
|
Year |
1999 |
Publication |
Machine Graphics & Vision journal, special issue on Graph transformation |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; IAM
Approved |
no |
|
|
Call Number |
IAM @ iam @ LIM1999c |
Serial |
1569 |
|
|
|
|
|
Author |
Josep Llados; Gemma Sanchez |
|
|
Title |
Graph Matching vs. Graph Parsing in Graphics Recognition: A Combined Approach |
Type |
Journal Article |
|
Year |
2004 |
Publication |
International Journal of Pattern Recognition and Artificial Intelligence |
Abbreviated Journal |
IJPRAI |
|
|
Volume |
18 |
Issue |
3 |
Pages |
455–473 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; IF: 0.588 |
Approved |
no |
|
|
Call Number |
DAG @ dag @ LlS2004 |
Serial |
445 |
|
|
|
|
|
Author |
Jaume Gibert; Ernest Valveny; Horst Bunke |
|
|
Title |
Graph Embedding in Vector Spaces by Node Attribute Statistics |
Type |
Journal Article |
|
Year |
2012 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
45 |
Issue |
9 |
Pages |
3072-3083 |
|
|
Keywords |
Structural pattern recognition; Graph embedding; Data clustering; Graph classification |
|
|
Abstract |
Graph-based representations are of broad use and applicability in pattern recognition. They exhibit, however, a major drawback with regards to the processing tools that are available in their domain. Graph embedding into vector spaces is a growing field among the structural pattern recognition community which aims at providing a feature vector representation for every graph, and thus enables classical statistical learning machinery to be used on graph-based input patterns. In this work, we propose a novel embedding methodology for graphs with continuous node attributes and unattributed edges. The approach presented in this paper is based on statistics of the node labels and the edges between them, based on their similarity to a set of representatives. We specifically deal with an important issue of this methodology, namely, the selection of a suitable set of representatives. In an experimental evaluation, we empirically show the advantages of this novel approach in the context of different classification problems using several databases of graphs. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0031-3203 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ GVB2012a |
Serial |
1992 |
|