Author Ruben Perez Tito
  Title Exploring the role of Text in Visual Question Answering on Natural Scenes and Documents Type Book Whole
  Year 2023 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Visual Question Answering (VQA) is the task in which, given an image and a natural language question, the objective is to generate a natural language answer. At the intersection of computer vision and natural language processing, this task can be seen as a measure of image understanding capabilities, as it requires reasoning about objects, actions, colors, positions, and the relations between the different elements, as well as commonsense reasoning, world knowledge, arithmetic skills and natural language understanding. However, even though the text present in images conveys important, semantically rich information that is explicit and not available in any other form, most VQA methods have remained illiterate, largely ignoring the text despite its potential significance. In this thesis, we set out on a journey to bring reading capabilities to computer vision models applied to the VQA task, creating new datasets and methods that can read, reason and integrate the text with other visual cues in natural scene images and documents.
In Chapter 3, we address the combination of scene text with visual information to fully understand all the nuances of natural scene images. To achieve this objective, we define a new sub-task of VQA that requires reading the text in the image, and highlight the limitations of the current methods. In addition, we propose a new architecture that integrates both modalities and jointly reasons about textual and visual features. In Chapter 5, we shift the domain of VQA with reading capabilities and apply it to scanned industry document images, providing a high-level, end-purpose perspective on Document Understanding, which has primarily focused on digitizing the document’s contents and extracting key values without considering the ultimate purpose of the extracted information. For this, we create a dataset that requires methods to reason about the unique and challenging elements of documents, such as text, images, tables, graphs and complex layouts, to provide accurate answers in natural language. However, we observed that explicit visual features contribute only slightly to the overall performance, since the main information is usually conveyed by the text and its position. Consequently, in Chapter 6, we propose VQA on infographic images, seeking document images with more visually rich elements that require fully exploiting visual information in order to answer the questions. We show the performance gap of different methods when applied to scanned industry documents and infographic images, and propose a new method that integrates the visual features in early stages, which allows the transformer architecture to exploit the visual features during the self-attention operation. In Chapter 7, by contrast, we apply VQA to a large collection of single-page documents, where the methods must find which documents are relevant to the question and provide the answer itself. Finally, in Chapter 8, mimicking real-world applications where systems must process documents with multiple pages, we address the multipage document visual question answering task. We demonstrate the limitations of existing methods, including models specifically designed to process long sequences. To overcome these limitations, we propose a hierarchical architecture that can process long documents, answer questions, and provide the index of the page where the information needed to answer the question is located as an explainability measure.
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher IMPRIMA Place of Publication Editor Ernest Valveny  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-124793-5-5 Medium  
  Area Expedition Conference (up)  
  Notes DAG Approved no  
  Call Number Admin @ si @ Per2023 Serial 3967  
 

 
Author Angel Sappa (ed)
  Title Computer Graphics and Imaging Type Book Whole
  Year 2010 Publication Computer Graphics and Imaging Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor Angel Sappa  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-0-88986-836-6 Medium
  Area Expedition Conference (up) CGIM  
  Notes ADAS Approved no  
  Call Number ADAS @ adas @ Sap2010 Serial 1468  
 

 
Author Josep Llados
  Title Computer Vision: Progress of Research and Development Type Book Whole
  Year 2006 Publication 1st CVC Internal Workshop Computer Vision: Progress of Research and Development Abbreviated Journal
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor J. Llados (ed.)
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 84-933652-8-9 Medium  
  Area Expedition Conference (up) CVCRD  
  Notes DAG Approved no  
  Call Number DAG @ dag @ Lla2006b Serial 766  
 

 
Author Robert Benavente; Laura Igual; Fernando Vilariño
  Title Current Challenges in Computer Vision Type Book Whole
  Year 2008 Publication Proceedings of the Third Internal Workshop Abbreviated Journal
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-936529-0-6 Medium  
  Area Expedition Conference (up) CVCRD  
  Notes MILAB;CIC;SIAI Approved no  
  Call Number BCNPCL @ bcnpcl @ BIV2008 Serial 1110  
 

 
Author W. Liu; Josep Llados
  Title Graphics Recognition. Ten Years Review and Future Perspectives Type Book Whole
  Year 2006 Publication 6th International Workshop Abbreviated Journal  
  Volume 3926 Issue Pages  
  Keywords  
  Abstract  
  Address Hong Kong (China)  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference (up) GREC  
  Notes DAG Approved no  
  Call Number DAG @ dag @ LiL2006 Serial 800  
 

 
Author Liu Wenyin; Josep Llados; Jean-Marc Ogier
  Title Graphics Recognition. Recent Advances and New Opportunities. Type Book Whole
  Year 2008 Publication 7th International Workshop, Selected Papers Abbreviated Journal
  Volume 5046 Issue Pages  
  Keywords  
  Abstract  
  Address Curitiba (Brazil)  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-540-88184-1 Medium  
  Area Expedition Conference (up) GREC  
  Notes DAG Approved no  
  Call Number DAG @ dag @ WLO2008 Serial 1012  
 

 
Author Jean-Marc Ogier; Wenyin Liu; Josep Llados (eds)
  Title Graphics Recognition: Achievements, Challenges, and Evolution Type Book Whole
  Year 2010 Publication 8th International Workshop GREC 2009. Abbreviated Journal  
  Volume 6020 Issue Pages  
  Keywords  
  Abstract  
  Address La Rochelle  
  Corporate Author Thesis  
  Publisher Springer Link Place of Publication Editor Jean-Marc Ogier; Wenyin Liu; Josep Llados  
  Language Summary Language Original Title  
  Series Editor Series Title Lecture Notes in Computer Science Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-642-13727-3 Medium  
  Area Expedition Conference (up) GREC  
  Notes DAG Approved no  
  Call Number Admin @ si @ OLL2010 Serial 1976  
 

 
Author Juan J. Villanueva
  Title Visualization, Imaging and Image Processing. Type Book Whole
  Year 2002 Publication International Association of Science and Technology for Development. ACTA Press Abbreviated Journal
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 0-88986-354-3 Medium
  Area Expedition Conference (up) IASTE  
  Notes Approved no  
  Call Number ISE @ ise @ Vil2002 Serial 276  
 

 
Author Joan Marti; Jose Miguel Benedi; Ana Maria Mendonça; Joan Serrat
  Title Pattern Recognition and Image Analysis Type Book Whole
  Year 2007 Publication 3rd Iberian Conference Abbreviated Journal  
  Volume 4477-4478 Issue Pages
  Keywords  
  Abstract  
  Address Girona (Spain)  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference (up) IbPRIA  
  Notes ADAS Approved no  
  Call Number ADAS @ adas @ MBM2007 Serial 994  
 

 
Author Jordi Vitria; Joao Sanchez; Miguel Raposo; Mario Hernandez
  Title Pattern Recognition and Image Analysis Type Book Whole
  Year 2011 Publication 5th Iberian Conference Pattern Recognition and Image Analysis Abbreviated Journal  
  Volume 6669 Issue Pages  
  Keywords  
  Abstract  
  Address Las Palmas de Gran Canaria. Spain  
  Corporate Author Thesis  
  Publisher Springer-Verlag Place of Publication Berlin Editor J. Vitrià; J. Sanchez; M. Raposo; M. Hernandez  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-642-2125 Medium  
  Area Expedition Conference (up) IbPRIA  
  Notes OR;MV Approved no  
  Call Number Admin @ si @ VSR2011 Serial 1730  
 

 
Author Josep Llados; Daniel Lopresti; Seiichi Uchida (eds)
  Title 16th International Conference, 2021, Proceedings, Part III Type Book Whole
  Year 2021 Publication Document Analysis and Recognition – ICDAR 2021 Abbreviated Journal  
  Volume 12823 Issue Pages  
  Keywords  
  Abstract This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824 constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland, in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.

The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding.
 
  Address Lausanne, Switzerland, September 5-10, 2021  
  Corporate Author Thesis  
  Publisher Springer Cham Place of Publication Editor Josep Llados; Daniel Lopresti; Seiichi Uchida  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-030-86333-3 Medium  
  Area Expedition Conference (up) ICDAR  
  Notes DAG Approved no  
  Call Number Admin @ si @ Serial 3727  
 

 
Author Josep Llados; Daniel Lopresti; Seiichi Uchida (eds)
  Title 16th International Conference, 2021, Proceedings, Part IV Type Book Whole
  Year 2021 Publication Document Analysis and Recognition – ICDAR 2021 Abbreviated Journal  
  Volume 12824 Issue Pages  
  Keywords  
  Abstract This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824 constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland, in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.

The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding.
 
  Address Lausanne, Switzerland, September 5-10, 2021  
  Corporate Author Thesis  
  Publisher Springer Cham Place of Publication Editor Josep Llados; Daniel Lopresti; Seiichi Uchida  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-030-86336-4 Medium  
  Area Expedition Conference (up) ICDAR  
  Notes DAG Approved no  
  Call Number Admin @ si @ Serial 3728  
 

 
Author Josep Llados; Daniel Lopresti; Seiichi Uchida (eds)
  Title 16th International Conference, 2021, Proceedings, Part I Type Book Whole
  Year 2021 Publication Document Analysis and Recognition – ICDAR 2021 Abbreviated Journal  
  Volume 12821 Issue Pages  
  Keywords  
  Abstract This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824 constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland, in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.

The papers are organized into the following topical sections: historical document analysis, document analysis systems, handwriting recognition, scene text detection and recognition, document image processing, natural language processing (NLP) for document understanding, and graphics, diagram and math recognition.
 
  Address Lausanne, Switzerland, September 5-10, 2021  
  Corporate Author Thesis  
  Publisher Springer Cham Place of Publication Editor Josep Llados; Daniel Lopresti; Seiichi Uchida  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-030-86548-1 Medium  
  Area Expedition Conference (up) ICDAR  
  Notes DAG Approved no  
  Call Number Admin @ si @ Serial 3725  
 

 
Author Josep Llados; Daniel Lopresti; Seiichi Uchida (eds)
  Title 16th International Conference, 2021, Proceedings, Part II Type Book Whole
  Year 2021 Publication Document Analysis and Recognition – ICDAR 2021 Abbreviated Journal  
  Volume 12822 Issue Pages  
  Keywords  
  Abstract This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824 constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland, in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.

The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding.
 
  Address Lausanne, Switzerland, September 5-10, 2021  
  Corporate Author Thesis  
  Publisher Springer Cham Place of Publication Editor Josep Llados; Daniel Lopresti; Seiichi Uchida  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-030-86330-2 Medium  
  Area Expedition Conference (up) ICDAR  
  Notes DAG Approved no  
  Call Number Admin @ si @ Serial 3726  
 

 
Author Mickael Coustaty; Alicia Fornes
  Title Document Analysis and Recognition – ICDAR 2023 Workshops Type Book Whole
  Year 2023 Publication Document Analysis and Recognition – ICDAR 2023 Workshops Abbreviated Journal  
  Volume 14194 Issue 2 Pages  
  Keywords  
  Abstract  
  Address San Jose; USA; August 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference (up) ICDAR  
  Notes DAG Approved no  
  Call Number Admin @ si @ CoF2023 Serial 3852  