TY  - CONF
AU  - Ruben Tito
AU  - Dimosthenis Karatzas
AU  - Ernest Valveny
A2  - ICDAR
PY  - 2021//
TI  - Document Collection Visual Question Answering
T2  - LNCS
BT  - 16th International Conference on Document Analysis and Recognition
SP  - 778
EP  - 792
VL  - 12822
KW  - Document collection
KW  - Visual Question Answering
N2  - Current tasks and methods in Document Understanding aim to process documents as single elements. However, documents are usually organized in collections (historical records, purchase invoices) that provide context useful for their interpretation. To address this problem, we introduce Document Collection Visual Question Answering (DocCVQA), a new dataset and related task, where questions are posed over a whole collection of document images and the goal is not only to provide the answer to the given question, but also to retrieve the set of documents that contain the information needed to infer the answer. Along with the dataset, we propose a new evaluation metric and baselines which provide further insights into the new dataset and task.
UR  - https://link.springer.com/chapter/10.1007/978-3-030-86331-9_50
L1  - http://refbase.cvc.uab.es/files/TKV2021.pdf
N1  - DAG; 600.121
ID  - Ruben Tito2021
ER  -