|
Records |
Links |
|
Author |
Jordy Van Landeghem; Ruben Tito; Lukasz Borchmann; Michal Pietruszka; Pawel Joziak; Rafal Powalski; Dawid Jurkiewicz; Mickael Coustaty; Bertrand Anckaert; Ernest Valveny; Matthew Blaschko; Sien Moens; Tomasz Stanislawek |
|
|
Title |
Document Understanding Dataset and Evaluation (DUDE) |
Type |
Conference Article |
|
Year |
2023 |
Publication |
20th IEEE International Conference on Computer Vision |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
19528-19540 |
|
|
Keywords |
|
|
|
Abstract |
We call on the Document AI (DocAI) community to re-evaluate current methodologies and embrace the challenge of creating more practically-oriented benchmarks. Document Understanding Dataset and Evaluation (DUDE) seeks to remediate the halted research progress in understanding visually-rich documents (VRDs). We present a new dataset with novelties related to types of questions, answers, and document layouts based on multi-industry, multi-domain, and multi-page VRDs of various origins and dates. Moreover, we are pushing the boundaries of current methods by creating multi-task and multi-domain evaluation setups that more accurately simulate real-world situations where powerful generalization and adaptation under low-resource settings are desired. DUDE aims to set a new standard as a more practical, long-standing benchmark for the community, and we hope that it will lead to future extensions and contributions that address real-world challenges. Finally, our work illustrates the importance of finding more efficient ways to model language, images, and layout in DocAI. |
|
|
Address |
Paris; France; October 2023 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICCV |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ LTB2023 |
Serial |
3948 |
|
Permanent link to this record |
|
|
|
|
Author |
Lluis Gomez; Dimosthenis Karatzas |
|
|
Title |
Scene Text Recognition: No Country for Old Men? |
Type |
Conference Article |
|
Year |
2014 |
Publication |
1st International Workshop on Robust Reading |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IWRR |
|
|
Notes |
DAG; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GoK2014c |
Serial |
2538 |
|
Permanent link to this record |
|
|
|
|
Author |
Arnau Baro; Pau Riba; Alicia Fornes |
|
|
Title |
A Starting Point for Handwritten Music Recognition |
Type |
Conference Article |
|
Year |
2018 |
Publication |
1st International Workshop on Reading Music Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
5-6 |
|
|
Keywords |
Optical Music Recognition; Long Short-Term Memory; Convolutional Neural Networks; MUSCIMA++; CVCMUSCIMA |
|
|
Abstract |
In the last years, the interest in Optical Music Recognition (OMR) has reawakened, especially since the appearance of deep learning. However, there are very few works addressing handwritten scores. In this work we describe a full OMR pipeline for handwritten music scores by using Convolutional and Recurrent Neural Networks that could serve as a baseline for the research community. |
|
|
Address |
Paris; France; September 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
WORMS |
|
|
Notes |
DAG; 600.097; 601.302; 601.330; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BRF2018 |
Serial |
3223 |
|
Permanent link to this record |
|
|
|
|
Author |
Dimosthenis Karatzas; Lluis Gomez; Marçal Rusiñol |
|
|
Title |
The Robust Reading Competition Annotation and Evaluation Platform |
Type |
Conference Article |
|
Year |
2017 |
Publication |
1st International Workshop on Open Services and Tools for Document Analysis |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
The ICDAR Robust Reading Competition (RRC), initiated in 2003 and re-established in 2011, has become the defacto evaluation standard for the international community. Concurrent with its second incarnation in 2011, a continuous effort started to develop an online framework to facilitate the hosting and management of competitions. This short paper briefly outlines the Robust Reading Competition Annotation and Evaluation Platform, the backbone of the Robust Reading Competition, comprising a collection of tools and processes that aim to simplify the management and annotation
of data, and to provide online and offline performance evaluation and analysis services |
|
|
Address |
Kyoto; Japan; November 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR-OST |
|
|
Notes |
DAG; 600.084; 600.121; 600.129 |
Approved |
no |
|
|
Call Number |
Admin @ si @ KGR2017 |
Serial |
3063 |
|
Permanent link to this record |
|
|
|
|
Author |
Leonardo Galteri; Dena Bazazian; Lorenzo Seidenari; Marco Bertini; Andrew Bagdanov; Anguelos Nicolaou; Dimosthenis Karatzas; Alberto del Bimbo |
|
|
Title |
Reading Text in the Wild from Compressed Images |
Type |
Conference Article |
|
Year |
2017 |
Publication |
1st International workshop on Egocentric Perception, Interaction and Computing |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Reading text in the wild is gaining attention in the computer vision community. Images captured in the wild are almost always compressed to varying degrees, depending on application context, and this compression introduces artifacts
that distort image content into the captured images. In this paper we investigate the impact these compression artifacts have on text localization and recognition in the wild. We also propose a deep Convolutional Neural Network (CNN) that can eliminate text-specific compression artifacts and which leads to an improvement in text recognition. Experimental results on the ICDAR-Challenge4 dataset demonstrate that compression artifacts have a significant
impact on text localization and recognition and that our approach yields an improvement in both – especially at high compression rates. |
|
|
Address |
Venice; Italy; October 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICCV - EPIC |
|
|
Notes |
DAG; 600.084; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GBS2017 |
Serial |
3006 |
|
Permanent link to this record |
|
|
|
|
Author |
Fernando Vilariño; Dimosthenis Karatzas |
|
|
Title |
A Living Lab approach for Citizen Science in Libraries |
Type |
Conference Article |
|
Year |
2016 |
Publication |
1st International ECSA Conference |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Berlin; Germany; May 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ECSA |
|
|
Notes |
MV; DAG; 600.084; 600.097;SIAI |
Approved |
no |
|
|
Call Number |
Admin @ si @ViK2016 |
Serial |
2804 |
|
Permanent link to this record |
|
|
|
|
Author |
Josep Llados |
|
|
Title |
Computer Vision: Progress of Research and Development |
Type |
Book Whole |
|
Year |
2006 |
Publication |
1st CVC Internal Workshop Computer Vision: Progress of Research and Development, |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
J. Llados (ed.), |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
84-933652-8-9 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CVCRD |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ Lla2006b |
Serial |
766 |
|
Permanent link to this record |
|
|
|
|
Author |
J. Chazalon; P. Gomez-Kramer; Jean-Christophe Burie; M.Coustaty; S.Eskenazi; Muhammad Muzzamil Luqman; Nibal Nayef; Marçal Rusiñol; N. Sidere; Jean-Marc Ogier |
|
|
Title |
SmartDoc 2017 Video Capture: Mobile Document Acquisition in Video Mode |
Type |
Conference Article |
|
Year |
2017 |
Publication |
1st International Workshop on Open Services and Tools for Document Analysis |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
As mobile document acquisition using smartphones is getting more and more common, along with the continuous improvement of mobile devices (both in terms of computing power and image quality), we can wonder to which extent mobile phones can replace desktop scanners. Modern applications can cope with perspective distortion and normalize the contrast of a document page captured with a smartphone, and in some cases like bottle labels or posters, smartphones even have the advantage of allowing the acquisition of non-flat or large documents. However, several cases remain hard to handle, such as reflective documents (identity cards, badges, glossy magazine cover, etc.) or large documents for which some regions require an important amount of detail. This paper introduces the SmartDoc 2017 benchmark (named “SmartDoc Video Capture”), which aims at
assessing whether capturing documents using the video mode of a smartphone could solve those issues. The task under evaluation is both a stitching and a reconstruction problem, as the user can move the device over different parts of the document to capture details or try to erase highlights. The material released consists of a dataset, an evaluation method and the associated tool, a sample method, and the tools required to extend the dataset. All the components are released publicly under very permissive licenses, and we particularly cared about maximizing the ease of
understanding, usage and improvement. |
|
|
Address |
Kyoto; Japan; November 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR-OST |
|
|
Notes |
DAG; 600.084; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ CGB2017 |
Serial |
2997 |
|
Permanent link to this record |
|
|
|
|
Author |
Agnes Borras; Francesc Tous; Josep Llados; Maria Vanrell |
|
|
Title |
High-Level Clothes Description Based on Colour-Texture and Structural Features |
Type |
Conference Article |
|
Year |
2003 |
Publication |
1rst. Iberian Conference on Pattern Recognition and Image Analysis IbPRIA 2003 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Palma de Mallorca |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG;CIC |
Approved |
no |
|
|
Call Number |
CAT @ cat @ BTL2003b |
Serial |
369 |
|
Permanent link to this record |
|
|
|
|
Author |
Clement Guerin; Christophe Rigaud; Karell Bertet; Jean-Christophe Burie; Arnaud Revel ; Jean-Marc Ogier |
|
|
Title |
Réduction de l’espace de recherche pour les personnages de bandes dessinées |
Type |
Conference Article |
|
Year |
2014 |
Publication |
19th National Congress Reconnaissance de Formes et l'Intelligence Artificielle |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
contextual search; document analysis; comics characters |
|
|
Abstract |
Les bandes dessinées représentent un patrimoine culturel important dans de nombreux pays et leur numérisation massive offre la possibilité d'effectuer des recherches dans le contenu des images. À ce jour, ce sont principalement les structures des pages et leurs contenus textuels qui ont été étudiés, peu de travaux portent sur le contenu graphique. Nous proposons de nous appuyer sur des éléments déjà étudiés tels que la position des cases et des bulles, pour réduire l'espace de recherche et localiser les personnages en fonction de la queue des bulles. L'évaluation de nos différentes contributions à partir de la base eBDtheque montre un taux de détection des queues de bulle de 81.2%, de localisation des personnages allant jusqu'à 85% et un gain d'espace de recherche de plus de 50%. |
|
|
Address |
Rouen; Francia; July 2014 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
RFIA |
|
|
Notes |
DAG; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GRB2014 |
Serial |
2480 |
|
Permanent link to this record |