|
Records |
Links |
|
Author |
Juan Ignacio Toledo; Alicia Fornes; Jordi Cucurull; Josep Llados |
|
|
Title |
Election Tally Sheets Processing System |
Type |
Conference Article |
|
Year |
2016 |
Publication |
12th IAPR Workshop on Document Analysis Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
364-368 |
|
|
Keywords |
|
|
|
Abstract |
In paper based elections, manual tallies at polling station level produce myriads of documents. These documents share a common form-like structure and a reduced vocabulary worldwide. On the other hand, each tally sheet is filled by a different writer and on different countries, different scripts are used. We present a complete document analysis system for electoral tally sheet processing combining state of the art techniques with a new handwriting recognition subprocess based on unsupervised feature discovery with Variational Autoencoders and sequence classification with BLSTM neural networks. The whole system is designed to be script independent and allows a fast and reliable results consolidation process with reduced operational cost. |
|
|
Address |
Santorini; Greece; April 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 602.006; 600.061; 601.225; 600.077; 600.097 |
Approved |
no |
|
|
Call Number |
TFC2016 |
Serial |
2752 |
|
Permanent link to this record |
|
|
|
|
Author |
Anders Hast; Alicia Fornes |
|
|
Title |
A Segmentation-free Handwritten Word Spotting Approach by Relaxed Feature Matching |
Type |
Conference Article |
|
Year |
2016 |
Publication |
12th IAPR Workshop on Document Analysis Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
150-155 |
|
|
Keywords |
|
|
|
Abstract |
The automatic recognition of historical handwritten documents is still considered challenging task. For this reason, word spotting emerges as a good alternative for making the information contained in these documents available to the user. Word spotting is defined as the task of retrieving all instances of the query word in a document collection, becoming a useful tool for information retrieval. In this paper we propose a segmentation-free word spotting approach able to deal with large document collections. Our method is inspired on feature matching algorithms that have been applied to image matching and retrieval. Since handwritten words have different shape, there is no exact transformation to be obtained. However, the sufficient degree of relaxation is achieved by using a Fourier based descriptor and an alternative approach to RANSAC called PUMA. The proposed approach is evaluated on historical marriage records, achieving promising results. |
|
|
Address |
Santorini; Greece; April 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 602.006; 600.061; 600.077; 600.097 |
Approved |
no |
|
|
Call Number |
HaF2016 |
Serial |
2753 |
|
Permanent link to this record |
|
|
|
|
Author |
Dimosthenis Karatzas; V. Poulain d'Andecy; Marçal Rusiñol |
|
|
Title |
Human-Document Interaction – a new frontier for document image analysis |
Type |
Conference Article |
|
Year |
2016 |
Publication |
12th IAPR Workshop on Document Analysis Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
369-374 |
|
|
Keywords |
|
|
|
Abstract |
All indications show that paper documents will not cede in favour of their digital counterparts, but will instead be used increasingly in conjunction with digital information. An open challenge is how to seamlessly link the physical with the digital – how to continue taking advantage of the important affordances of paper, without missing out on digital functionality. This paper
presents the authors’ experience with developing systems for Human-Document Interaction based on augmented document interfaces and examines new challenges and opportunities arising for the document image analysis field in this area. The system presented combines state of the art camera-based document
image analysis techniques with a range of complementary tech-nologies to offer fluid Human-Document Interaction. Both fixed and nomadic setups are discussed that have gone through user testing in real-life environments, and use cases are presented that span the spectrum from business to educational application |
|
|
Address |
Santorini; Greece; April 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 600.084; 600.077 |
Approved |
no |
|
|
Call Number |
KPR2016 |
Serial |
2756 |
|
Permanent link to this record |
|
|
|
|
Author |
Q. Bao; Marçal Rusiñol; M.Coustaty; Muhammad Muzzamil Luqman; C.D. Tran; Jean-Marc Ogier |
|
|
Title |
Delaunay triangulation-based features for Camera-based document image retrieval system |
Type |
Conference Article |
|
Year |
2016 |
Publication |
12th IAPR Workshop on Document Analysis Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1-6 |
|
|
Keywords |
Camera-based Document Image Retrieval; Delaunay Triangulation; Feature descriptors; Indexing |
|
|
Abstract |
In this paper, we propose a new feature vector, named DElaunay TRIangulation-based Features (DETRIF), for real-time camera-based document image retrieval. DETRIF is computed based on the geometrical constraints from each pair of adjacency triangles in delaunay triangulation which is constructed from centroids of connected components. Besides, we employ a hashing-based indexing system in order to evaluate the performance of DETRIF and to compare it with other systems such as LLAH and SRIF. The experimentation is carried out on two datasets comprising of 400 heterogeneous-content complex linguistic map images (huge size, 9800 X 11768 pixels resolution)and 700 textual document images. |
|
|
Address |
Santorini; Greece; April 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 600.061; 600.084; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BRC2016 |
Serial |
2757 |
|
Permanent link to this record |
|
|
|
|
Author |
Lluis Gomez; Dimosthenis Karatzas |
|
|
Title |
A fine-grained approach to scene text script identification |
Type |
Conference Article |
|
Year |
2016 |
Publication |
12th IAPR Workshop on Document Analysis Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
192-197 |
|
|
Keywords |
|
|
|
Abstract |
This paper focuses on the problem of script identification in unconstrained scenarios. Script identification is an important prerequisite to recognition, and an indispensable condition for automatic text understanding systems designed for multi-language environments. Although widely studied for document images and handwritten documents, it remains an almost unexplored territory for scene text images. We detail a novel method for script identification in natural images that combines convolutional features and the Naive-Bayes Nearest Neighbor classifier. The proposed framework efficiently exploits the discriminative power of small stroke-parts, in a fine-grained classification framework. In addition, we propose a new public benchmark dataset for the evaluation of joint text detection and script identification in natural scenes. Experiments done in this new dataset demonstrate that the proposed method yields state of the art results, while it generalizes well to different datasets and variable number of scripts. The evidence provided shows that multi-lingual scene text recognition in the wild is a viable proposition. Source code of the proposed method is made available online. |
|
|
Address |
Santorini; Grecia; April 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 601.197; 600.084 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GoK2016b |
Serial |
2863 |
|
Permanent link to this record |
|
|
|
|
Author |
Suman Ghosh; Ernest Valveny |
|
|
Title |
A Sliding Window Framework for Word Spotting Based on Word Attributes |
Type |
Conference Article |
|
Year |
2015 |
Publication |
Pattern Recognition and Image Analysis, Proceedings of 7th Iberian Conference , ibPRIA 2015 |
Abbreviated Journal |
|
|
|
Volume |
9117 |
Issue |
|
Pages |
652-661 |
|
|
Keywords |
Word spotting; Sliding window; Word attributes |
|
|
Abstract |
In this paper we propose a segmentation-free approach to word spotting. Word images are first encoded into feature vectors using Fisher Vector. Then, these feature vectors are used together with pyramidal histogram of characters labels (PHOC) to learn SVM-based attribute models. Documents are represented by these PHOC based word attributes. To efficiently compute the word attributes over a sliding window, we propose to use an integral image representation of the document using a simplified version of the attribute model. Finally we re-rank the top word candidates using the more discriminative full version of the word attributes. We show state-of-the-art results for segmentation-free query-by-example word spotting in single-writer and multi-writer standard datasets. |
|
|
Address |
Santiago de Compostela; June 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer International Publishing |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-319-19389-2 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IbPRIA |
|
|
Notes |
DAG; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GhV2015b |
Serial |
2716 |
|
Permanent link to this record |
|
|
|
|
Author |
Mickael Coustaty; Alicia Fornes |
|
|
Title |
Document Analysis and Recognition – ICDAR 2023 Workshops |
Type |
Book Whole |
|
Year |
2023 |
Publication |
Document Analysis and Recognition – ICDAR 2023 Workshops |
Abbreviated Journal |
|
|
|
Volume |
14194 |
Issue |
2 |
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
San Jose; USA; August 2023 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ CoF2023 |
Serial |
3852 |
|
Permanent link to this record |
|
|
|
|
Author |
Francesc Net; Marc Folia; Pep Casals; Lluis Gomez |
|
|
Title |
Transductive Learning for Near-Duplicate Image Detection in Scanned Photo Collections |
Type |
Conference Article |
|
Year |
2023 |
Publication |
17th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
|
|
Volume |
14191 |
Issue |
|
Pages |
3-17 |
|
|
Keywords |
Image deduplication; Near-duplicate images detection; Transductive Learning; Photographic Archives; Deep Learning |
|
|
Abstract |
This paper presents a comparative study of near-duplicate image detection techniques in a real-world use case scenario, where a document management company is commissioned to manually annotate a collection of scanned photographs. Detecting duplicate and near-duplicate photographs can reduce the time spent on manual annotation by archivists. This real use case differs from laboratory settings as the deployment dataset is available in advance, allowing the use of transductive learning. We propose a transductive learning approach that leverages state-of-the-art deep learning architectures such as convolutional neural networks (CNNs) and Vision Transformers (ViTs). Our approach involves pre-training a deep neural network on a large dataset and then fine-tuning the network on the unlabeled target collection with self-supervised learning. The results show that the proposed approach outperforms the baseline methods in the task of near-duplicate image detection in the UKBench and an in-house private dataset. |
|
|
Address |
San Jose; CA; USA; August 2023 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ NFC2023 |
Serial |
3859 |
|
Permanent link to this record |
|
|
|
|
Author |
Ayan Banerjee; Sanket Biswas; Josep Llados; Umapada Pal |
|
|
Title |
SwinDocSegmenter: An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation |
Type |
Conference Article |
|
Year |
2023 |
Publication |
17th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
|
|
Volume |
14187 |
Issue |
|
Pages |
307–325 |
|
|
Keywords |
|
|
|
Abstract |
Instance-level segmentation of documents consists in assigning a class-aware and instance-aware label to each pixel of the image. It is a key step in document parsing for their understanding. In this paper, we present a unified transformer encoder-decoder architecture for en-to-end instance segmentation of complex layouts in document images. The method adapts a contrastive training with a mixed query selection for anchor initialization in the decoder. Later on, it performs a dot product between the obtained query embeddings and the pixel embedding map (coming from the encoder) for semantic reasoning. Extensive experimentation on competitive benchmarks like PubLayNet, PRIMA, Historical Japanese (HJ), and TableBank demonstrate that our model with SwinL backbone achieves better segmentation performance than the existing state-of-the-art approaches with the average precision of 93.72, 54.39, 84.65 and 98.04 respectively under one billion parameters. The code is made publicly available at: github.com/ayanban011/SwinDocSegmenter . |
|
|
Address |
San Jose; CA; USA; August 2023 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ BBL2023 |
Serial |
3893 |
|
Permanent link to this record |
|
|
|
|
Author |
Wenwen Yu; Chengquan Zhang; Haoyu Cao; Wei Hua; Bohan Li; Huang Chen; Mingyu Liu; Mingrui Chen; Jianfeng Kuang; Mengjun Cheng; Yuning Du; Shikun Feng; Xiaoguang Hu; Pengyuan Lyu; Kun Yao; Yuechen Yu; Yuliang Liu; Wanxiang Che; Errui Ding; Cheng-Lin Liu; Jiebo Luo; Shuicheng Yan; Min Zhang; Dimosthenis Karatzas; Xing Sun; Jingdong Wang; Xiang Bai |
|
|
Title |
ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images |
Type |
Conference Article |
|
Year |
2023 |
Publication |
17th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
|
|
Volume |
14188 |
Issue |
|
Pages |
536–552 |
|
|
Keywords |
|
|
|
Abstract |
Structured text extraction is one of the most valuable and challenging application directions in the field of Document AI. However, the scenarios of past benchmarks are limited, and the corresponding evaluation protocols usually focus on the submodules of the structured text extraction scheme. In order to eliminate these problems, we organized the ICDAR 2023 competition on Structured text extraction from Visually-Rich Document images (SVRD). We set up two tracks for SVRD including Track 1: HUST-CELL and Track 2: Baidu-FEST, where HUST-CELL aims to evaluate the end-to-end performance of Complex Entity Linking and Labeling, and Baidu-FEST focuses on evaluating the performance and generalization of Zero-shot/Few-shot Structured Text extraction from an end-to-end perspective. Compared to the current document benchmarks, our two tracks of competition benchmark enriches the scenarios greatly and contains more than 50 types of visually-rich document images (mainly from the actual enterprise applications). The competition opened on 30th December, 2022 and closed on 24th March, 2023. There are 35 participants and 91 valid submissions received for Track 1, and 15 participants and 26 valid submissions received for Track 2. In this report we will presents the motivation, competition datasets, task definition, evaluation protocol, and submission summaries. According to the performance of the submissions, we believe there is still a large gap on the expected information extraction performance for complex and zero-shot scenarios. It is hoped that this competition will attract many researchers in the field of CV and NLP, and bring some new thoughts to the field of Document AI. |
|
|
Address |
San Jose; CA; USA; August 2023 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ YZC2023 |
Serial |
3896 |
|
Permanent link to this record |