|
Records |
Links |
|
Author |
Juan Ignacio Toledo; Jordi Cucurull; Jordi Puiggali; Alicia Fornes; Josep Llados |
![goto web page url](http://refbase.cvc.uab.es/img/www.gif)
![find record details (via OpenURL) openurl](http://refbase.cvc.uab.es/img/xref.gif)
|
|
Title |
Document Analysis Techniques for Automatic Electoral Document Processing: A Survey |
Type |
Conference Article |
|
Year |
2015 |
Publication |
E-Voting and Identity, Proceedings of 5th international conference, VoteID 2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages ![sorted by First Page field, ascending order (up)](http://refbase.cvc.uab.es/img/sort_asc.gif) |
139-141 |
|
|
Keywords |
Document image analysis; Computer vision; Paper ballots; Paper based elections; Optical scan; Tally |
|
|
Abstract |
In this paper, we will discuss the most common challenges in electoral document processing and study the different solutions from the document analysis community that can be applied in each case. We will cover Optical Mark Recognition techniques to detect voter selections in the Australian Ballot, handwritten number recognition for preferential elections and handwriting recognition for write-in areas. We will also propose some particular adjustments that can be made to those general techniques in the specific context of electoral documents. |
|
|
Address |
Bern; Switzerland; September 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
VoteID |
|
|
Notes |
DAG; 600.061; 602.006; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ TCP2015 |
Serial |
2641 |
|
Permanent link to this record |
|
|
|
|
Author |
Pau Torras; Mohamed Ali Souibgui; Jialuo Chen; Alicia Fornes |
![goto web page url](http://refbase.cvc.uab.es/img/www.gif)
|
|
Title |
A Transcription Is All You Need: Learning to Align through Attention |
Type |
Conference Article |
|
Year |
2021 |
Publication |
14th IAPR International Workshop on Graphics Recognition |
Abbreviated Journal |
|
|
|
Volume |
12916 |
Issue |
|
Pages ![sorted by First Page field, ascending order (up)](http://refbase.cvc.uab.es/img/sort_asc.gif) |
141–146 |
|
|
Keywords |
|
|
|
Abstract |
Historical ciphered manuscripts are a type of document where graphical symbols are used to encrypt their content instead of regular text. Nowadays, expert transcriptions can be found in libraries alongside the corresponding manuscript images. However, those transcriptions are not aligned, so these are barely usable for training deep learning-based recognition methods. To solve this issue, we propose a method to align each symbol in the transcript of an image with its visual representation by using an attention-based Sequence to Sequence (Seq2Seq) model. The core idea is that, by learning to recognise symbols sequence within a cipher line image, the model also identifies their position implicitly through an attention mechanism. Thus, the resulting symbol segmentation can be later used for training algorithms. The experimental evaluation shows that this method is promising, especially taking into account the small size of the cipher dataset. |
|
|
Address |
Virtual; September 2021 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
GREC |
|
|
Notes |
DAG; 602.230; 600.140; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ TSC2021 |
Serial |
3619 |
|
Permanent link to this record |
|
|
|
|
Author |
Francisco Alvaro; Francisco Cruz; Joan Andreu Sanchez; Oriol Ramos Terrades; Jose Miguel Benedi |
![download PDF file pdf](http://refbase.cvc.uab.es/img/file_PDF.gif)
|
|
Title |
Structure Detection and Segmentation of Documents Using 2D Stochastic Context-Free Grammars |
Type |
Journal Article |
|
Year |
2015 |
Publication |
Neurocomputing |
Abbreviated Journal |
NEUCOM |
|
|
Volume |
150 |
Issue |
A |
Pages ![sorted by First Page field, ascending order (up)](http://refbase.cvc.uab.es/img/sort_asc.gif) |
147-154 |
|
|
Keywords |
document image analysis; stochastic context-free grammars; text classication features |
|
|
Abstract |
In this paper we dene a bidimensional extension of Stochastic Context-Free Grammars for structure detection and segmentation of images of documents.
Two sets of text classication features are used to perform an initial classication of each zone of the page. Then, the document segmentation is obtained as the most likely hypothesis according to a stochastic grammar. We used a dataset of historical marriage license books to validate this approach. We also tested several inference algorithms for Probabilistic Graphical Models
and the results showed that the proposed grammatical model outperformed
the other methods. Furthermore, grammars also provide the document structure
along with its segmentation. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 601.158; 600.077; 600.061 |
Approved |
no |
|
|
Call Number |
Admin @ si @ ACS2015 |
Serial |
2531 |
|
Permanent link to this record |
|
|
|
|
Author |
Stepan Simsa; Milan Sulc; Michal Uricar; Yash Patel; Ahmed Hamdi; Matej Kocian; Matyas Skalicky; Jiri Matas; Antoine Doucet; Mickael Coustaty; Dimosthenis Karatzas |
![download PDF file pdf](http://refbase.cvc.uab.es/img/file_PDF.gif)
![find record details (via OpenURL) openurl](http://refbase.cvc.uab.es/img/xref.gif)
|
|
Title |
DocILE Benchmark for Document Information Localization and Extraction |
Type |
Conference Article |
|
Year |
2023 |
Publication |
17th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
|
|
Volume |
14188 |
Issue |
|
Pages ![sorted by First Page field, ascending order (up)](http://refbase.cvc.uab.es/img/sort_asc.gif) |
147–166 |
|
|
Keywords |
Document AI; Information Extraction; Line Item Recognition; Business Documents; Intelligent Document Processing |
|
|
Abstract |
This paper introduces the DocILE benchmark with the largest dataset of business documents for the tasks of Key Information Localization and Extraction and Line Item Recognition. It contains 6.7k annotated business documents, 100k synthetically generated documents, and nearly 1M unlabeled documents for unsupervised pre-training. The dataset has been built with knowledge of domain- and task-specific aspects, resulting in the following key features: (i) annotations in 55 classes, which surpasses the granularity of previously published key information extraction datasets by a large margin; (ii) Line Item Recognition represents a highly practical information extraction task, where key information has to be assigned to items in a table; (iii) documents come from numerous layouts and the test set includes zero- and few-shot cases as well as layouts commonly seen in the training set. The benchmark comes with several baselines, including RoBERTa, LayoutLMv3 and DETR-based Table Transformer; applied to both tasks of the DocILE benchmark, with results shared in this paper, offering a quick starting point for future work. The dataset, baselines and supplementary material are available at https://github.com/rossumai/docile. |
|
|
Address |
San Jose; CA; USA; August 2023 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ SSU2023 |
Serial |
3903 |
|
Permanent link to this record |
|
|
|
|
Author |
Muhammad Muzzamil Luqman; Thierry Brouard; Jean-Yves Ramel; Josep Llados |
![find record details (via OpenURL) openurl](http://refbase.cvc.uab.es/img/xref.gif)
|
|
Title |
Recherche de sous-graphes par encapsulation floue des cliques d'ordre 2: Application à la localisation de contenu dans les images de documents graphiques |
Type |
Conference Article |
|
Year |
2012 |
Publication |
Colloque International Francophone sur l'Écrit et le Document |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages ![sorted by First Page field, ascending order (up)](http://refbase.cvc.uab.es/img/sort_asc.gif) |
149-162 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CIFED |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ LBR2012 |
Serial |
2382 |
|
Permanent link to this record |
|
|
|
|
Author |
Josep Llados; Jaime Lopez-Krahe; Enric Marti |
![download PDF file pdf](http://refbase.cvc.uab.es/img/file_PDF.gif)
![find record details (via OpenURL) openurl](http://refbase.cvc.uab.es/img/xref.gif)
|
|
Title |
A system to understand hand-drawn floor plans using subgraph isomorphism and Hough transform |
Type |
Book Chapter |
|
Year |
1997 |
Publication |
Machine Vision and Applications |
Abbreviated Journal |
|
|
|
Volume |
10 |
Issue |
3 |
Pages ![sorted by First Page field, ascending order (up)](http://refbase.cvc.uab.es/img/sort_asc.gif) |
150-158 |
|
|
Keywords |
Line drawings – Hough transform – Graph matching – CAD systems – Graphics recognition |
|
|
Abstract |
Presently, man-machine interface development is a widespread research activity. A system to understand hand drawn architectural drawings in a CAD environment is presented in this paper. To understand a document, we have to identify its building elements and their structural properties. An attributed graph structure is chosen as a symbolic representation of the input document and the patterns to recognize in it. An inexact subgraph isomorphism procedure using relaxation labeling techniques is performed. In this paper we focus on how to speed up the matching. There is a building element, the walls, characterized by a hatching pattern. Using a straight line Hough transform (SLHT)-based method, we recognize this pattern, characterized by parallel straight lines, and remove from the input graph the edges belonging to this pattern. The isomorphism is then applied to the remainder of the input graph. When all the building elements have been recognized, the document is redrawn, correcting the inaccurate strokes obtained from a hand-drawn input. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG;IAM |
Approved |
no |
|
|
Call Number |
IAM @ iam @ LLM1997a |
Serial |
1566 |
|
Permanent link to this record |
|
|
|
|
Author |
Anders Hast; Alicia Fornes |
![download PDF file pdf](http://refbase.cvc.uab.es/img/file_PDF.gif)
![find record details (via OpenURL) openurl](http://refbase.cvc.uab.es/img/xref.gif)
|
|
Title |
A Segmentation-free Handwritten Word Spotting Approach by Relaxed Feature Matching |
Type |
Conference Article |
|
Year |
2016 |
Publication |
12th IAPR Workshop on Document Analysis Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages ![sorted by First Page field, ascending order (up)](http://refbase.cvc.uab.es/img/sort_asc.gif) |
150-155 |
|
|
Keywords |
|
|
|
Abstract |
The automatic recognition of historical handwritten documents is still considered challenging task. For this reason, word spotting emerges as a good alternative for making the information contained in these documents available to the user. Word spotting is defined as the task of retrieving all instances of the query word in a document collection, becoming a useful tool for information retrieval. In this paper we propose a segmentation-free word spotting approach able to deal with large document collections. Our method is inspired on feature matching algorithms that have been applied to image matching and retrieval. Since handwritten words have different shape, there is no exact transformation to be obtained. However, the sufficient degree of relaxation is achieved by using a Fourier based descriptor and an alternative approach to RANSAC called PUMA. The proposed approach is evaluated on historical marriage records, achieving promising results. |
|
|
Address |
Santorini; Greece; April 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 602.006; 600.061; 600.077; 600.097 |
Approved |
no |
|
|
Call Number |
HaF2016 |
Serial |
2753 |
|
Permanent link to this record |
|
|
|
|
Author |
Josep Llados; Ernest Valveny; Enric Marti |
![find book details (via ISBN) isbn](http://refbase.cvc.uab.es/img/isbn.gif)
|
|
Title |
Symbol Recognition in Document Image Analysis: Methods and Challenges |
Type |
Journal Article |
|
Year |
2000 |
Publication |
Recent Research Developments in Pattern Recognition, Transworld Research Network, |
Abbreviated Journal |
|
|
|
Volume |
1 |
Issue |
|
Pages ![sorted by First Page field, ascending order (up)](http://refbase.cvc.uab.es/img/sort_asc.gif) |
151–178. |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
81-86846-61-1 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG;IAM |
Approved |
no |
|
|
Call Number |
IAM @ iam @ LVM2000 |
Serial |
1575 |
|
Permanent link to this record |
|
|
|
|
Author |
Giacomo Magnifico; Beata Megyesi; Mohamed Ali Souibgui; Jialuo Chen; Alicia Fornes |
![download PDF file pdf](http://refbase.cvc.uab.es/img/file_PDF.gif)
![find record details (via OpenURL) openurl](http://refbase.cvc.uab.es/img/xref.gif)
|
|
Title |
Lost in Transcription of Graphic Signs in Ciphers |
Type |
Conference Article |
|
Year |
2022 |
Publication |
International Conference on Historical Cryptology (HistoCrypt 2022) |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages ![sorted by First Page field, ascending order (up)](http://refbase.cvc.uab.es/img/sort_asc.gif) |
153-158 |
|
|
Keywords |
transcription of ciphers; hand-written text recognition of symbols; graphic signs |
|
|
Abstract |
Hand-written Text Recognition techniques with the aim to automatically identify and transcribe hand-written text have been applied to historical sources including ciphers. In this paper, we compare the performance of two machine learning architectures, an unsupervised method based on clustering and a deep learning method with few-shot learning. Both models are tested on seen and unseen data from historical ciphers with different symbol sets consisting of various types of graphic signs. We compare the models and highlight their differences in performance, with their advantages and shortcomings. |
|
|
Address |
Amsterdam, Netherlands, June 20-22, 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
HystoCrypt |
|
|
Notes |
DAG; 600.121; 600.162; 602.230; 600.140 |
Approved |
no |
|
|
Call Number |
Admin @ si @ MBS2022 |
Serial |
3731 |
|
Permanent link to this record |
|
|
|
|
Author |
Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades |
![goto web page (via DOI) doi](http://refbase.cvc.uab.es/img/doi.gif)
![find record details (via OpenURL) openurl](http://refbase.cvc.uab.es/img/xref.gif)
|
|
Title |
Spotting Symbol Using Sparsity over Learned Dictionary of Local Descriptors |
Type |
Conference Article |
|
Year |
2014 |
Publication |
11th IAPR International Workshop on Document Analysis and Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages ![sorted by First Page field, ascending order (up)](http://refbase.cvc.uab.es/img/sort_asc.gif) |
156-160 |
|
|
Keywords |
|
|
|
Abstract |
This paper proposes a new approach to spot symbols into graphical documents using sparse representations. More specifically, a dictionary is learned from a training database of local descriptors defined over the documents. Following their sparse representations, interest points sharing similar properties are used to define interest regions. Using an original adaptation of information retrieval techniques, a vector model for interest regions and for a query symbol is built based on its sparsity in a visual vocabulary where the visual words are columns in the learned dictionary. The matching process is performed comparing the similarity between vector models. Evaluation on SESYD datasets demonstrates that our method is promising. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-1-4799-3243-6 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ DTR2014 |
Serial |
2543 |
|
Permanent link to this record |