Records |
Author |
Alicia Fornes; Josep Llados; Joan Mas; Joana Maria Pujadas-Mora; Anna Cabre |
Title |
A Bimodal Crowdsourcing Platform for Demographic Historical Manuscripts |
Type |
Conference Article |
Year |
2014 |
Publication |
Digital Access to Textual Cultural Heritage Conference |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
103-108 |
Keywords |
|
Abstract |
In this paper we present a crowdsourcing web-based application for extracting information from demographic handwritten document images. The proposed application integrates two points of view: the semantic information for demographic research, and the ground-truthing for document analysis research. Concretely, the application has the contents view, where the information is recorded into forms, and the labeling view, with the word labels for evaluating document analysis techniques. The crowdsourcing architecture makes it possible to accelerate the information extraction (many users can work simultaneously), validate the information, and easily provide feedback to the users. We finally show how the proposed application can be extended to other kinds of demographic historical manuscripts. |
Address |
Madrid; May 2014 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
978-1-4503-2588-2 |
Medium |
|
Area |
|
Expedition |
|
Conference |
DATeCH |
Notes |
DAG; 600.061; 602.006; 600.077 |
Approved |
no |
Call Number |
Admin @ si @ FLM2014 |
Serial |
2516 |
Permanent link to this record |
|
|
|
Author |
Alicia Fornes; Sergio Escalera; Josep Llados; Ernest Valveny |
Title |
Symbol Classification using Dynamic Aligned Shape Descriptor |
Type |
Conference Article |
Year |
2010 |
Publication |
20th International Conference on Pattern Recognition |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
1957–1960 |
Keywords |
|
Abstract |
Shape representation is a difficult task because of several symbol distortions, such as occlusions, elastic deformations, gaps or noise. In this paper, we propose a new descriptor and distance computation for coping with the problem of symbol recognition in the domain of Graphical Document Image Analysis. The proposed D-Shape descriptor encodes the arrangement information of object parts in a circular structure, allowing different levels of distortion. The classification is performed using a cyclic Dynamic Time Warping based method, allowing distortions and rotation. The methodology has been validated on different data sets, showing very high recognition rates. |
Address |
Istanbul (Turkey) |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1051-4651 |
ISBN |
978-1-4244-7542-1 |
Medium |
|
Area |
|
Expedition |
|
Conference |
ICPR |
Notes |
DAG; HUPBA; MILAB |
Approved |
no |
Call Number |
BCNPCL @ bcnpcl @ FEL2010 |
Serial |
1421 |
Permanent link to this record |
|
|
|
Author |
Alicia Fornes; Sergio Escalera; Josep Llados; Gemma Sanchez |
Title |
Symbol Recognition by Multi-class Blurred Shape Models |
Type |
Conference Article |
Year |
2007 |
Publication |
Seventh IAPR International Workshop on Graphics Recognition |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
11–13 |
Keywords |
|
Abstract |
|
Address |
Curitiba (Brazil) |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
GREC |
Notes |
DAG; MILAB; HUPBA |
Approved |
no |
Call Number |
BCNPCL @ bcnpcl @ FEL2007b |
Serial |
910 |
Permanent link to this record |
|
|
|
Author |
Alicia Fornes; Veronica Romero; Arnau Baro; Juan Ignacio Toledo; Joan Andreu Sanchez; Enrique Vidal; Josep Llados |
Title |
ICDAR2017 Competition on Information Extraction in Historical Handwritten Records |
Type |
Conference Article |
Year |
2017 |
Publication |
14th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
1389-1394 |
Keywords |
|
Abstract |
The extraction of relevant information from historical handwritten document collections is one of the key steps in order to make these manuscripts available for access and searches. In this competition, the goal is to detect the named entities and assign each of them a semantic category, and therefore, to simulate the filling in of a knowledge database. This paper describes the dataset, the tasks, the evaluation metrics, the participants' methods and the results. |
Address |
Kyoto; Japan; November 2017 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICDAR |
Notes |
DAG; 600.097; 601.225; 600.121 |
Approved |
no |
Call Number |
Admin @ si @ FRB2017 |
Serial |
3052 |
Permanent link to this record |
|
|
|
Author |
Alicia Fornes; Volkmar Frinken; Andreas Fischer; Jon Almazan; G. Jackson; Horst Bunke |
Title |
A Keyword Spotting Approach Using Blurred Shape Model-Based Descriptors |
Type |
Conference Article |
Year |
2011 |
Publication |
Proceedings of the 2011 Workshop on Historical Document Imaging and Processing |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
83-90 |
Keywords |
|
Abstract |
The automatic processing of handwritten historical documents is considered a hard problem in pattern recognition. In addition to the challenges given by modern handwritten data, a lack of training data as well as effects caused by the degradation of documents can be observed. In this scenario, keyword spotting emerges as a viable solution to make documents amenable for searching and browsing. For this task we propose the adaptation of shape descriptors used in symbol recognition. By treating each word image as a shape, it can be represented using the Blurred Shape Model and the Deformable Blurred Shape Model. Experiments on the George Washington database demonstrate that this approach is able to outperform the commonly used Dynamic Time Warping approach. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
ACM |
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
978-1-4503-0916-5 |
Medium |
|
Area |
|
Expedition |
|
Conference |
HIP |
Notes |
DAG |
Approved |
no |
Call Number |
Admin @ si @ FFF2011a |
Serial |
1823 |
Permanent link to this record |
|
|
|
Author |
Alicia Fornes; Xavier Otazu; Josep Llados |
Title |
Show through cancellation and image enhancement by multiresolution contrast processing |
Type |
Conference Article |
Year |
2013 |
Publication |
12th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
200-204 |
Keywords |
|
Abstract |
Historical documents suffer from different types of degradation and noise such as background variation, uneven illumination or dark spots. In the case of double-sided documents, another common problem is that the back side of the document usually interferes with the front side because of the transparency of the document or ink bleeding. This effect is called the show-through phenomenon. Many methods have been developed to solve these problems; in the case of show-through, they typically scan and match both the front and back sides of the document. In contrast, our approach is designed to use only one side of the scanned document. We hypothesize that show-through components are low-contrast ones, while foreground components are high-contrast ones. A Multiresolution Contrast (MC) decomposition is presented in order to estimate the contrast of features at different spatial scales. We cancel the show-through phenomenon by thresholding these low-contrast components. This decomposition is also able to enhance the image, removing shadowed areas by weighting spatial scales. Results show that the enhanced images improve the readability of the documents, allowing scholars both to recover unreadable words and to resolve ambiguities. |
Address |
Washington; USA; August 2013 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1520-5363 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICDAR |
Notes |
DAG; 602.006; 600.045; 600.061; 600.052; CIC |
Approved |
no |
Call Number |
Admin @ si @ FOL2013 |
Serial |
2241 |
Permanent link to this record |
|
|
|
Author |
Alloy Das; Sanket Biswas; Ayan Banerjee; Josep Llados; Umapada Pal; Saumik Bhattacharya |
Title |
Harnessing the Power of Multi-Lingual Datasets for Pre-training: Towards Enhancing Text Spotting Performance |
Type |
Conference Article |
Year |
2024 |
Publication |
Winter Conference on Applications of Computer Vision |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
718-728 |
Keywords |
|
Abstract |
The capability to adapt to a wide range of domains is crucial for scene text spotting models when deployed to real-world conditions. However, existing state-of-the-art (SOTA) approaches usually incorporate scene text detection and recognition simply by pretraining on natural scene text datasets, which do not directly exploit the intermediate feature representations between multiple domains. Here, we investigate the problem of domain-adaptive scene text spotting, i.e., training a model on multi-domain source data such that it can directly adapt to target domains rather than being specialized for a specific domain or scenario. Further, we investigate a transformer baseline called Swin-TESTR to focus on solving scene-text spotting for both regular and arbitrary-shaped scene text along with an exhaustive evaluation. The results clearly demonstrate the potential of intermediate representations to achieve significant performance on text spotting benchmarks across multiple domains (e.g. language, synth-to-real, and documents), both in terms of accuracy and efficiency. |
Address |
Waikoloa; Hawaii; USA; January 2024 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
WACV |
Notes |
DAG |
Approved |
no |
Call Number |
Admin @ si @ DBB2024 |
Serial |
3986 |
Permanent link to this record |
|
|
|
Author |
Alloy Das; Sanket Biswas; Umapada Pal; Josep Llados |
Title |
Diving into the Depths of Spotting Text in Multi-Domain Noisy Scenes |
Type |
Conference Article |
Year |
2024 |
Publication |
IEEE International Conference on Robotics and Automation in PACIFICO |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
When used in a real-world noisy environment, the capacity to generalize to multiple domains is essential for any autonomous scene text spotting system. However, existing state-of-the-art methods employ pretraining and fine-tuning strategies on natural scene datasets, which do not exploit the feature interaction across other complex domains. In this work, we explore and investigate the problem of domain-agnostic scene text spotting, i.e., training a model on multi-domain source data such that it can directly generalize to target domains rather than being specialized for a specific domain or scenario. In this regard, we present to the community a text spotting validation benchmark called Under-Water Text (UWT) for noisy underwater scenes to establish an important case study. Moreover, we also design an efficient super-resolution based end-to-end transformer baseline called DA-TextSpotter which achieves comparable or superior performance over existing text spotting architectures for both regular and arbitrary-shaped scene text spotting benchmarks in terms of both accuracy and model efficiency. The dataset, code and pre-trained models will be released upon acceptance. |
Address |
Yokohama; Japan; May 2024 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICRA |
Notes |
DAG |
Approved |
no |
Call Number |
Admin @ si @ DBP2024 |
Serial |
3979 |
Permanent link to this record |
|
|
|
Author |
Alvaro Cepero; Albert Clapes; Sergio Escalera |
Title |
Quantitative analysis of non-verbal communication for competence analysis |
Type |
Conference Article |
Year |
2013 |
Publication |
16th Catalan Conference on Artificial Intelligence |
Abbreviated Journal |
|
Volume |
256 |
Issue |
|
Pages |
105-114 |
Keywords |
|
Abstract |
|
Address |
Vic; October 2013 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
CCIA |
Notes |
HUPBA; MILAB |
Approved |
no |
Call Number |
Admin @ si @ CCE2013 |
Serial |
2324 |
Permanent link to this record |
|
|
|
Author |
Alvaro Peris; Marc Bolaños; Petia Radeva; Francisco Casacuberta |
Title |
Video Description Using Bidirectional Recurrent Neural Networks |
Type |
Conference Article |
Year |
2016 |
Publication |
25th International Conference on Artificial Neural Networks |
Abbreviated Journal |
|
Volume |
2 |
Issue |
|
Pages |
3-11 |
Keywords |
Video description; Neural Machine Translation; Bidirectional Recurrent Neural Networks; LSTM; Convolutional Neural Networks |
Abstract |
Although traditionally used in the machine translation field, the encoder-decoder framework has recently been applied to the generation of video and image descriptions. The combination of Convolutional and Recurrent Neural Networks in these models has proven to outperform the previous state of the art, obtaining more accurate video descriptions. In this work we propose pushing this model further by introducing two contributions into the encoding stage: first, producing richer image representations by combining object and location information from Convolutional Neural Networks, and second, introducing Bidirectional Recurrent Neural Networks for capturing both forward and backward temporal relationships in the input frames. |
Address |
Barcelona; September 2016 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICANN |
Notes |
MILAB |
Approved |
no |
Call Number |
Admin @ si @ PBR2016 |
Serial |
2833 |
Permanent link to this record |
|
|
|
Author |
Ana Maria Ares; Jorge Bernal; Maria Jesus Nozal; F. Javier Sanchez; Jose Bernal |
Title |
Results of the use of Kahoot! gamification tool in a course of Chemistry |
Type |
Conference Article |
Year |
2018 |
Publication |
4th International Conference on Higher Education Advances |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
1215-1222 |
Keywords |
|
Abstract |
The present study examines the use of Kahoot! as a gamification tool to explore mixed learning strategies. We analyze its use in two different groups of a theoretical subject of the third course of the Degree in Chemistry. An empirical-analytical methodology was followed, using Kahoot! in the two groups of students with different frequencies. The academic results of these two groups of students were compared with each other and with those obtained in the previous course, in which Kahoot! was not employed, with the aim of measuring the evolution in the students' knowledge. The results showed, in all cases, that the use of Kahoot! led to a significant increase in the overall marks and in the number of students who passed the subject. Moreover, some differences were also observed in students' academic performance according to the group. Finally, it can be concluded that the use of a gamification tool (Kahoot!) in a university classroom generally improved students' learning and marks, and that this improvement is more prevalent in those students who achieved a better Kahoot! performance. |
Address |
Valencia; June 2018 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
HEAD |
Notes |
MV; no proj |
Approved |
no |
Call Number |
Admin @ si @ ABN2018 |
Serial |
3246 |
Permanent link to this record |
|
|
|
Author |
Anders Hast; Alicia Fornes |
Title |
A Segmentation-free Handwritten Word Spotting Approach by Relaxed Feature Matching |
Type |
Conference Article |
Year |
2016 |
Publication |
12th IAPR Workshop on Document Analysis Systems |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
150-155 |
Keywords |
|
Abstract |
The automatic recognition of historical handwritten documents is still considered a challenging task. For this reason, word spotting emerges as a good alternative for making the information contained in these documents available to the user. Word spotting is defined as the task of retrieving all instances of the query word in a document collection, becoming a useful tool for information retrieval. In this paper we propose a segmentation-free word spotting approach able to deal with large document collections. Our method is inspired by feature matching algorithms that have been applied to image matching and retrieval. Since handwritten words have different shapes, there is no exact transformation to be obtained. However, a sufficient degree of relaxation is achieved by using a Fourier-based descriptor and an alternative approach to RANSAC called PUMA. The proposed approach is evaluated on historical marriage records, achieving promising results. |
Address |
Santorini; Greece; April 2016 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
DAS |
Notes |
DAG; 602.006; 600.061; 600.077; 600.097 |
Approved |
no |
Call Number |
HaF2016 |
Serial |
2753 |
Permanent link to this record |
|
|
|
Author |
Andrea Gemelli; Sanket Biswas; Enrico Civitelli; Josep Llados; Simone Marinai |
Title |
Doc2Graph: A Task Agnostic Document Understanding Framework Based on Graph Neural Networks |
Type |
Conference Article |
Year |
2022 |
Publication |
17th European Conference on Computer Vision Workshops |
Abbreviated Journal |
|
Volume |
13804 |
Issue |
|
Pages |
329–344 |
Keywords |
|
Abstract |
Geometric Deep Learning has recently attracted significant interest in a wide range of machine learning fields, including document analysis. The application of Graph Neural Networks (GNNs) has become crucial in various document-related tasks since they can unravel important structural patterns, fundamental in key information extraction processes. Previous works in the literature propose task-driven models and do not take into account the full power of graphs. We propose Doc2Graph, a task-agnostic document understanding framework based on a GNN model, to solve different tasks given different types of documents. We evaluated our approach on two challenging datasets for key information extraction in form understanding, invoice layout analysis and table detection. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
978-3-031-25068-2 |
Medium |
|
Area |
|
Expedition |
|
Conference |
ECCV-TiE |
Notes |
DAG; 600.162; 600.140; 110.312 |
Approved |
no |
Call Number |
Admin @ si @ GBC2022 |
Serial |
3795 |
Permanent link to this record |
|
|
|
Author |
Andreas Fischer; Ching Y. Suen; Volkmar Frinken; Kaspar Riesen; Horst Bunke |
Title |
A Fast Matching Algorithm for Graph-Based Handwriting Recognition |
Type |
Conference Article |
Year |
2013 |
Publication |
9th IAPR – TC15 Workshop on Graph-based Representation in Pattern Recognition |
Abbreviated Journal |
|
Volume |
7877 |
Issue |
|
Pages |
194-203 |
Keywords |
|
Abstract |
The recognition of unconstrained handwriting images is usually based on vectorial representation and statistical classification. Despite their high representational power, graphs are rarely used in this field due to a lack of efficient graph-based recognition methods. Recently, graph similarity features have been proposed to bridge the gap between structural representation and statistical classification by means of vector space embedding. This approach has shown a high performance in terms of accuracy but had shortcomings in terms of computational speed. The time complexity of the Hungarian algorithm that is used to approximate the edit distance between two handwriting graphs is demanding for a real-world scenario. In this paper, we propose a faster graph matching algorithm which is derived from the Hausdorff distance. On the historical Parzival database it is demonstrated that the proposed method achieves a speedup factor of 12.9 without significant loss in recognition accuracy. |
Address |
Vienna; Austria; May 2013 |
Corporate Author |
|
Thesis |
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
0302-9743 |
ISBN |
978-3-642-38220-8 |
Medium |
|
Area |
|
Expedition |
|
Conference |
GBR |
Notes |
DAG; 600.045; 605.203 |
Approved |
no |
Call Number |
Admin @ si @ FSF2013 |
Serial |
2294 |
Permanent link to this record |
|
|
|
Author |
Andreas Fischer; Volkmar Frinken; Alicia Fornes; Horst Bunke |
Title |
Transcription Alignment of Latin Manuscripts Using Hidden Markov Models |
Type |
Conference Article |
Year |
2011 |
Publication |
Proceedings of the 2011 Workshop on Historical Document Imaging and Processing |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
29-36 |
Keywords |
|
Abstract |
Transcriptions of historical documents are a valuable source for extracting labeled handwriting images that can be used for training recognition systems. In this paper, we introduce the Saint Gall database that includes images as well as the transcription of a Latin manuscript from the 9th century written in Carolingian script. Although the available transcription is of high quality for a human reader, the spelling of the words is not accurate when compared with the handwriting image. Hence, the transcription poses several challenges for alignment regarding, e.g., line breaks, abbreviations, and capitalization. We propose an alignment system based on character Hidden Markov Models that can cope with these challenges and efficiently aligns complete document pages. On the Saint Gall database, we demonstrate that a considerable alignment accuracy can be achieved, even with weakly trained character models. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
ACM |
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
HIP |
Notes |
DAG |
Approved |
no |
Call Number |
Admin @ si @ FFF2011b |
Serial |
1824 |
Permanent link to this record |