|
Records |
Links |
|
Author |
Marçal Rusiñol; V. Poulain d'Andecy; Dimosthenis Karatzas; Josep Llados |


|
|
Title |
Classification of Administrative Document Images by Logo Identification |
Type |
Book Chapter |
|
Year |
2014 |
Publication |
Graphics Recognition. Current Trends and Challenges |
Abbreviated Journal |
|
|
|
Volume |
8746 |
Issue |
|
Pages |
49-58 |
|
|
Keywords |
Administrative Document Classification; Logo Recognition; Logo Spotting |
|
|
Abstract  |
This paper is focused on the categorization of administrative document images (such as invoices) based on the recognition of the supplier’s graphical logo. Two different methods are proposed, the first one uses a bag-of-visual-words model whereas the second one tries to locate logo images described by the blurred shape model descriptor within documents by a sliding-window technique. Preliminar results are reported with a dataset of real administrative documents. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
Bart Lamiroy; Jean-Marc Ogier |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-662-44853-3 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.056; 600.045; 605.203; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RPK2014 |
Serial |
2701 |
|
Permanent link to this record |
|
|
|
|
Author |
Albert Berenguel; Oriol Ramos Terrades; Josep Llados; Cristina Cañero |


|
|
Title |
Recurrent Comparator with attention models to detect counterfeit documents |
Type |
Conference Article |
|
Year |
2019 |
Publication |
15th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract  |
This paper is focused on the detection of counterfeit documents via the recurrent comparison of the security textured background regions of two images. The main contributions are twofold: first we apply and adapt a recurrent comparator architecture with attention mechanism to the counterfeit detection task, which constructs a representation of the background regions by recurrently condition the next observation, learning the difference between genuine and counterfeit images through iterative glimpses. Second we propose a new counterfeit document dataset to ensure the generalization of the learned model towards the detection of the lack of resolution during the counterfeit manufacturing. The presented network, outperforms state-of-the-art classification approaches for counterfeit detection as demonstrated in the evaluation. |
|
|
Address |
Sidney; Australia; September 2019 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG; 600.140; 600.121; 601.269 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BRL2019 |
Serial |
3456 |
|
Permanent link to this record |
|
|
|
|
Author |
Albert Berenguel; Oriol Ramos Terrades; Josep Llados; Cristina Cañero |

|
|
Title |
Banknote counterfeit detection through background texture printing analysis |
Type |
Conference Article |
|
Year |
2016 |
Publication |
12th IAPR Workshop on Document Analysis Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract  |
This paper is focused on the detection of counterfeit photocopy banknotes. The main difficulty is to work on a real industrial scenario without any constraint about the acquisition device and with a single image. The main contributions of this paper are twofold: first the adaptation and performance evaluation of existing approaches to classify the genuine and photocopy banknotes using background texture printing analysis, which have not been applied into this context before. Second, a new dataset of Euro banknotes images acquired with several cameras under different luminance conditions to evaluate these methods. Experiments on the proposed algorithms show that mixing SIFT features and sparse coding dictionaries achieves quasi perfect classification using a linear SVM with the created dataset. Approaches using dictionaries to cover all possible texture variations have demonstrated to be robust and outperform the state-of-the-art methods using the proposed benchmark. |
|
|
Address |
Rumania; May 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 600.061; 601.269; 600.097 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BRL2016 |
Serial |
2950 |
|
Permanent link to this record |
|
|
|
|
Author |
Francesc Net; Marc Folia; Pep Casals; Lluis Gomez |

|
|
Title |
Transductive Learning for Near-Duplicate Image Detection in Scanned Photo Collections |
Type |
Conference Article |
|
Year |
2023 |
Publication |
17th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
|
|
Volume |
14191 |
Issue |
|
Pages |
3-17 |
|
|
Keywords |
Image deduplication; Near-duplicate images detection; Transductive Learning; Photographic Archives; Deep Learning |
|
|
Abstract  |
This paper presents a comparative study of near-duplicate image detection techniques in a real-world use case scenario, where a document management company is commissioned to manually annotate a collection of scanned photographs. Detecting duplicate and near-duplicate photographs can reduce the time spent on manual annotation by archivists. This real use case differs from laboratory settings as the deployment dataset is available in advance, allowing the use of transductive learning. We propose a transductive learning approach that leverages state-of-the-art deep learning architectures such as convolutional neural networks (CNNs) and Vision Transformers (ViTs). Our approach involves pre-training a deep neural network on a large dataset and then fine-tuning the network on the unlabeled target collection with self-supervised learning. The results show that the proposed approach outperforms the baseline methods in the task of near-duplicate image detection in the UKBench and an in-house private dataset. |
|
|
Address |
San Jose; CA; USA; August 2023 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ NFC2023 |
Serial |
3859 |
|
Permanent link to this record |
|
|
|
|
Author |
Francesc Tous; Agnes Borras; Robert Benavente; Ramon Baldrich; Maria Vanrell; Josep Llados |

|
|
Title |
Textual Descriptors for browsing people by visual appearence. |
Type |
Conference Article |
|
Year |
2002 |
Publication |
5è. Congrés Català d’Intel·ligència Artificial CCIA |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Image retrieval, textual descriptors, colour naming, colour normalization, graph matching. |
|
|
Abstract  |
This paper presents a first approach to build colour and structural descriptors for information retrieval on a people database. Queries are formulated in terms of their appearance that allows to seek people wearing specific clothes of a given colour name or texture. Descriptors are automatically computed by following three essential steps. A colour naming labelling from pixel properties. A region seg- mentation step based on colour properties of pixels combined with edge information. And a high level step that models the region arrangements in order to build clothes structure. Results are tested on large set of images from real scenes taken at the entrance desk of a building. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG;CIC |
Approved |
no |
|
|
Call Number |
CAT @ cat @ TBB2002a |
Serial |
287 |
|
Permanent link to this record |
|
|
|
|
Author |
Francesc Tous; Agnes Borras; Robert Benavente; Ramon Baldrich; Maria Vanrell; Josep Llados |

|
|
Title |
Textual Descriptions for Browsing People by Visual Apperance. |
Type |
Book Chapter |
|
Year |
2002 |
Publication |
Lecture Notes in Artificial Intelligence |
Abbreviated Journal |
|
|
|
Volume |
2504 |
Issue |
|
Pages |
419-429 |
|
|
Keywords |
|
|
|
Abstract  |
This paper presents a first approach to build colour and structural descriptors for information retrieval on a people database. Queries are formulated in terms of their appearance that allows to seek people wearing specific clothes of a given colour name or texture. Descriptors are automatically computed by following three essential steps. A colour naming labelling from pixel properties. A region seg- mentation step based on colour properties of pixels combined with edge information. And a high level step that models the region arrangements in order to build clothes structure. Results are tested on large set of images from real scenes taken at the entrance desk of a building |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Verlag |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG;CIC |
Approved |
no |
|
|
Call Number |
CAT @ cat @ TBB2002b |
Serial |
319 |
|
Permanent link to this record |
|
|
|
|
Author |
Pau Riba; Josep Llados; Alicia Fornes |


|
|
Title |
Handwritten Word Spotting by Inexact Matching of Grapheme Graphs |
Type |
Conference Article |
|
Year |
2015 |
Publication |
13th International Conference on Document Analysis and Recognition ICDAR2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
781 - 785 |
|
|
Keywords |
|
|
|
Abstract  |
This paper presents a graph-based word spotting for handwritten documents. Contrary to most word spotting techniques, which use statistical representations, we propose a structural representation suitable to be robust to the inherent deformations of handwriting. Attributed graphs are constructed using a part-based approach. Graphemes extracted from shape convexities are used as stable units of handwriting, and are associated to graph nodes. Then, spatial relations between them determine graph edges. Spotting is defined in terms of an error-tolerant graph matching using bipartite-graph matching algorithm. To make the method usable in large datasets, a graph indexing approach that makes use of binary embeddings is used as preprocessing. Historical documents are used as experimental framework. The approach is comparable to statistical ones in terms of time and memory requirements, especially when dealing with large document collections. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG; 600.077; 600.061; 602.006 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RLF2015b |
Serial |
2642 |
|
Permanent link to this record |
|
|
|
|
Author |
Jon Almazan; Alicia Fornes; Ernest Valveny |


|
|
Title |
A Non-Rigid Feature Extraction Method for Shape Recognition |
Type |
Conference Article |
|
Year |
2011 |
Publication |
11th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
987-991 |
|
|
Keywords |
|
|
|
Abstract  |
This paper presents a methodology for shape recognition that focuses on dealing with the difficult problem of large deformations. The proposed methodology consists in a novel feature extraction technique, which uses a non-rigid representation adaptable to the shape. This technique employs a deformable grid based on the computation of geometrical centroids that follows a region partitioning algorithm. Then, a feature vector is extracted by computing pixel density measures around these geometrical centroids. The result is a shape descriptor that adapts its representation to the given shape and encodes the pixel density distribution. The validity of the method when dealing with large deformations has been experimentally shown over datasets composed of handwritten shapes. It has been applied to signature verification and shape recognition tasks demonstrating high accuracy and low computational cost. |
|
|
Address |
Beijing; China; September 2011 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-0-7695-4520-2 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ AFV2011 |
Serial |
1763 |
|
Permanent link to this record |
|
|
|
|
Author |
Jon Almazan; Ernest Valveny; Alicia Fornes |

|
|
Title |
Deforming the Blurred Shape Model for Shape Description and Recognition |
Type |
Conference Article |
|
Year |
2011 |
Publication |
5th Iberian Conference on Pattern Recognition and Image Analysis |
Abbreviated Journal |
|
|
|
Volume |
6669 |
Issue |
|
Pages |
1-8 |
|
|
Keywords |
|
|
|
Abstract  |
This paper presents a new model for the description and recognition of distorted shapes, where the image is represented by a pixel density distribution based on the Blurred Shape Model combined with a non-linear image deformation model. This leads to an adaptive structure able to capture elastic deformations in shapes. This method has been evaluated using thee different datasets where deformations are present, showing the robustness and good performance of the new model. Moreover, we show that incorporating deformation and flexibility, the new model outperforms the BSM approach when classifying shapes with high variability of appearance. |
|
|
Address |
Las Palmas de Gran Canaria. Spain |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer-Verlag |
Place of Publication |
Berlin |
Editor |
Jordi Vitria; Joao Miguel Raposo; Mario Hernandez |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IbPRIA |
|
|
Notes |
DAG; |
Approved |
no |
|
|
Call Number |
Admin @ si @ AVF2011 |
Serial |
1732 |
|
Permanent link to this record |
|
|
|
|
Author |
Lluis Gomez; Ali Furkan Biten; Ruben Tito; Andres Mafla; Marçal Rusiñol; Ernest Valveny; Dimosthenis Karatzas |


|
|
Title |
Multimodal grid features and cell pointers for scene text visual question answering |
Type |
Journal Article |
|
Year |
2021 |
Publication |
Pattern Recognition Letters |
Abbreviated Journal |
PRL |
|
|
Volume |
150 |
Issue |
|
Pages |
242-249 |
|
|
Keywords |
|
|
|
Abstract  |
This paper presents a new model for the task of scene text visual question answering. In this task questions about a given image can only be answered by reading and understanding scene text. Current state of the art models for this task make use of a dual attention mechanism in which one attention module attends to visual features while the other attends to textual features. A possible issue with this is that it makes difficult for the model to reason jointly about both modalities. To fix this problem we propose a new model that is based on an single attention mechanism that attends to multi-modal features conditioned to the question. The output weights of this attention module over a grid of multi-modal spatial features are interpreted as the probability that a certain spatial location of the image contains the answer text to the given question. Our experiments demonstrate competitive performance in two standard datasets with a model that is faster than previous methods at inference time. Furthermore, we also provide a novel analysis of the ST-VQA dataset based on a human performance study. Supplementary material, code, and data is made available through this link. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.084; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GBT2021 |
Serial |
3620 |
|
Permanent link to this record |