|
|
Author |
Miquel Ferrer; Ernest Valveny; F. Serratosa; I. Bardaji; Horst Bunke |


|
|
Title |
Graph-based k-means clustering: A comparison of the set versus the generalized median graph |
Type |
Conference Article |
|
Year |
2009 |
Publication |
13th International Conference on Computer Analysis of Images and Patterns |
Abbreviated Journal |
|
|
|
Volume |
5702 |
Issue |
|
Pages  |
342–350 |
|
|
Keywords |
|
|
|
Abstract |
In this paper we propose the application of the generalized median graph in a graph-based k-means clustering algorithm. In the graph-based k-means algorithm, the centers of the clusters have traditionally been represented using the set median graph. We propose an approximate method for computing the generalized median graph that allows it to be used to represent the centers of the clusters. Experiments on three databases show that using the generalized median graph as the cluster representative yields better results than the set median graph. |
|
|
Address |
Münster, Germany |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-642-03766-5 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CAIP |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ FVS2009d |
Serial |
1219 |
|
Permanent link to this record |
|
|
|
|
Author |
Subhajit Maity; Sanket Biswas; Siladittya Manna; Ayan Banerjee; Josep Llados; Saumik Bhattacharya; Umapada Pal |


|
|
Title |
SelfDocSeg: A Self-Supervised vision-based Approach towards Document Segmentation |
Type |
Conference Article |
|
Year |
2023 |
Publication |
17th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
|
|
Volume |
14187 |
Issue |
|
Pages  |
342–360 |
|
|
Keywords |
|
|
|
Abstract |
Document layout analysis is a well-known problem in the document research community and has been extensively explored, yielding a multitude of solutions ranging from text mining and recognition to graph-based representation, visual feature extraction, etc. However, most existing works have ignored the crucial fact of the scarcity of labeled data. With growing internet connectivity in personal life, an enormous amount of documents has become available in the public domain, making data annotation a tedious task. We address this challenge using self-supervision and, unlike the few existing self-supervised document segmentation approaches, which use text mining and textual labels, we use a completely vision-based approach in pre-training without any ground-truth label or its derivative. Instead, we generate pseudo-layouts from the document images to pre-train an image encoder to learn the document object representation and localization in a self-supervised framework before fine-tuning it with an object detection model. We show that our pipeline sets a new benchmark in this context and performs on par with, if not better than, the existing methods and their supervised counterparts. The code is made publicly available at: this https URL |
|
|
Address |
Document Layout Analysis; Document |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ MBM2023 |
Serial |
3990 |
|
Permanent link to this record |
|
|
|
|
Author |
Farshad Nourbakhsh; Dimosthenis Karatzas; Ernest Valveny |


|
|
Title |
A polar-based logo representation based on topological and colour features |
Type |
Conference Article |
|
Year |
2010 |
Publication |
9th IAPR International Workshop on Document Analysis Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages  |
341–348 |
|
|
Keywords |
|
|
|
Abstract |
In this paper, we propose a novel rotation and scale invariant method for colour logo retrieval and classification, which involves performing a simple colour segmentation and subsequently describing each of the resultant colour components based on a set of topological and colour features. A polar representation is used to represent the logo and the subsequent logo matching is based on Cyclic Dynamic Time Warping (CDTW). We also show how combining information about the global distribution of the logo components and their local neighbourhood using the Delaunay triangulation improves the results. All experiments are performed on a dataset of 2500 instances of 100 colour logo images in different rotations and scales. |
|
|
Address |
Boston, USA |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-1-60558-773-8 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ NKV2010 |
Serial |
1436 |
|
Permanent link to this record |
|
|
|
|
Author |
Asma Bensalah; Alicia Fornes; Cristina Carmona-Duarte; Josep Llados |


|
|
Title |
Easing Automatic Neurorehabilitation via Classification and Smoothness Analysis |
Type |
Conference Article |
|
Year |
2022 |
Publication |
Intertwining Graphonomics with Human Movements. 20th International Conference of the International Graphonomics Society, IGS 2022 |
Abbreviated Journal |
|
|
|
Volume |
13424 |
Issue |
|
Pages  |
336-348 |
|
|
Keywords |
Neurorehabilitation; Upper-limb; Movement classification; Movement smoothness; Deep learning; Jerk |
|
|
Abstract |
Assessing the quality of movements for post-stroke patients during the rehabilitation phase is vital given that there is no standard stroke rehabilitation plan for all patients. In fact, it depends mainly on the patient’s functional independence and progress over the rehabilitation sessions. To tackle this challenge and make neurorehabilitation more agile, we propose an automatic assessment pipeline that starts by recognising patients’ movements by means of a shallow deep learning architecture, then measures the movement quality using the jerk measure and related measures. A particularity of this work is that the dataset used is clinically relevant, since it represents movements inspired by Fugl-Meyer, a widely used upper-limb clinical assessment scale for stroke patients. We show that it is possible to detect the contrast between healthy subjects’ and patients’ movements in terms of smoothness, besides reaching conclusions about the patients’ progress during the rehabilitation sessions that correspond to the clinicians’ findings about each case. |
|
|
Address |
June 7-9, 2022, Las Palmas de Gran Canaria, Spain |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IGS |
|
|
Notes |
DAG; 600.121; 600.162; 602.230; 600.140 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BFC2022 |
Serial |
3738 |
|
Permanent link to this record |
|
|
|
|
Author |
Anton Cervantes; Gemma Sanchez; Josep Llados; Agnes Borras; A. Rodriguez |

|
|
Title |
Biometric Recognition Based on Line Shape Descriptors |
Type |
Conference Article |
|
Year |
2005 |
Publication |
Sixth IAPR International Workshop on Graphics Recognition (GREC 2005) |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages  |
335–344 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Hong Kong (China) |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ CSL2005 |
Serial |
596 |
|
Permanent link to this record |
|
|
|
|
Author |
Lluis Gomez; Dimosthenis Karatzas |


|
|
Title |
A fast hierarchical method for multi‐script and arbitrary oriented scene text extraction |
Type |
Journal Article |
|
Year |
2016 |
Publication |
International Journal on Document Analysis and Recognition |
Abbreviated Journal |
IJDAR |
|
|
Volume |
19 |
Issue |
4 |
Pages  |
335-349 |
|
|
Keywords |
scene text; segmentation; detection; hierarchical grouping; perceptual organisation |
|
|
Abstract |
Typography and layout lead to the hierarchical organisation of text in words, text lines, paragraphs. This inherent structure is a key property of text in any script and language, which has nonetheless been minimally leveraged by existing text detection methods. This paper addresses the problem of text segmentation in natural scenes from a hierarchical perspective. Contrary to existing methods, we make explicit use of text structure, aiming directly at the detection of region groupings corresponding to text within a hierarchy produced by an agglomerative similarity clustering process over individual regions. We propose an optimal way to construct such a hierarchy, introducing a feature space designed to produce text group hypotheses with high recall and a novel stopping rule combining a discriminative classifier and a probabilistic measure of group meaningfulness based on perceptual organization. Results obtained over four standard datasets, covering text in variable orientations and different languages, demonstrate that our algorithm, while being trained on a single mixed dataset, outperforms state-of-the-art methods in unconstrained scenarios. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.056; 601.197 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GoK2016a |
Serial |
2862 |
|
Permanent link to this record |
|
|
|
|
Author |
Marçal Rusiñol; Volkmar Frinken; Dimosthenis Karatzas; Andrew Bagdanov; Josep Llados |

|
|
Title |
Multimodal page classification in administrative document image streams |
Type |
Journal Article |
|
Year |
2014 |
Publication |
International Journal on Document Analysis and Recognition |
Abbreviated Journal |
IJDAR |
|
|
Volume |
17 |
Issue |
4 |
Pages  |
331-341 |
|
|
Keywords |
Digital mail room; Multimodal page classification; Visual and textual document description |
|
|
Abstract |
In this paper, we present a page classification application in a banking workflow. The proposed architecture represents administrative document images by merging visual and textual descriptions. The visual description is based on a hierarchical representation of the pixel intensity distribution. The textual description uses latent semantic analysis to represent document content as a mixture of topics. Several off-the-shelf classifiers and different strategies for combining visual and textual cues have been evaluated. A final step uses an n-gram model of the page stream allowing a finer-grained classification of pages. The proposed method has been tested in a real large-scale environment and we report results on a dataset of 70,000 pages. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1433-2833 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; LAMP; 600.056; 600.061; 601.240; 601.223; 600.077; 600.079 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RFK2014 |
Serial |
2523 |
|
Permanent link to this record |
|
|
|
|
Author |
Katerine Diaz; Jesus Martinez del Rincon; Marçal Rusiñol; Aura Hernandez-Sabate |


|
|
Title |
Feature Extraction by Using Dual-Generalized Discriminative Common Vectors |
Type |
Journal Article |
|
Year |
2019 |
Publication |
Journal of Mathematical Imaging and Vision |
Abbreviated Journal |
JMIV |
|
|
Volume |
61 |
Issue |
3 |
Pages  |
331-351 |
|
|
Keywords |
Online feature extraction; Generalized discriminative common vectors; Dual learning; Incremental learning; Decremental learning |
|
|
Abstract |
In this paper, a dual online subspace-based learning method called dual-generalized discriminative common vectors (Dual-GDCV) is presented. The method extends incremental GDCV by exploiting simultaneously both the concepts of incremental and decremental learning for supervised feature extraction and classification. Our methodology is able to update the feature representation space without recalculating the full projection or accessing the previously processed training data. It allows both adding information and removing unnecessary data from a knowledge base in an efficient way, while retaining the previously acquired knowledge. The proposed method has been theoretically proved and empirically validated on six standard face recognition and classification datasets, under two scenarios: (1) removing and adding samples of existing classes, and (2) removing and adding new classes to a classification problem. Results show a considerable computational gain without compromising the accuracy of the model in comparison with both batch methodologies and other state-of-the-art adaptive methods. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; ADAS; 600.084; 600.118; 600.121; 600.129; IAM |
Approved |
no |
|
|
Call Number |
Admin @ si @ DRR2019 |
Serial |
3172 |
|
Permanent link to this record |
|
|
|
|
Author |
Andrea Gemelli; Sanket Biswas; Enrico Civitelli; Josep Llados; Simone Marinai |



|
|
Title |
Doc2Graph: A Task Agnostic Document Understanding Framework Based on Graph Neural Networks |
Type |
Conference Article |
|
Year |
2022 |
Publication |
17th European Conference on Computer Vision Workshops |
Abbreviated Journal |
|
|
|
Volume |
13804 |
Issue |
|
Pages  |
329–344 |
|
|
Keywords |
|
|
|
Abstract |
Geometric Deep Learning has recently attracted significant interest in a wide range of machine learning fields, including document analysis. The application of Graph Neural Networks (GNNs) has become crucial in various document-related tasks since they can unravel important structural patterns, fundamental in key information extraction processes. Previous works in the literature propose task-driven models and do not take into account the full power of graphs. We propose Doc2Graph, a task-agnostic document understanding framework based on a GNN model, to solve different tasks given different types of documents. We evaluated our approach on two challenging datasets for key information extraction in form understanding, invoice layout analysis and table detection. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-3-031-25068-2 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ECCV-TiE |
|
|
Notes |
DAG; 600.162; 600.140; 110.312 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GBC2022 |
Serial |
3795 |
|
Permanent link to this record |
|
|
|
|
Author |
Marçal Rusiñol; Dimosthenis Karatzas; Josep Llados |

|
|
Title |
Automatic Verification of Properly Signed Multi-page Document Images |
Type |
Conference Article |
|
Year |
2015 |
Publication |
Proceedings of the Eleventh International Symposium on Visual Computing |
Abbreviated Journal |
|
|
|
Volume |
9475 |
Issue |
|
Pages  |
327-336 |
|
|
Keywords |
Document Image; Manual Inspection; Signature Verification; Rejection Criterion; Document Flow |
|
|
Abstract |
In this paper we present an industrial application for the automatic screening of incoming multi-page documents in a banking workflow aimed at determining whether these documents are properly signed or not. The proposed method is divided into three main steps. First, individual pages are classified in order to identify the pages that should contain a signature. In a second step, we segment within those key pages the location where the signatures should appear. The last step checks whether the signatures are present or not. Our method is tested in a real large-scale environment and we report the results when checking two different types of real multi-page contracts, totalling more than 14,500 pages. |
|
|
Address |
Las Vegas, Nevada, USA; December 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
9475 |
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ISVC |
|
|
Notes |
DAG; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ |
Serial |
3189 |
|
Permanent link to this record |