|
Records |
Links |
|
Author |
David Fernandez; Simone Marinai; Josep Llados; Alicia Fornes |
|
|
Title |
Contextual Word Spotting in Historical Manuscripts using Markov Logic Networks |
Type |
Conference Article |
|
Year |
2013 |
Publication |
2nd International Workshop on Historical Document Imaging and Processing |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
36-43 |
|
|
Keywords |
|
|
|
Abstract |
Natural languages can often be modelled by suitable grammars whose knowledge can improve the word spotting results. The implicit contextual information is even more useful when dealing with information that is intrinsically described as a collection of records. In this paper, we present an approach to word spotting which uses the contextual information of records to improve the results. The method relies on Markov Logic Networks to probabilistically model the relational organization of handwritten records. The performance has been evaluated on the Barcelona Marriages Dataset that contains structured handwritten records that summarize marriage information. |
|
|
Address |
Washington; USA; August 2013 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-1-4503-2115-0 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
HIP |
|
|
Notes |
DAG; 600.056; 600.045; 600.061; 602.006 |
Approved |
no |
|
|
Call Number |
Admin @ si @ FML2013 |
Serial |
2308 |
|
Permanent link to this record |
|
|
|
|
Author |
G. Thorvaldsen; Joana Maria Pujadas-Mora; T. Andersen; L. Eikvil; Josep Llados; Alicia Fornes; Anna Cabre |
|
|
Title |
A Tale of two Transcriptions |
Type |
Journal Article |
|
Year |
2015 |
Publication |
Historical Life Course Studies |
Abbreviated Journal |
|
|
|
Volume |
2 |
Issue |
|
Pages |
1-19 |
|
|
Keywords |
Nominative Sources; Census; Vital Records; Computer Vision; Optical Character Recognition; Word Spotting |
|
|
Abstract |
non-indexed
This article explains how two projects implement semi-automated transcription routines: for census sheets in Norway and marriage protocols from Barcelona. The Spanish system was created to transcribe the marriage license books from 1451 to 1905 for the Barcelona area; one of the world’s longest series of preserved vital records. Thus, in the Project “Five Centuries of Marriages” (5CofM) at the Autonomous University of Barcelona’s Center for Demographic Studies, the Barcelona Historical Marriage Database has been built. More than 600,000 records were transcribed by 150 transcribers working online. The Norwegian material is cross-sectional as it is the 1891 census, recorded on one sheet per person. This format and the underlining of keywords for several variables made it more feasible to semi-automate data entry than when many persons are listed on the same page. While Optical Character Recognition (OCR) for printed text is scientifically mature, computer vision research is now focused on more difficult problems such as handwriting recognition. In the marriage project, document analysis methods have been proposed to automatically recognize the marriage licenses. Fully automatic recognition is still a challenge, but some promising results have been obtained. In Spain, Norway and elsewhere the source material is available as scanned pictures on the Internet, opening up the possibility for further international cooperation concerning automating the transcription of historic source materials. Like what is being done in projects to digitize printed materials, the optimal solution is likely to be a combination of manual transcription and machine-assisted recognition also for hand-written sources. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
2352-6343 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.077; 602.006 |
Approved |
no |
|
|
Call Number |
Admin @ si @ TPA2015 |
Serial |
2582 |
|
Permanent link to this record |
|
|
|
|
Author |
Carlos Boned Riera; Oriol Ramos Terrades |
|
|
Title |
Discriminative Neural Variational Model for Unbalanced Classification Tasks in Knowledge Graph |
Type |
Conference Article |
|
Year |
2022 |
Publication |
26th International Conference on Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
2186-2191 |
|
|
Keywords |
Measurement; Couplings; Semantics; Ear; Benchmark testing; Data models; Pattern recognition |
|
|
Abstract |
Nowadays the paradigm of link discovery problems has shown significant improvements on Knowledge Graphs. However, method performance is harmed by the unbalanced nature of this classification problem, since many methods are easily biased towards not finding proper links. In this paper we present a discriminative neural variational auto-encoder model, called DNVAE from now on, in which we have introduced latent variables to serve as embedding vectors. As a result, the learnt generative model better approximates the underlying distribution and, at the same time, better differentiates the types of relations in the knowledge graph. We have evaluated this approach on benchmark knowledge graph and Census records. Results on this last data set are quite impressive since we reach the highest possible score in the evaluation metrics. However, further experiments are still needed to evaluate more deeply the performance of the method in more challenging tasks. |
|
|
Address |
Montreal; Quebec; Canada; August 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICPR |
|
|
Notes |
DAG; 600.121; 600.162 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BoR2022 |
Serial |
3741 |
|
Permanent link to this record |
|
|
|
|
Author |
Lluis Pere de las Heras; Oriol Ramos Terrades; Josep Llados; David Fernandez; Cristina Cañero |
|
|
Title |
Use case visual Bag-of-Words techniques for camera based identity document classification |
Type |
Conference Article |
|
Year |
2015 |
Publication |
13th International Conference on Document Analysis and Recognition ICDAR2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
721-725 |
|
|
Keywords |
|
|
|
Abstract |
Nowadays, automatic identity document recognition, including passport and driving license recognition, is at the core of many applications within the administrative and service sectors, such as police, hospitality, car rental, etc. In former years, the document information was manually extracted, whereas today this data is recognized automatically from images obtained by flat-bed scanners. Yet, since these scanners tend to be expensive and voluminous, companies in the sector have recently turned their attention to cheaper, small and yet computationally powerful scanners: the mobile devices. Identity document recognition from mobile images involves several new difficulties w.r.t. traditional scanned images, such as the loss of a controlled background, perspective, blurring, etc. In this paper we present a real application for identity document classification of images taken from mobile devices. This classification process is of extreme importance since prior knowledge of the document type and origin strongly facilitates the subsequent information extraction. The proposed method is based on a traditional Bag-of-Words in which we have taken into consideration several key aspects to enhance recognition rate. The method performance has been studied on three datasets containing more than 2000 images from 129 different document classes. |
|
|
Address |
Nancy; France; August 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG; 600.077; 600.061 |
Approved |
no |
|
|
Call Number |
Admin @ si @ HRL2015a |
Serial |
2726 |
|
Permanent link to this record |
|
|
|
|
Author |
R. Bertrand; Oriol Ramos Terrades; P. Gomez-Kramer; P. Franco; Jean-Marc Ogier |
|
|
Title |
A Conditional Random Field model for font forgery detection |
Type |
Conference Article |
|
Year |
2015 |
Publication |
13th International Conference on Document Analysis and Recognition ICDAR2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
576-580 |
|
|
Keywords |
|
|
|
Abstract |
Nowadays, document forgery is becoming a real issue. A large number of documents that contain critical information, such as payment slips, invoices or contracts, are constantly subject to fraudster manipulation because of the lack of security regarding this kind of document. Previously, a system to detect fraudulent documents based on their intrinsic features has been presented. It was especially designed to retrieve copy-move forgery and imperfections due to fraudster manipulation. However, when a set of characters is not present in the original document, copy-move forgery is not feasible. Hence, the fraudster will use a text toolbox to add or modify information in the document by imitating the font, or will cut and paste characters from another document where the font properties are similar. This often results in font type errors. Thus, a clue to detect document forgery consists of finding characters, words or sentences in a document with font properties different from their surroundings. To this end, we present in this paper an automatic forgery detection method based on document font features. Using a Conditional Random Field, the probability that a character belongs to a specific font is measured by comparing the character font features to a knowledge database. Then, the character is classified as genuine or fake by comparing its probability of belonging to a certain font type with those of the neighboring characters. |
|
|
Address |
Nancy; France; August 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BRG2015 |
Serial |
2725 |
|
Permanent link to this record |
|
|
|
|
Author |
B. Gautam; Oriol Ramos Terrades; Joana Maria Pujadas-Mora; Miquel Valls-Figols |
|
|
Title |
Knowledge graph based methods for record linkage |
Type |
Journal Article |
|
Year |
2020 |
Publication |
Pattern Recognition Letters |
Abbreviated Journal |
PRL |
|
|
Volume |
136 |
Issue |
|
Pages |
127-133 |
|
|
Keywords |
|
|
|
Abstract |
Nowadays, the use of individual-level data is common in Historical Demography as a consequence of a predominant life-course approach to understanding demographic behaviour, family transition, mobility, etc. Advanced record linkage is key since it allows increasing the complexity and volume of the data to be analyzed. However, current methods are constrained to linking data from the same kind of sources. Knowledge graphs are flexible semantic representations, which allow encoding data variability and semantic relations in a structured manner.
In this paper we propose the use of knowledge graph methods to tackle record linkage tasks. The proposed method, named WERL, takes advantage of the main knowledge graph properties and learns embedding vectors to encode census information. These embeddings are properly weighted to maximize the record linkage performance. We have evaluated this method on benchmark data sets and we have compared it to related methods with stimulating and satisfactory results. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.140; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GRP2020 |
Serial |
3453 |
|
Permanent link to this record |
|
|
|
|
Author |
Jialuo Chen; Pau Riba; Alicia Fornes; Juan Mas; Josep Llados; Joana Maria Pujadas-Mora |
|
|
Title |
Word-Hunter: A Gamesourcing Experience to Validate the Transcription of Historical Manuscripts |
Type |
Conference Article |
|
Year |
2018 |
Publication |
16th International Conference on Frontiers in Handwriting Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
528-533 |
|
|
Keywords |
Crowdsourcing; Gamification; Handwritten documents; Performance evaluation |
|
|
Abstract |
Nowadays, there are still many handwritten historical documents in archives waiting to be transcribed and indexed. Since manual transcription is tedious and time consuming, automatic transcription seems the path to follow. However, the performance of current handwriting recognition techniques is not perfect, so a manual validation is mandatory. Crowdsourcing is a good strategy for manual validation; however, it is a tedious task. In this paper we analyze experiences based on gamification in order to propose and design a gamesourcing framework that increases the interest of users. Then, we describe and analyze our experience when validating the automatic transcription using the gamesourcing application. Moreover, thanks to the combination of clustering and handwriting recognition techniques, we can speed up the validation while maintaining the performance. |
|
|
Address |
Niagara Falls; USA; August 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICFHR |
|
|
Notes |
DAG; 600.097; 603.057; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ CRF2018 |
Serial |
3169 |
|
Permanent link to this record |
|
|
|
|
Author |
Ayan Banerjee; Sanket Biswas; Josep Llados; Umapada Pal |
|
|
Title |
GraphKD: Exploring Knowledge Distillation Towards Document Object Detection with Structured Graph Creation |
Type |
Miscellaneous |
|
Year |
2024 |
Publication |
Arxiv |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Object detection in documents is a key step to automate the structural elements identification process in a digital or scanned document through understanding the hierarchical structure and relationships between different elements. Large and complex models, while achieving high accuracy, can be computationally expensive and memory-intensive, making them impractical for deployment on resource constrained devices. Knowledge distillation allows us to create small and more efficient models that retain much of the performance of their larger counterparts. Here we present a graph-based knowledge distillation framework to correctly identify and localize the document objects in a document image. Here, we design a structured graph with nodes containing proposal-level features and edges representing the relationship between the different proposal regions. Also, to reduce text bias an adaptive node sampling strategy is designed to prune the weight distribution and put more weightage on non-text nodes. We encode the complete graph as a knowledge representation and transfer it from the teacher to the student through the proposed distillation loss by effectively capturing both local and global information concurrently. Extensive experimentation on competitive benchmarks demonstrates that the proposed framework outperforms the current state-of-the-art approaches. The code will be available at: this https URL. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ BBL2024b |
Serial |
4023 |
|
Permanent link to this record |
|
|
|
|
Author |
Lluis Gomez; Dimosthenis Karatzas |
|
|
Title |
Object Proposals for Text Extraction in the Wild |
Type |
Conference Article |
|
Year |
2015 |
Publication |
13th International Conference on Document Analysis and Recognition ICDAR2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
206-210 |
|
|
Keywords |
|
|
|
Abstract |
Object Proposals is a recent computer vision technique receiving increasing interest from the research community. Its main objective is to generate a relatively small set of bounding box proposals that are most likely to contain objects of interest. The use of Object Proposals techniques in the scene text understanding field is innovative. Motivated by the success of powerful yet expensive techniques to recognize words in a holistic way, Object Proposals techniques emerge as an alternative to the traditional text detectors. In this paper we study to what extent the existing generic Object Proposals methods may be useful for scene text understanding. Also, we propose a new Object Proposals algorithm that is specifically designed for text and compare it with other generic methods in the state of the art. Experiments show that our proposal is superior in its ability to produce good quality word proposals in an efficient way. The source code of our method is made publicly available. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG; 600.077; 600.084; 601.197 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GoK2015 |
Serial |
2691 |
|
Permanent link to this record |
|
|
|
|
Author |
Sophie Wuerger; Kaida Xiao; Dimitris Mylonas; Q. Huang; Dimosthenis Karatzas; Galina Paramei |
|
|
Title |
Blue-green color categorization in Mandarin-English speakers |
Type |
Journal Article |
|
Year |
2012 |
Publication |
Journal of the Optical Society of America A |
Abbreviated Journal |
JOSA A |
|
|
Volume |
29 |
Issue |
2 |
Pages |
A102-A107 |
|
|
Keywords |
|
|
|
Abstract |
Observers are faster to detect a target among a set of distracters if the targets and distracters come from different color categories. This cross-boundary advantage seems to be limited to the right visual field, which is consistent with the dominance of the left hemisphere for language processing [Gilbert et al., Proc. Natl. Acad. Sci. USA 103, 489 (2006)]. Here we study whether a similar visual field advantage is found in the color identification task in speakers of Mandarin, a language that uses a logographic system. Forty late Mandarin-English bilinguals performed a blue-green color categorization task, in a blocked design, in their first language (L1: Mandarin) or second language (L2: English). Eleven color singletons ranging from blue to green were presented for 160 ms, randomly in the left visual field (LVF) or right visual field (RVF). Color boundary and reaction times (RTs) at the color boundary were estimated in L1 and L2, for both visual fields. We found that the color boundary did not differ between the languages; RTs at the color boundary, however, were on average more than 100 ms shorter in the English compared to the Mandarin sessions, but only when the stimuli were presented in the RVF. The finding may be explained by the script nature of the two languages: Mandarin logographic characters are analyzed visuospatially in the right hemisphere, which conceivably facilitates identification of color presented to the LVF. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ WXM2012 |
Serial |
2007 |
|
Permanent link to this record |