|
Records |
Links |
|
Author |
Lasse Martensson; Ekta Vats; Anders Hast; Alicia Fornes |
|
|
Title |
In Search of the Scribe: Letter Spotting as a Tool for Identifying Scribes in Large Handwritten Text Corpora |
Type |
Journal |
|
Year |
2019 |
Publication |
Journal for Information Technology Studies as a Human Science |
Abbreviated Journal |
HUMAN IT |
|
|
Volume |
14 |
Issue |
2 |
Pages |
95-120 |
|
|
Keywords |
Scribal attribution/ writer identification; digital palaeography; word spotting; mediaeval charters; mediaeval manuscripts |
|
|
Abstract |
In this article, a form of the so-called word spotting-method is used on a large set of handwritten documents in order to identify those that contain script of similar execution. The point of departure for the investigation is the mediaeval Swedish manuscript Cod. Holm. D 3. The main scribe of this manuscript has yet not been identified in other documents. The current attempt aims at localising other documents that display a large degree of similarity in the characteristics of the script, these being possible candidates for being executed by the same hand. For this purpose, the method of word spotting has been employed, focusing on individual letters, and therefore the process is referred to as letter spotting in the article. In this process, a set of ‘g’:s, ‘h’:s and ‘k’:s have been selected as templates, and then a search has been made for close matches among the mediaeval Swedish charters. The search resulted in a number of charters that displayed great similarities with the manuscript D 3. The used letter spotting method thus proofed to be a very efficient sorting tool localising similar script samples. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.097; 600.140; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ MVH2019 |
Serial |
3234 |
|
Permanent link to this record |
|
|
|
|
Author |
M. Visani; Oriol Ramos Terrades; Salvatore Tabbone |
|
|
Title |
A Protocol to Characterize the Descriptive Power and the Complementarity of Shape Descriptors |
Type |
Journal Article |
|
Year |
2011 |
Publication |
International Journal on Document Analysis and Recognition |
Abbreviated Journal |
IJDAR |
|
|
Volume |
14 |
Issue |
1 |
Pages |
87-100 |
|
|
Keywords |
Document analysis; Shape descriptors; Symbol description; Performance characterization; Complementarity analysis |
|
|
Abstract |
Most document analysis applications rely on the extraction of shape descriptors, which may be grouped into different categories, each category having its own advantages and drawbacks (O.R. Terrades et al. in Proceedings of ICDAR’07, pp. 227–231, 2007). In order to improve the richness of their description, many authors choose to combine multiple descriptors. Yet, most of the authors who propose a new descriptor content themselves with comparing its performance to the performance of a set of single state-of-the-art descriptors in a specific applicative context (e.g. symbol recognition, symbol spotting...). This results in a proliferation of the shape descriptors proposed in the literature. In this article, we propose an innovative protocol, the originality of which is to be as independent of the final application as possible and which relies on new quantitative and qualitative measures. We introduce two types of measures: while the measures of the first type are intended to characterize the descriptive power (in terms of uniqueness, distinctiveness and robustness towards noise) of a descriptor, the second type of measures characterizes the complementarity between multiple descriptors. Characterizing upstream the complementarity of shape descriptors is an alternative to the usual approach where the descriptors to be combined are selected by trial and error, considering the performance characteristics of the overall system. To illustrate the contribution of this protocol, we performed experimental studies using a set of descriptors and a set of symbols which are widely used by the community namely ART and SC descriptors and the GREC 2003 database. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; IF 1.091 |
Approved |
no |
|
|
Call Number |
Admin @ si @VRT2011 |
Serial |
1856 |
|
Permanent link to this record |
|
|
|
|
Author |
Lluis Gomez; Anguelos Nicolaou; Dimosthenis Karatzas |
|
|
Title |
Improving patch‐based scene text script identification with ensembles of conjoined networks |
Type |
Journal Article |
|
Year |
2017 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
67 |
Issue |
|
Pages |
85-96 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.084; 600.121; 600.129 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GNK2017 |
Serial |
2887 |
|
Permanent link to this record |
|
|
|
|
Author |
Marçal Rusiñol; Josep Llados |
|
|
Title |
A Performance Evaluation Protocol for Symbol Spotting Systems in Terms of Recognition and Location Indices |
Type |
Journal Article |
|
Year |
2009 |
Publication |
International Journal on Document Analysis and Recognition |
Abbreviated Journal |
IJDAR |
|
|
Volume |
12 |
Issue |
2 |
Pages |
83-96 |
|
|
Keywords |
Performance evaluation; Symbol Spotting; Graphics Recognition |
|
|
Abstract |
Symbol spotting systems are intended to retrieve regions of interest from a document image database where the queried symbol is likely to be found. They shall have the ability to recognize and locate graphical symbols in a single step. In this paper, we present a set of measures to evaluate the performance of a symbol spotting system in terms of recognition abilities, location accuracy and scalability. We show that the proposed measures allow to determine the weaknesses and strengths of different methods. In particular we have tested a symbol spotting method based on a set of four different off-the-shelf shape descriptors. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1433-2833 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ RuL2009a |
Serial |
1166 |
|
Permanent link to this record |
|
|
|
|
Author |
Marçal Rusiñol |
|
|
Title |
Classificació semàntica i visual de documents digitals |
Type |
Journal |
|
Year |
2019 |
Publication |
Revista de biblioteconomia i documentacio |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
75-86 |
|
|
Keywords |
|
|
|
Abstract |
Se analizan los sistemas de procesamiento automático que trabajan sobre documentos digitalizados con el objetivo de describir los contenidos. De esta forma contribuyen a facilitar el acceso, permitir la indización automática y hacer accesibles los documentos a los motores de búsqueda. El objetivo de estas tecnologías es poder entrenar modelos computacionales que sean capaces de clasificar, agrupar o realizar búsquedas sobre documentos digitales. Así, se describen las tareas de clasificación, agrupamiento y búsqueda. Cuando utilizamos tecnologías de inteligencia artificial en los sistemas de
clasificación esperamos que la herramienta nos devuelva etiquetas semánticas; en sistemas de agrupamiento que nos devuelva documentos agrupados en clusters significativos; y en sistemas de búsqueda esperamos que dada una consulta, nos devuelva una lista ordenada de documentos en función de la relevancia. A continuación se da una visión de conjunto de los métodos que nos permiten describir los documentos digitales, tanto de manera visual (cuál es su apariencia), como a partir de sus contenidos semánticos (de qué hablan). En cuanto a la descripción visual de documentos se aborda el estado de la cuestión de las representaciones numéricas de documentos digitalizados
tanto por métodos clásicos como por métodos basados en el aprendizaje profundo (deep learning). Respecto de la descripción semántica de los contenidos se analizan técnicas como el reconocimiento óptico de caracteres (OCR); el cálculo de estadísticas básicas sobre la aparición de las diferentes palabras en un texto (bag-of-words model); y los métodos basados en aprendizaje profundo como el método word2vec, basado en una red neuronal que, dadas unas cuantas palabras de un texto, debe predecir cuál será la
siguiente palabra. Desde el campo de las ingenierías se están transfiriendo conocimientos que se han integrado en productos o servicios en los ámbitos de la archivística, la biblioteconomía, la documentación y las plataformas de gran consumo, sin embargo los algoritmos deben ser lo suficientemente eficientes no sólo para el reconocimiento y transcripción literal sino también para la capacidad de interpretación de los contenidos. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.084; 600.135; 600.121; 600.129 |
Approved |
no |
|
|
Call Number |
Admin @ si @ Rus2019 |
Serial |
3282 |
|
Permanent link to this record |