|
Records |
Links |
|
Author |
L. Rothacker; Marçal Rusiñol; Josep Llados; G.A. Fink |

|
|
Title |
A Two-stage Approach to Segmentation-Free Query-by-example Word Spotting |
Type |
Journal |
|
Year |
2014 |
Publication |
Manuscript Cultures |
Abbreviated Journal  |
|
|
|
Volume |
7 |
Issue |
|
Pages |
47-58 |
|
|
Keywords |
|
|
|
Abstract |
With the ongoing progress in digitization, huge document collections and archives have become available to a broad audience. Scanned document images can be transmitted electronically and studied simultaneously throughout the world. While this is very beneficial, it is often impossible to perform automated searches on these document collections. Optical character recognition usually fails when it comes to handwritten or historic documents. In order to address the need for exploring document collections rapidly, researchers are working on word spotting. In query-by-example word spotting scenarios, the user selects an exemplary occurrence of the query word in a document image. The word spotting system then retrieves all regions in the collection that are visually similar to the given example of the query word. The best matching regions are presented to the user and no actual transcription is required.
An important property of a word spotting system is the computational speed with which queries can be executed. In our previous work, we presented a relatively slow but high-precision method. In the present work, we will extend this baseline system to an integrated two-stage approach. In a coarse-grained first stage, we will filter document images efficiently in order to identify regions that are likely to contain the query word. In the fine-grained second stage, these regions will be analyzed with our previously presented high-precision method. Finally, we will report recognition results and query times for the well-known George Washington
benchmark in our evaluation. We achieve state-of-the-art recognition results while the query times can be reduced to 50% in comparison with our baseline. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.061; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ |
Serial |
3190 |
|
Permanent link to this record |
|
|
|
|
Author |
Marçal Rusiñol; Lluis Gomez |

|
|
Title |
Avances en clasificación de imágenes en los últimos diez años. Perspectivas y limitaciones en el ámbito de archivos fotográficos históricos |
Type |
Journal |
|
Year |
2018 |
Publication |
Revista anual de la Asociación de Archiveros de Castilla y León |
Abbreviated Journal  |
|
|
|
Volume |
21 |
Issue |
|
Pages |
161-174 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.121; 600.129 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RuG2018 |
Serial |
3239 |
|
Permanent link to this record |
|
|
|
|
Author |
Marçal Rusiñol |

|
|
Title |
Classificació semàntica i visual de documents digitals |
Type |
Journal |
|
Year |
2019 |
Publication |
Revista de biblioteconomia i documentacio |
Abbreviated Journal  |
|
|
|
Volume |
|
Issue |
|
Pages |
75-86 |
|
|
Keywords |
|
|
|
Abstract |
Se analizan los sistemas de procesamiento automático que trabajan sobre documentos digitalizados con el objetivo de describir los contenidos. De esta forma contribuyen a facilitar el acceso, permitir la indización automática y hacer accesibles los documentos a los motores de búsqueda. El objetivo de estas tecnologías es poder entrenar modelos computacionales que sean capaces de clasificar, agrupar o realizar búsquedas sobre documentos digitales. Así, se describen las tareas de clasificación, agrupamiento y búsqueda. Cuando utilizamos tecnologías de inteligencia artificial en los sistemas de
clasificación esperamos que la herramienta nos devuelva etiquetas semánticas; en sistemas de agrupamiento que nos devuelva documentos agrupados en clusters significativos; y en sistemas de búsqueda esperamos que dada una consulta, nos devuelva una lista ordenada de documentos en función de la relevancia. A continuación se da una visión de conjunto de los métodos que nos permiten describir los documentos digitales, tanto de manera visual (cuál es su apariencia), como a partir de sus contenidos semánticos (de qué hablan). En cuanto a la descripción visual de documentos se aborda el estado de la cuestión de las representaciones numéricas de documentos digitalizados
tanto por métodos clásicos como por métodos basados en el aprendizaje profundo (deep learning). Respecto de la descripción semántica de los contenidos se analizan técnicas como el reconocimiento óptico de caracteres (OCR); el cálculo de estadísticas básicas sobre la aparición de las diferentes palabras en un texto (bag-of-words model); y los métodos basados en aprendizaje profundo como el método word2vec, basado en una red neuronal que, dadas unas cuantas palabras de un texto, debe predecir cuál será la
siguiente palabra. Desde el campo de las ingenierías se están transfiriendo conocimientos que se han integrado en productos o servicios en los ámbitos de la archivística, la biblioteconomía, la documentación y las plataformas de gran consumo, sin embargo los algoritmos deben ser lo suficientemente eficientes no sólo para el reconocimiento y transcripción literal sino también para la capacidad de interpretación de los contenidos. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.084; 600.135; 600.121; 600.129 |
Approved |
no |
|
|
Call Number |
Admin @ si @ Rus2019 |
Serial |
3282 |
|
Permanent link to this record |
|
|
|
|
Author |
Oriol Ramos Terrades; Albert Berenguel; Debora Gil |


|
|
Title |
A Flexible Outlier Detector Based on a Topology Given by Graph Communities |
Type |
Journal Article |
|
Year |
2022 |
Publication |
Big Data Research |
Abbreviated Journal  |
BDR |
|
|
Volume |
29 |
Issue |
|
Pages |
100332 |
|
|
Keywords |
Classification algorithms; Detection algorithms; Description of feature space local structure; Graph communities; Machine learning algorithms; Outlier detectors |
|
|
Abstract |
Outlier detection is essential for optimal performance of machine learning methods and statistical predictive models. Their detection is especially determinant in small sample size unbalanced problems, since in such settings outliers become highly influential and significantly bias models. This particular experimental settings are usual in medical applications, like diagnosis of rare pathologies, outcome of experimental personalized treatments or pandemic emergencies. In contrast to population-based methods, neighborhood based local approaches compute an outlier score from the neighbors of each sample, are simple flexible methods that have the potential to perform well in small sample size unbalanced problems. A main concern of local approaches is the impact that the computation of each sample neighborhood has on the method performance. Most approaches use a distance in the feature space to define a single neighborhood that requires careful selection of several parameters, like the number of neighbors.
This work presents a local approach based on a local measure of the heterogeneity of sample labels in the feature space considered as a topological manifold. Topology is computed using the communities of a weighted graph codifying mutual nearest neighbors in the feature space. This way, we provide with a set of multiple neighborhoods able to describe the structure of complex spaces without parameter fine tuning. The extensive experiments on real-world and synthetic data sets show that our approach outperforms, both, local and global strategies in multi and single view settings. |
|
|
Address |
August 28, 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; IAM; 600.140; 600.121; 600.139; 600.145; 600.159 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RBG2022a |
Serial |
3718 |
|
Permanent link to this record |
|
|
|
|
Author |
Antonio Clavelli; Dimosthenis Karatzas; Josep Llados; Mario Ferraro; Giuseppe Boccignone |


|
|
Title |
Modelling task-dependent eye guidance to objects in pictures |
Type |
Journal Article |
|
Year |
2014 |
Publication |
Cognitive Computation |
Abbreviated Journal  |
CoCom |
|
|
Volume |
6 |
Issue |
3 |
Pages |
558-584 |
|
|
Keywords |
Visual attention; Gaze guidance; Value; Payoff; Stochastic fixation prediction |
|
|
Abstract |
5Y Impact Factor: 1.14 / 3rd (Computer Science, Artificial Intelligence)
We introduce a model of attentional eye guidance based on the rationale that the deployment of gaze is to be considered in the context of a general action-perception loop relying on two strictly intertwined processes: sensory processing, depending on current gaze position, identifies sources of information that are most valuable under the given task; motor processing links such information with the oculomotor act by sampling the next gaze position and thus performing the gaze shift. In such a framework, the choice of where to look next is task-dependent and oriented to classes of objects embedded within pictures of complex scenes. The dependence on task is taken into account by exploiting the value and the payoff of gazing at certain image patches or proto-objects that provide a sparse representation of the scene objects. The different levels of the action-perception loop are represented in probabilistic form and eventually give rise to a stochastic process that generates the gaze sequence. This way the model also accounts for statistical properties of gaze shifts such as individual scan path variability. Results of the simulations are compared either with experimental data derived from publicly available datasets and from our own experiments. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer US |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1866-9956 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.056; 600.045; 605.203; 601.212; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ CKL2014 |
Serial |
2419 |
|
Permanent link to this record |