|
Anguelos Nicolaou, Sounak Dey, V.Christlein, A.Maier and Dimosthenis Karatzas. 2018. Non-deterministic Behavior of Ranking-based Metrics when Evaluating Embeddings. International Workshop on Reproducible Research in Pattern Recognition.71–82. (LNCS.)
Abstract: Embedding data into vector spaces is a very popular strategy of pattern recognition methods. When distances between embeddings are quantized, performance metrics become ambiguous. In this paper, we present an analysis of the ambiguity quantized distances introduce and provide bounds on the effect. We demonstrate that it can have a measurable effect in empirical data in state-of-the-art systems. We also approach the phenomenon from a computer security perspective and demonstrate how someone being evaluated by a third party can exploit this ambiguity and greatly outperform a random predictor without even access to the input data. We also suggest a simple solution making the performance metrics, which rely on ranking, totally deterministic and impervious to such exploits.
|
|
|
Dena Bazazian, Dimosthenis Karatzas and Andrew Bagdanov. 2018. Word Spotting in Scene Images based on Character Recognition. IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops.1872–1874.
Abstract: In this paper we address the problem of unconstrained Word Spotting in scene images. We train a Fully Convolutional Network to produce heatmaps of all the character classes. Then, we employ the Text Proposals approach and, via a rectangle classifier, detect the most likely rectangle for each query word based on the character attribute maps. We evaluate the proposed method on ICDAR2015 and show that it is capable of identifying and recognizing query words in natural scene images.
|
|
|
Suman Ghosh. 2018. Word Spotting and Recognition in Images from Heterogeneous Sources A. (Ph.D. thesis, Ediciones Graficas Rey.)
Abstract: Text is the most common way of information sharing from ages. With recent development of personal images databases and handwritten historic manuscripts the demand for algorithms to make these databases accessible for browsing and indexing are in rise. Enabling search or understanding large collection of manuscripts or image databases needs fast and robust methods. Researchers have found different ways to represent cropped words for understanding and matching, which works well when words are already segmented. However there is no trivial way to extend these for non-segmented documents. In this thesis we explore different methods for text retrieval and recognition from unsegmented document and scene images. Two different ways of representation exist in literature, one uses a fixed length representation learned from cropped words and another a sequence of features of variable length. Throughout this thesis, we have studied both these representation for their suitability in segmentation free understanding of text. In the first part we are focused on segmentation free word spotting using a fixed length representation. We extended the use of the successful PHOC (Pyramidal Histogram of Character) representation to segmentation free retrieval. In the second part of the thesis, we explore sequence based features and finally, we propose a unified solution where the same framework can generate both kind of representations.
|
|
|
Ilke Demir, Dena Bazazian, Adriana Romero, Viktoriia Sharmanska and Lyne P. Tchapmi. 2018. WiCV 2018: The Fourth Women In Computer Vision Workshop. 4th Women in Computer Vision Workshop.1941–19412.
Abstract: We present WiCV 2018 – Women in Computer Vision Workshop to increase the visibility and inclusion of women researchers in computer vision field, organized in conjunction with CVPR 2018. Computer vision and machine learning have made incredible progress over the past years, yet the number of female researchers is still low both in academia and industry. WiCV is organized to raise visibility of female researchers, to increase the collaboration,
and to provide mentorship and give opportunities to femaleidentifying junior researchers in the field. In its fourth year, we are proud to present the changes and improvements over the past years, summary of statistics for presenters and attendees, followed by expectations from future generations.
Keywords: Conferences; Computer vision; Industries; Object recognition; Engineering profession; Collaboration; Machine learning
|
|
|
Arnau Baro, Pau Riba and Alicia Fornes. 2018. A Starting Point for Handwritten Music Recognition. 1st International Workshop on Reading Music Systems.5–6.
Abstract: In the last years, the interest in Optical Music Recognition (OMR) has reawakened, especially since the appearance of deep learning. However, there are very few works addressing handwritten scores. In this work we describe a full OMR pipeline for handwritten music scores by using Convolutional and Recurrent Neural Networks that could serve as a baseline for the research community.
Keywords: Optical Music Recognition; Long Short-Term Memory; Convolutional Neural Networks; MUSCIMA++; CVCMUSCIMA
|
|
|
Anjan Dutta and Hichem Sahbi. 2018. Stochastic Graphlet Embedding. TNNLS, 1–14.
Abstract: Graph-based methods are known to be successful in many machine learning and pattern classification tasks. These methods consider semi-structured data as graphs where nodes correspond to primitives (parts, interest points, segments,
etc.) and edges characterize the relationships between these primitives. However, these non-vectorial graph data cannot be straightforwardly plugged into off-the-shelf machine learning algorithms without a preliminary step of – explicit/implicit –graph vectorization and embedding. This embedding process
should be resilient to intra-class graph variations while being highly discriminant. In this paper, we propose a novel high-order stochastic graphlet embedding (SGE) that maps graphs into vector spaces. Our main contribution includes a new stochastic search procedure that efficiently parses a given graph and extracts/samples unlimitedly high-order graphlets. We consider
these graphlets, with increasing orders, to model local primitives as well as their increasingly complex interactions. In order to build our graph representation, we measure the distribution of these graphlets into a given graph, using particular hash functions that efficiently assign sampled graphlets into isomorphic sets with a very low probability of collision. When
combined with maximum margin classifiers, these graphlet-based representations have positive impact on the performance of pattern comparison and recognition as corroborated through extensive experiments using standard benchmark databases.
Keywords: Stochastic graphlets; Graph embedding; Graph classification; Graph hashing; Betweenness centrality
|
|
|
Arnau Baro, Pau Riba, Jorge Calvo-Zaragoza and Alicia Fornes. 2018. Optical Music Recognition by Long Short-Term Memory Networks. In A. Fornes, B.L., ed. Graphics Recognition. Current Trends and Evolutions. Springer, 81–95. (LNCS.)
Abstract: Optical Music Recognition refers to the task of transcribing the image of a music score into a machine-readable format. Many music scores are written in a single staff, and therefore, they could be treated as a sequence. Therefore, this work explores the use of Long Short-Term Memory (LSTM) Recurrent Neural Networks for reading the music score sequentially, where the LSTM helps in keeping the context. For training, we have used a synthetic dataset of more than 40000 images, labeled at primitive level. The experimental results are promising, showing the benefits of our approach.
Keywords: Optical Music Recognition; Recurrent Neural Network; Long ShortTerm Memory
|
|
|
Marçal Rusiñol and Lluis Gomez. 2018. Avances en clasificación de imágenes en los últimos diez años. Perspectivas y limitaciones en el ámbito de archivos fotográficos históricos.
|
|
|
Francisco Cruz and Oriol Ramos Terrades. 2018. A probabilistic framework for handwritten text line segmentation.
Abstract: We successfully combine Expectation-Maximization algorithm and variational
approaches for parameter learning and computing inference on Markov random fields. This is a general method that can be applied to many computer
vision tasks. In this paper, we apply it to handwritten text line segmentation.
We conduct several experiments that demonstrate that our method deal with
common issues of this task, such as complex document layout or non-latin
scripts. The obtained results prove that our method achieve state-of-theart performance on different benchmark datasets without any particular fine
tuning step.
Keywords: Document Analysis; Text Line Segmentation; EM algorithm; Probabilistic Graphical Models; Parameter Learning
|
|
|
Pau Riba, Josep Llados and Alicia Fornes. 2017. Error-tolerant coarse-to-fine matching model for hierarchical graphs. In Pasquale Foggia, Cheng-Lin Liu and Mario Vento, eds. 11th IAPR-TC-15 International Workshop on Graph-Based Representations in Pattern Recognition. Springer International Publishing, 107–117.
Abstract: Graph-based representations are effective tools to capture structural information from visual elements. However, retrieving a query graph from a large database of graphs implies a high computational complexity. Moreover, these representations are very sensitive to noise or small changes. In this work, a novel hierarchical graph representation is designed. Using graph clustering techniques adapted from graph-based social media analysis, we propose to generate a hierarchy able to deal with different levels of abstraction while keeping information about the topology. For the proposed representations, a coarse-to-fine matching method is defined. These approaches are validated using real scenarios such as classification of colour images and handwritten word spotting.
Keywords: Graph matching; Hierarchical graph; Graph-based representation; Coarse-to-fine matching
|
|