|
Christophe Rigaud, Dimosthenis Karatzas, Jean-Christophe Burie and Jean-Marc Ogier. 2013. Speech balloon contour classification in comics. 10th IAPR International Workshop on Graphics Recognition.
Abstract: Comic book digitization combined with subsequent comic book understanding creates a variety of new applications, including mobile reading and data mining. Document understanding in this domain is challenging as comics are semi-structured documents, combining semantically important graphical and textual parts. In this work we detail a novel approach for classifying speech balloons in scanned comic book pages based on their contour time series.
|
|
|
Lluis Pere de las Heras, David Fernandez, Alicia Fornes, Ernest Valveny, Gemma Sanchez and Josep Llados. 2013. Runlength Histogram Image Signature for Perceptual Retrieval of Architectural Floor Plans. 10th IAPR International Workshop on Graphics Recognition.
|
|
|
Lluis Pere de las Heras, Ernest Valveny and Gemma Sanchez. 2013. Unsupervised and Notation-Independent Wall Segmentation in Floor Plans Using a Combination of Statistical and Structural Strategies. 10th IAPR International Workshop on Graphics Recognition.
|
|
|
Pau Riba, Alicia Fornes and Josep Llados. 2015. Towards the Alignment of Handwritten Music Scores. In Bart Lamiroy and Rafael Dueire Lins, eds. 11th IAPR International Workshop on Graphics Recognition. Springer International Publishing. (LNCS.)
Abstract: It is very common to find different versions of the same music work in archives of Opera Theaters. These differences correspond to modifications and annotations from the musicians. From the musicologist's point of view, these variations are very interesting and deserve study. This paper explores the alignment of music scores as a tool for automatically detecting the passages that contain such differences. Given the difficulties in the recognition of handwritten music scores, our goal is to align the music scores and, at the same time, avoid the recognition of music elements as much as possible. After removing the staff lines, braces and ties, the bar lines are detected. Then, the bar units are described as a whole using the Blurred Shape Model. The bar unit alignment is performed using Dynamic Time Warping. The analysis of the alignment path is used to detect the variations in the music scores. The method has been evaluated on a subset of the CVC-MUSCIMA dataset, showing encouraging results.
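Note: the alignment step named in this abstract can be illustrated with a minimal Dynamic Time Warping sketch. This is not the authors' implementation. The function name `dtw_align`, the `dist` callback, and the use of plain numbers in place of Blurred Shape Model descriptors are assumptions made only for illustration.

```python
from math import inf

def dtw_align(seq_a, seq_b, dist):
    """Align two sequences with classic Dynamic Time Warping.

    seq_a, seq_b: sequences of bar-unit descriptors (hypothetical; any
    objects accepted by `dist`).
    dist: function returning a non-negative distance between two descriptors.
    Returns (total_cost, path), where path is a list of (i, j) index pairs.
    """
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = minimal cost of aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],  # match
                                 cost[i - 1][j],      # seq_a advances alone
                                 cost[i][j - 1])      # seq_b advances alone

    # Backtrack from the end to recover the alignment path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path

# Toy example: the second "score" repeats one bar.
total, path = dtw_align([1, 2, 3], [1, 2, 2, 3], lambda a, b: abs(a - b))
print(total, path)
```

In such a sketch, steps of the path that leave the diagonal (one index repeating while the other advances) would mark candidate passages where the two versions differ.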
|
|
|
Hana Jarraya, Muhammad Muzzamil Luqman and Jean-Yves Ramel. 2017. Improving Fuzzy Multilevel Graph Embedding Technique by Employing Topological Node Features: An Application to Graphics Recognition. In B. Lamiroy and R. Dueire Lins, eds. Graphics Recognition. Current Trends and Challenges. Springer. (LNCS.)
|
|
|
Hana Jarraya, Oriol Ramos Terrades and Josep Llados. 2017. Learning structural loss parameters on graph embedding applied on symbolic graphs. 12th IAPR International Workshop on Graphics Recognition.
Abstract: We propose an improvement of the Graph Embedding (GEM) method proposed in previous work, which takes advantage of structural pattern representation and structured distortion. It models an Attributed Graph (AG) as a Probabilistic Graphical Model (PGM). Then, it learns the parameters of this PGM, represented by a vector, as a new signature of the AG in a lower-dimensional vector space. We focus on adapting the structured learning algorithm via the 1_slack formulation with a suitable risk function, called Graph Edit Distance (GED). It defines the dissimilarity between the ground truth and the predicted graph labels, and is determined by error-tolerant graph matching using a bipartite graph matching algorithm. We apply Structured Support Vector Machines (SSVM) to the classification task. We report experimental results on the GREC dataset.
|
|
|
Adria Rico and Alicia Fornes. 2017. Camera-based Optical Music Recognition using a Convolutional Neural Network. 12th IAPR International Workshop on Graphics Recognition. 27–28.
Abstract: Optical Music Recognition (OMR) consists in recognizing images of music scores. Contrary to expectation, current OMR systems usually fail when recognizing images of scores captured by digital cameras and smartphones. In this work, we propose a camera-based OMR system based on Convolutional Neural Networks, showing promising preliminary results.
Keywords: optical music recognition; document analysis; convolutional neural network; deep learning
|
|
|
Arnau Baro, Pau Riba, Jorge Calvo-Zaragoza and Alicia Fornes. 2018. Optical Music Recognition by Long Short-Term Memory Networks. In A. Fornes and B. Lamiroy, eds. Graphics Recognition. Current Trends and Evolutions. Springer, 81–95. (LNCS.)
Abstract: Optical Music Recognition refers to the task of transcribing the image of a music score into a machine-readable format. Many music scores are written in a single staff and can therefore be treated as a sequence. Accordingly, this work explores the use of Long Short-Term Memory (LSTM) Recurrent Neural Networks for reading the music score sequentially, where the LSTM helps in keeping the context. For training, we have used a synthetic dataset of more than 40000 images, labeled at the primitive level. The experimental results are promising, showing the benefits of our approach.
Keywords: Optical Music Recognition; Recurrent Neural Network; Long Short-Term Memory
|
|
|
Asma Bensalah, Pau Riba, Alicia Fornes and Josep Llados. 2019. Shoot less and Sketch more: An Efficient Sketch Classification via Joining Graph Neural Networks and Few-shot Learning. 13th IAPR International Workshop on Graphics Recognition. 80–85.
Abstract: With the emergence of touchpad devices and drawing tablets, a new era of sketching has started afresh. However, the recognition of sketches is still a tough task due to the variability of drawing styles. Moreover, in some application scenarios there is little labelled data available for training, which imposes a limitation on deep learning architectures. In addition, in many cases there is a need to generate models able to adapt to new classes. In order to cope with these limitations, we propose a method based on few-shot learning and graph neural networks for classifying sketches, aiming for an efficient neural model. We test our approach on several sketch databases, showing promising results.
Keywords: Sketch classification; Convolutional Neural Network; Graph Neural Network; Few-shot learning
|
|
|
Pau Torras, Mohamed Ali Souibgui, Jialuo Chen and Alicia Fornes. 2021. A Transcription Is All You Need: Learning to Align through Attention. 14th IAPR International Workshop on Graphics Recognition. 141–146. (LNCS.)
Abstract: Historical ciphered manuscripts are a type of document where graphical symbols are used to encrypt their content instead of regular text. Nowadays, expert transcriptions can be found in libraries alongside the corresponding manuscript images. However, those transcriptions are not aligned, so they are barely usable for training deep learning-based recognition methods. To solve this issue, we propose a method to align each symbol in the transcript of an image with its visual representation by using an attention-based Sequence to Sequence (Seq2Seq) model. The core idea is that, by learning to recognise the symbol sequence within a cipher line image, the model also identifies the symbols' positions implicitly through an attention mechanism. Thus, the resulting symbol segmentation can later be used for training algorithms. The experimental evaluation shows that this method is promising, especially taking into account the small size of the cipher dataset.
|
|