|
Marçal Rusiñol, David Aldavert, Ricardo Toledo and Josep Llados. 2015. Towards Query-by-Speech Handwritten Keyword Spotting. 13th International Conference on Document Analysis and Recognition ICDAR2015.501–505.
Abstract: In this paper, we present a new querying paradigm for handwritten keyword spotting. We propose to represent handwritten word images both by visual and audio representations, enabling a query-by-speech keyword spotting system. The two representations are merged together and projected to a common sub-space in the training phase. This transform allows to, given a spoken query, retrieve word instances that were only represented by the visual modality. In addition, the same method can be used backwards at no additional cost to produce a handwritten text-tospeech system. We present our first results on this new querying mechanism using synthetic voices over the George Washington
dataset.
|
|
|
Rahat Khan, Joost Van de Weijer, Dimosthenis Karatzas and Damien Muselet. 2013. Towards multispectral data acquisition with hand-held devices. 20th IEEE International Conference on Image Processing.2053–2057.
Abstract: We propose a method to acquire multispectral data with handheld devices with front-mounted RGB cameras. We propose to use the display of the device as an illuminant while the camera captures images illuminated by the red, green and
blue primaries of the display. Three illuminants and three response functions of the camera lead to nine response values which are used for reflectance estimation. Results are promising and show that the accuracy of the spectral reconstruction improves in the range from 30-40% over the spectral
reconstruction based on a single illuminant. Furthermore, we propose to compute sensor-illuminant aware linear basis by discarding the part of the reflectances that falls in the sensorilluminant null-space. We show experimentally that optimizing reflectance estimation on these new basis functions decreases
the RMSE significantly over basis functions that are independent to sensor-illuminant. We conclude that, multispectral data acquisition is potentially possible with consumer hand-held devices such as tablets, mobiles, and laptops, opening up applications which are currently considered to be unrealistic.
Keywords: Multispectral; mobile devices; color measurements
|
|
|
Antonio Clavelli, Dimosthenis Karatzas, Josep Llados, Mario Ferraro and Giuseppe Boccignone. 2013. Towards Modelling an Attention-Based Text Localization Process. 6th Iberian Conference on Pattern Recognition and Image Analysis. Springer Berlin Heidelberg, 296–303. (LNCS.)
Abstract: This note introduces a visual attention model of text localization in real-world scenes. The core of the model built upon the proto-object concept is discussed. It is shown how such dynamic mid-level representation of the scene can be derived in the framework of an action-perception loop engaging salience, text information value computation, and eye guidance mechanisms.
Preliminary results that compare model generated scanpaths with those eye-tracked from human subjects are presented.
Keywords: text localization; visual attention; eye guidance
|
|
|
Arnau Baro, Jialuo Chen, Alicia Fornes and Beata Megyesi. 2019. Towards a generic unsupervised method for transcription of encoded manuscripts. 3rd International Conference on Digital Access to Textual Cultural Heritage.73–78.
Abstract: Historical ciphers, a special type of manuscripts, contain encrypted information, important for the interpretation of our history. The first step towards decipherment is to transcribe the images, either manually or by automatic image processing techniques. Despite the improvements in handwritten text recognition (HTR) thanks to deep learning methodologies, the need of labelled data to train is an important limitation. Given that ciphers often use symbol sets across various alphabets and unique symbols without any transcription scheme available, these supervised HTR techniques are not suitable to transcribe ciphers. In this paper we propose an un-supervised method for transcribing encrypted manuscripts based on clustering and label propagation, which has been successfully applied to community detection in networks. We analyze the performance on ciphers with various symbol sets, and discuss the advantages and drawbacks compared to supervised HTR methods.
Keywords: A. Baró, J. Chen, A. Fornés, B. Megyesi.
|
|
|
Partha Pratim Roy, Umapada Pal and Josep Llados. 2009. Touching Text Character Localization in Graphical Documents using SIFT. In proceedings 8th IAPR International Workshop on Graphics Recognition.
Abstract: Interpretation of graphical document images is a challenging task as it requires proper understanding of text/graphics symbols present in such documents. Difficulties arise in graphical document recognition when text and symbol overlapped/touched. Intersection of text and symbols with graphical lines and curves occur frequently in graphical documents and hence separation of such symbols is very difficult.
Several pattern recognition and classification techniques exist to recognize isolated text/symbol. But, the touching/overlapping text and symbol recognition has not yet been dealt successfully. An interesting technique, Scale Invariant Feature Transform (SIFT), originally devised for object recognition can take care of overlapping problems. Even if SIFT features have emerged as a very powerful object descriptors, their employment in graphical documents context has not been investigated much. In this paper we present the adaptation of the SIFT approach in the context of text character localization (spotting) in graphical documents. We evaluate the applicability of this technique in such documents and discuss the scope of improvement by combining some state-of-the-art approaches.
|
|
|
Partha Pratim Roy, Umapada Pal and Josep Llados. 2010. Touching Text Character Localization in Graphical Documents using SIFT. Graphics Recognition. Achievements, Challenges, and Evolution. 8th International Workshop, GREC 2009. Selected Papers. Springer Berlin Heidelberg, 199–211. (LNCS.)
Abstract: Interpretation of graphical document images is a challenging task as it requires proper understanding of text/graphics symbols present in such documents. Difficulties arise in graphical document recognition when text and symbol overlapped/touched. Intersection of text and symbols with graphical lines and curves occur frequently in graphical documents and hence separation of such symbols is very difficult.
Several pattern recognition and classification techniques exist to recognize isolated text/symbol. But, the touching/overlapping text and symbol recognition has not yet been dealt successfully. An interesting technique, Scale Invariant Feature Transform (SIFT), originally devised for object recognition can take care of overlapping problems. Even if SIFT features have emerged as a very powerful object descriptors, their employment in graphical documents context has not been investigated much. In this paper we present the adaptation of the SIFT approach in the context of text character localization (spotting) in graphical documents. We evaluate the applicability of this technique in such documents and discuss the scope of improvement by combining some state-of-the-art approaches.
Keywords: Support Vector Machine; Text Component; Graphical Line; Document Image; Scale Invariant Feature Transform
|
|
|
Debora Gil, Oriol Ramos Terrades and Raquel Perez. 2020. Topological Radiomics (TOPiomics): Early Detection of Genetic Abnormalities in Cancer Treatment Evolution. Women in Geometry and Topology.
|
|
|
Debora Gil, Oriol Ramos Terrades and Raquel Perez. 2021. Topological Radiomics (TOPiomics): Early Detection of Genetic Abnormalities in Cancer Treatment Evolution. Extended Abstracts GEOMVAP 2019, Trends in Mathematics 15. Springer Nature, 89–93.
Abstract: Abnormalities in radiomic measures correlate to genomic alterations prone to alter the outcome of personalized anti-cancer treatments. TOPiomics is a new method for the early detection of variations in tumor imaging phenotype from a topological structure in multi-view radiomic spaces.
|
|
|
Alicia Fornes and 11 others. 2022. The RPM3D Project: 3D Kinematics for Remote Patient Monitoring. Intertwining Graphonomics with Human Movements. 20th International Conference of the International Graphonomics Society, IGS 2022.217–226. (LNCS.)
Abstract: This project explores the feasibility of remote patient monitoring based on the analysis of 3D movements captured with smartwatches. We base our analysis on the Kinematic Theory of Rapid Human Movement. We have validated our research in a real case scenario for stroke rehabilitation at the Guttmann Institute (https://www.guttmann.com/en/) (neurorehabilitation hospital), showing promising results. Our work could have a great impact in remote healthcare applications, improving the medical efficiency and reducing the healthcare costs. Future steps include more clinical validation, developing multi-modal analysis architectures (analysing data from sensors, images, audio, etc.), and exploring the application of our technology to monitor other neurodegenerative diseases.
Keywords: Healthcare applications; Kinematic; Theory of Rapid Human Movements; Human activity recognition; Stroke rehabilitation; 3D kinematics
|
|
|
Marçal Rusiñol and Josep Llados. 2012. The Role of the Users in Handwritten Word Spotting Applications: Query Fusion and Relevance Feedback. 13th International Conference on Frontiers in Handwriting Recognition.55–60.
Abstract: In this paper we present the importance of including the user in the loop in a handwritten word spotting framework. Several off-the-shelf query fusion and relevance feedback strategies have been tested in the handwritten word spotting context. The increase in terms of precision when the user is included in the loop is assessed using two datasets of historical handwritten documents and a baseline word spotting approach based on a bag-of-visual-words model.
|
|