Oriol Ramos Terrades, Alejandro Hector Toselli, Nicolas Serrano, Veronica Romero, Enrique Vidal, & Alfons Juan. (2010). Interactive layout analysis and transcription systems for historic handwritten documents. In 10th ACM Symposium on Document Engineering (219–222).
Abstract: The amount of digitized legacy documents has been rising dramatically over the last years due mainly to the increasing number of on-line digital libraries publishing this kind of documents, waiting to be classified and finally transcribed into a textual electronic format (such as ASCII or PDF). Nevertheless, most of the available fully-automatic applications addressing this task are far from being perfect and heavy and inefficient human intervention is often required to check and correct the results of such systems. In contrast, multimodal interactive-predictive approaches may allow the users to participate in the process helping the system to improve the overall performance. With this in mind, two sets of recent advances are introduced in this work: a novel interactive method for text block detection and two multimodal interactive handwritten text transcription systems which use active learning and interactive-predictive technologies in the recognition process.
Keywords: Handwriting recognition; Interactive predictive processing; Partial supervision; Interactive layout analysis
|
Fadi Dornaika, Francisco Javier Orozco, & Jordi Gonzalez. (2006). Combined Head, Lips, Eyebrows, and Eyelids Tracking Using Adaptive Appearance Models. In IV Conference on Articulated Motion and Deformable Objects (AMDO´06), LNCS 4069: 110–119.
|
Ivan Huerta, Dani Rowe, Jordi Gonzalez, & Juan J. Villanueva. (2006). Efficient Incorporation of Motionless Foreground Objects for Adaptive Background Segmentation. In IV Conference on Articulated Motion and Deformable Objects (AMDO´06), LNCS 4069: 424–433.
|
Ignasi Rius, Javier Varona, Xavier Roca, & Jordi Gonzalez. (2006). Posture Constraints for Bayesian Human Motion Tracking. In IV Conference on Articulated Motion and Deformable Objects (AMDO´06), LNCS 4069: 414–423.
|
Albert Clapes, Miguel Reyes, & Sergio Escalera. (2012). User Identification and Object Recognition in Clutter Scenes Based on RGB-Depth Analysis. In 7th Conference on Articulated Motion and Deformable Objects (Vol. 7378, pp. 1–11). LNCS. Springer Berlin Heidelberg.
Abstract: We propose an automatic system for user identification and object recognition based on multi-modal RGB-Depth data analysis. We model a RGBD environment learning a pixel-based background Gaussian distribution. Then, user and object candidate regions are detected and recognized online using robust statistical approaches over RGBD descriptions. Finally, the system saves the historic of user-object assignments, being specially useful for surveillance scenarios. The system has been evaluated on a novel data set containing different indoor/outdoor scenarios, objects, and users, showing accurate recognition and better performance than standard state-of-the-art approaches.
|
Wenjuan Gong, Jordi Gonzalez, Joao Manuel R. S. Taveres, & Xavier Roca. (2012). A New Image Dataset on Human Interactions. In 7th Conference on Articulated Motion and Deformable Objects (Vol. 7378, pp. 204–209). Springer Berlin Heidelberg.
Abstract: This article describes a new collection of still image dataset which are dedicated to interactions between people. Human action recognition from still images have been a hot topic recently, but most of them are actions performed by a single person, like running, walking, riding bikes, phoning and so on and there is no interactions between people in one image. The dataset collected in this paper are concentrating on human interaction between two people aiming to explore this new topic in the research area of action recognition from still images.
|
Sergio Escalera. (2012). Human Behavior Analysis From Depth Maps. In F.J. Perales, R.B. Fisher, & T.B. Moeslund (Eds.), 7th Conference on Articulated Motion and Deformable Objects (Vol. 7378, pp. 282–292). Springer Heidelberg.
Abstract: Pose Recovery (PR) and Human Behavior Analysis (HBA) have been a main focus of interest from the beginnings of Computer Vision and Machine Learning. PR and HBA were originally addressed by the analysis of still images and image sequences. More recent strategies consisted of Motion Capture technology (MOCAP), based on the synchronization of multiple cameras in controlled environments; and the analysis of depth maps from Time-of-Flight (ToF) technology, based on range image recording from distance sensor measurements. Recently, with the appearance of the multi-modal RGBD information provided by the low cost Kinect \textsfTM sensor (from RGB and Depth, respectively), classical methods for PR and HBA have been redefined, and new strategies have been proposed. In this paper, the recent contributions and future trends of multi-modal RGBD data analysis for PR and HBA are reviewed and discussed.
|
Xavier Perez Sala, Cecilio Angulo, & Sergio Escalera. (2011). Biologically Inspired Path Execution Using SURF Flow in Robot Navigation. In 11th International Work Conference on Artificial Neural Networks (Vol. II, pp. 581–588). Springer Berlin Heidelberg.
Abstract: An exportable and robust system using only camera images is proposed for path execution in robot navigation. Motion information is extracted in the form of optical flow from SURF robust descriptors of consecutive frames, so the method is called SURF flow. This information is used to correct robot displacement when a straight forward path command is sent to the robot, but it is not really executed due to several robot and environmental concerns. The proposed system has been successfully tested on the legged robot Aibo.
|
Alicia Fornes, Josep Llados, Joan Mas, Joana Maria Pujadas-Mora, & Anna Cabre. (2014). A Bimodal Crowdsourcing Platform for Demographic Historical Manuscripts. In Digital Access to Textual Cultural Heritage Conference (pp. 103–108).
Abstract: In this paper we present a crowdsourcing web-based application for extracting information from demographic handwritten document images. The proposed application integrates two points of view: the semantic information for demographic research, and the ground-truthing for document analysis research. Concretely, the application has the contents view, where the information is recorded into forms, and the labeling view, with the word labels for evaluating document analysis techniques. The crowdsourcing architecture allows to accelerate the information extraction (many users can work simultaneously), validate the information, and easily provide feedback to the users. We finally show how the proposed application can be extended to other kind of demographic historical manuscripts.
|
Eduardo Aguilar, & Petia Radeva. (2019). Food Recognition by Integrating Local and Flat Classifiers. In 9th Iberian Conference on Pattern Recognition and Image Analysis (Vol. 11867, pp. 65–74). LNCS.
Abstract: The recognition of food image is an interesting research topic, in which its applicability in the creation of nutritional diaries stands out with the aim of improving the quality of life of people with a chronic disease (e.g. diabetes, heart disease) or prone to acquire it (e.g. people with overweight or obese). For a food recognition system to be useful in real applications, it is necessary to recognize a huge number of different foods. We argue that for very large scale classification, a traditional flat classifier is not enough to acquire an acceptable result. To address this, we propose a method that performs prediction with local classifiers, based on a class hierarchy, or with flat classifier. We decide which approach to use, depending on the analysis of both the Epistemic Uncertainty obtained for the image in the children classifiers and the prediction of the parent classifier. When our criterion is met, the final prediction is obtained with the respective local classifier; otherwise, with the flat classifier. From the results, we can see that the proposed method improves the classification performance compared to the use of a single flat classifier.
|
Gemma Rotger, Francesc Moreno-Noguer, Felipe Lumbreras, & Antonio Agudo. (2019). Single view facial hair 3D reconstruction. In 9th Iberian Conference on Pattern Recognition and Image Analysis (Vol. 11867, pp. 423–436). LNCS.
Abstract: n this work, we introduce a novel energy-based framework that addresses the challenging problem of 3D reconstruction of facial hair from a single RGB image. To this end, we identify hair pixels over the image via texture analysis and then determine individual hair fibers that are modeled by means of a parametric hair model based on 3D helixes. We propose to minimize an energy composed of several terms, in order to adapt the hair parameters that better fit the image detections. The final hairs respond to the resulting fibers after a post-processing step where we encourage further realism. The resulting approach generates realistic facial hair fibers from solely an RGB image without assuming any training data nor user interaction. We provide an experimental evaluation on real-world pictures where several facial hair styles and image conditions are observed, showing consistent results and establishing a comparison with respect to competing approaches.
Keywords: 3D Vision; Shape Reconstruction; Facial Hair Modeling
|
M. Gomez, J. Mauri, E. Fernandez-Nofrerias, Oriol Rodriguez-Leor, Carme Julia, Misael Rosales, et al. (2002). Modelo fisico para la simulacion de ultrasonido Intravascular. XXXVIII Congreso Nacional de la Sociedad Española de Cardiologia..
|
M. Gomez, J. Mauri, E. Fernandez-Nofrerias, Oriol Rodriguez-Leor, Carme Julia, Petia Radeva, et al. (2002). Nuevos Avances para la correlacion de imagenes angiograficas y de ecograia intracoronaria..
|
Robert Benavente, & Maria Vanrell. (2007). Parametrizacion del Espacio de Categorias de Color.
|
Jorge Bernal, F. Javier Sanchez, & Fernando Vilariño. (2010). Reduction of Pattern Search Area in Colonoscopy Images by Merging Non-Informative Regions. In 28th Congreso Anual de la Sociedad Española de Ingeniería Biomédica.
Abstract: One of the first usual steps in pattern recognition schemas is image segmentation, in order to reduce the dimensionality of the problem and manage smaller quantity of data. In our case as we are pursuing real-time colon cancer polyp detection, this step is crucial. In this paper we present a non-informative region estimation algorithm that will let us discard some parts of the image where we will not expect to find colon cancer polyps. The performance of our approach will be measured in terms of both non-informative areas elimination and polyps’ areas preserving. The results obtained show the importance of having correct non- informative region estimation in order to fasten the whole recognition process.
|