|
Anders Hast, & Alicia Fornes. (2016). A Segmentation-free Handwritten Word Spotting Approach by Relaxed Feature Matching. In 12th IAPR Workshop on Document Analysis Systems (pp. 150–155).
Abstract: The automatic recognition of historical handwritten documents is still considered a challenging task. For this reason, word spotting emerges as a good alternative for making the information contained in these documents available to the user. Word spotting is defined as the task of retrieving all instances of the query word in a document collection, becoming a useful tool for information retrieval. In this paper we propose a segmentation-free word spotting approach able to deal with large document collections. Our method is inspired by feature matching algorithms that have been applied to image matching and retrieval. Since handwritten words have different shapes, there is no exact transformation to be obtained. However, a sufficient degree of relaxation is achieved by using a Fourier-based descriptor and an alternative approach to RANSAC called PUMA. The proposed approach is evaluated on historical marriage records, achieving promising results.
|
|
|
Aura Hernandez-Sabate, Lluis Albarracin, Daniel Calvo, & Nuria Gorgorio. (2016). EyeMath: Identifying Mathematics Problem Solving Processes in a RTS Video Game. In 5th International Conference Games and Learning Alliance (Vol. 10056, pp. 50–59). LNCS.
Abstract: Photorealistic virtual environments are crucial for developing and testing automated driving systems in a safe way during trials. As commercially available simulators are expensive and bulky, this paper presents a low-cost, extendable, and easy-to-use (LEE) virtual environment with the aim to highlight its utility for level 3 driving automation. In particular, an experiment is performed using the presented simulator to explore the influence of different variables regarding control transfer of the car after the system was driving autonomously in a highway scenario. The results show that the speed of the car at the time when the system needs to transfer the control to the human driver is critical.
Keywords: Simulation environment; Automated Driving; Driver-Vehicle interaction
|
|
|
Alejandro Gonzalez Alzate, Sebastian Ramos, David Vazquez, Antonio Lopez, & Jaume Amores. (2015). Spatiotemporal Stacked Sequential Learning for Pedestrian Detection. In Pattern Recognition and Image Analysis, Proceedings of 7th Iberian Conference, ibPRIA 2015 (pp. 3–12).
Abstract: Pedestrian classifiers decide which image windows contain a pedestrian. In practice, such classifiers provide a relatively high response at neighboring windows overlapping a pedestrian, while the responses around potential false positives are expected to be lower. Analogous reasoning applies to image sequences. If there is a pedestrian located within a frame, the same pedestrian is expected to appear close to the same location in neighboring frames. Therefore, such a location is likely to receive high classification scores during several frames, while false positives are expected to be more spurious. In this paper we propose to exploit such correlations for improving the accuracy of base pedestrian classifiers. In particular, we propose to use two-stage classifiers which rely not only on the image descriptors required by the base classifiers but also on the response of such base classifiers in a given spatiotemporal neighborhood. More specifically, we train pedestrian classifiers using a stacked sequential learning (SSL) paradigm. We use a new pedestrian dataset we have acquired from a car to evaluate our proposal at different frame rates. We also test on a well-known dataset: Caltech. The obtained results show that our SSL proposal boosts detection accuracy significantly with a minimal impact on the computational cost. Interestingly, SSL improves accuracy the most in the most dangerous situations, i.e. when a pedestrian is close to the camera.
Keywords: SSL; Pedestrian Detection
|
|
|
Lluis Gomez, Anguelos Nicolaou, Marçal Rusiñol, & Dimosthenis Karatzas. (2020). 12 years of ICDAR Robust Reading Competitions: The evolution of reading systems for unconstrained text understanding. In K. Alahari, & C.V. Jawahar (Eds.), Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis. Series on Advances in Computer Vision and Pattern Recognition. Springer.
|
|
|
David Geronimo, Frederic Lerasle, & Antonio Lopez. (2012). State-driven particle filter for multi-person tracking. In J. Blanc-Talon et al. (Eds.), 11th International Conference on Advanced Concepts for Intelligent Vision Systems (Vol. 7517, pp. 467–478). Heidelberg: Springer.
Abstract: Multi-person tracking can be exploited in applications such as driver assistance, surveillance, multimedia and human-robot interaction. With the help of human detectors, particle filters offer a robust method able to filter noisy detections and provide temporal coherence. However, some traditional problems such as occlusions with other targets or the scene, temporal drifting or even the detection of lost targets are rarely considered, which degrades system performance. Some authors propose to overcome these problems using heuristics that are neither explained nor formalized in the papers, for instance by defining exceptions to the model updating depending on track overlap. In this paper we propose to formalize these events by the use of a state-graph, defining the current state of the track (e.g., potential, tracked, occluded or lost) and the transitions between states in an explicit way. This approach has the advantage of linking track actions such as the online updating of the underlying models, which gives flexibility to the system. It provides an explicit representation to adapt the multiple parallel trackers depending on the context, i.e., each track can make use of a specific filtering strategy, dynamic model, number of particles, etc. depending on its state. We implement this technique in a single-camera multi-person tracker and test it in public video sequences.
Keywords: human tracking
|
|
|
David Geronimo, & Antonio Lopez. (2014). Vision-based Pedestrian Protection Systems for Intelligent Vehicles. Springer Briefs in Computer Vision.
Abstract: Pedestrian Protection Systems (PPSs) are on-board systems aimed at detecting and tracking people in the surroundings of a vehicle in order to avoid potentially dangerous situations. These systems, together with other Advanced Driver Assistance Systems (ADAS) such as lane departure warning or adaptive cruise control, are one of the most promising ways to improve traffic safety. By the use of computer vision, cameras working either in the visible or infra-red spectra have been demonstrated to be reliable sensors for this task. Nevertheless, the variability of human appearance, not only in terms of clothing and sizes but also as a result of their dynamic shape, makes pedestrians one of the most complex classes even for computer vision. Moreover, the unstructured, changing, and unpredictable environment in which such on-board systems must work makes detection a difficult task to be carried out with the demanded robustness. In this brief, the state of the art in PPSs is introduced through the review of the most relevant papers of the last decade. A common computational architecture is presented as a framework to organize each method according to its main contribution. More than 300 papers are referenced, most of them addressing pedestrian detection and others corresponding to the descriptors (features), pedestrian models, and learning machines used. In addition, an overview of topics such as real-time aspects, systems benchmarking and future challenges of this research area is presented.
Keywords: Computer Vision; Driver Assistance Systems; Intelligent Vehicles; Pedestrian Detection; Vulnerable Road Users
|
|
|
Isabel Guitart, Jordi Conesa, Luis Villarejo, Agata Lapedriza, David Masip, Antoni Perez, et al. (2013). Opinion Mining on Educational Resources at the Open University of Catalonia. In 3rd International Workshop on Adaptive Learning via Interactive, Collaborative and Emotional approaches. In conjunction with CISIS 2013: The 7th International Conference on Complex, Intelligent, and Software Intensive Systems (pp. 385–390).
Abstract: In order to make improvements to teaching, it is vital to know what students think of the way they are taught. With that purpose in mind, exhaustively analyzing the forums associated with the subjects taught at the Universitat Oberta de Catalunya (UOC) would be extremely helpful, as the university's students often post comments on their learning experiences in them. Exploiting the content of such forums is not a simple undertaking. The volume of data involved is very large, and performing the task manually would require a great deal of effort from lecturers. As a first step to solve this problem, we propose a tool to automatically analyze the posts in forums of communities of UOC students and teachers, with a view to systematically mining the opinions they contain. This article defines the architecture of such a tool and explains how lexical-semantic and language technology resources can be used to that end. For pilot testing purposes, the tool has been used to identify students' opinions on the UOC's Business Intelligence master's degree course during the last two years. The paper discusses the results of this test. The contribution of this paper is twofold. Firstly, it demonstrates the feasibility of using natural language parsing techniques to help teachers make decisions. Secondly, it introduces a simple tool that can be refined and adapted to a virtual environment for the purpose in question.
|
|
|
Raul Gomez, Ali Furkan Biten, Lluis Gomez, Jaume Gibert, Marçal Rusiñol, & Dimosthenis Karatzas. (2019). Selective Style Transfer for Text. In 15th International Conference on Document Analysis and Recognition (pp. 805–812).
Abstract: This paper explores the possibilities of image style transfer applied to text while maintaining the original transcriptions. Results on different text domains (scene text, machine printed text and handwritten text) and cross-modal results demonstrate that this is feasible, and open different research lines. Furthermore, two architectures for selective style transfer, which means transferring style to only desired image pixels, are proposed. Finally, scene text selective style transfer is evaluated as a data augmentation technique to expand scene text detection datasets, resulting in a boost in text detector performance. Our implementation of the described models is publicly available.
Keywords: transfer; text style transfer; data augmentation; scene text detection
|
|
|
Graham D. Finlayson, Javier Vazquez, & Fufu Fang. (2021). The Discrete Cosine Maximum Ignorance Assumption. In 29th Color and Imaging Conference (pp. 13–18).
Abstract: The performance of colour correction algorithms is dependent on the reflectance sets used. Sometimes, when the testing reflectance set is changed, the ranking of colour correction algorithms also changes. To remove dependence on the dataset we can make assumptions about the set of all possible reflectances. In the Maximum Ignorance with Positivity (MIP) assumption we assume that all reflectances with per-wavelength values between 0 and 1 are equally likely. A weakness of the MIP is that it fails to take into account the correlation of reflectance functions between wavelengths (many of the assumed reflectances are, in reality, not possible). In this paper, we take the view that the maximum ignorance assumption has merit but that, hitherto, it has been calculated with respect to the wrong coordinate basis. Here, we propose the Discrete Cosine Maximum Ignorance assumption (DCMI), where all reflectances that have coordinates between max and min bounds in the Discrete Cosine Basis coordinate system are equally likely. Here, the correlation between wavelengths is encoded, and this results in the set of all plausible reflectances 'looking like' typical reflectances that occur in nature. That said, the DCMI model is also a superset of all measured reflectance sets. Experiments show that, in colour correction, adopting the DCMI results in similar colour correction performance as using a particular reflectance set.
|
|
|
Alicia Fornes, Josep Llados, Gemma Sanchez, Xavier Otazu, & Horst Bunke. (2010). A Combination of Features for Symbol-Independent Writer Identification in Old Music Scores. IJDAR - International Journal on Document Analysis and Recognition, 13(4), 243–259.
Abstract: The aim of writer identification is to determine the writer of a piece of handwriting from a set of writers. In this paper, we present an architecture for writer identification in old handwritten music scores. Even though a significant number of music compositions contain handwritten text, the aim of our work is to use only music notation to determine the author. The main contribution is therefore the use of features extracted from graphical alphabets. Our proposal consists of combining the identification results of two different approaches, based on line and textural features. The steps of the ensemble architecture are the following. First, the music sheet is preprocessed to remove the staff lines. Then, music lines and texture images are generated for computing line features and textural features. Finally, the classification results are combined to identify the writer. The proposed method has been tested on a database of old music scores from the seventeenth to nineteenth centuries, achieving a recognition rate of about 92% with 20 writers.
|
|
|
Fernando Vilariño, Stephan Ameling, Gerard Lacey, Stephen Patchett, & Hugh Mulcahy. (2009). Eye Tracking Search Patterns in Expert and Trainee Colonoscopists: A Novel Method of Assessing Endoscopic Competency? GI - Gastrointestinal Endoscopy, 69(5), 370.
|
|
|
Rozenn Dhayot, Fernando Vilariño, & Gerard Lacey. (2008). Improving the Quality of Color Colonoscopy Videos. EURASIP JIVP - EURASIP Journal on Image and Video Processing, 139429(1), 1–9.
|
|
|
Mirko Arnold, Anarta Ghosh, Stephan Ameling, & Gerard Lacey. (2010). Automatic segmentation and inpainting of specular highlights for endoscopic imaging. EURASIP JIVP - EURASIP Journal on Image and Video Processing, 2010(9).
|
|
|
Mirko Arnold, Anarta Ghosh, Gerard Lacey, Stephen Patchett, & Hugh Mulcahy. (2009). Indistinct frame detection in colonoscopy videos. In Machine Vision and Image Processing Conference (pp. 47–52).
|
|
|
Mirko Arnold, Stephan Ameling, Anarta Ghosh, & Gerard Lacey. (2011). Quality Improvement of Endoscopy Videos. In Proceedings of the 8th IASTED International Conference on Biomedical Engineering (Vol. 723).
|
|