|
Alicia Fornes, Beata Megyesi, & Joan Mas. (2017). Transcription of Encoded Manuscripts with Image Processing Techniques. In Digital Humanities Conference (pp. 441–443).
|
|
|
Andreas Fischer, Volkmar Frinken, Alicia Fornes, & Horst Bunke. (2011). Transcription Alignment of Latin Manuscripts Using Hidden Markov Models. In Proceedings of the 2011 Workshop on Historical Document Imaging and Processing (pp. 29–36). ACM.
Abstract: Transcriptions of historical documents are a valuable source for extracting labeled handwriting images that can be used for training recognition systems. In this paper, we introduce the Saint Gall database that includes images as well as the transcription of a Latin manuscript from the 9th century written in Carolingian script. Although the available transcription is of high quality for a human reader, the spelling of the words is not accurate when compared with the handwriting image. Hence, the transcription poses several challenges for alignment regarding, e.g., line breaks, abbreviations, and capitalization. We propose an alignment system based on character Hidden Markov Models that can cope with these challenges and efficiently aligns complete document pages. On the Saint Gall database, we demonstrate that a considerable alignment accuracy can be achieved, even with weakly trained character models.
|
|
|
Pau Baiget, Carles Fernandez, Xavier Roca, & Jordi Gonzalez. (2012). Trajectory-Based Abnormality Categorization for Learning Route Patterns in Surveillance. In Detection and Identification of Rare Audiovisual Cues, Studies in Computational Intelligence (Vol. 384, pp. 87–95). Springer Berlin Heidelberg.
Abstract: The recognition of abnormal behaviors in video sequences has emerged as a hot topic in video understanding research. In particular, an important challenge lies in automatically detecting abnormality. However, there is no convention about the types of anomalies that training data should derive. In surveillance, these are typically detected when new observations differ substantially from observed, previously learned behavior models, which represent normality. This paper focuses on properly defining anomalies within trajectory analysis: we propose a hierarchical representation composed of Soft, Intermediate, and Hard Anomaly, which are identified from the extent and nature of deviation from learned models. Towards this end, a novel Gaussian Mixture Model representation of learned route patterns creates a probabilistic map of the image plane, which is applied to detect and classify anomalies in real-time. Our method overcomes limitations of similar existing approaches, and performs correctly even when the tracking is affected by different sources of noise. The reliability of our approach is demonstrated experimentally.
|
|
|
Mikhail Mozerov, Ariel Amato, Xavier Roca, & Jordi Gonzalez. (2008). Trajectory Occlusion Handling with Multiple View Distance Minimisation Clustering. Optical Engineering, vol. 47(04)04702, DOI:10.11781.2909665.
|
|
|
Ekta Vats, Anders Hast, & Alicia Fornes. (2019). Training-Free and Segmentation-Free Word Spotting using Feature Matching and Query Expansion. In 15th International Conference on Document Analysis and Recognition (pp. 1294–1299).
Abstract: Historical handwritten text recognition is an interesting yet challenging problem. In recent times, deep learning based methods have achieved significant performance in handwritten text recognition. However, handwriting recognition using deep learning needs training data, and often, text must be previously segmented into lines (or even words). These limitations constrain the application of HTR techniques in document collections, because training data or segmented words are not always available. Therefore, this paper proposes a training-free and segmentation-free word spotting approach that can be applied in unconstrained scenarios. The proposed word spotting framework is based on document query word expansion and a relaxed feature matching algorithm, which can easily be parallelised. Since handwritten words possess distinct shape and characteristics, this work uses a combination of different keypoint detectors and Fourier-based descriptors to obtain a sufficient degree of relaxed matching. The effectiveness of the proposed method is empirically evaluated on well-known benchmark datasets using standard evaluation measures. The use of informative features along with query expansion contributed significantly to the efficient performance of the proposed method.
Keywords: Word spotting; Segmentation-free; Training-free; Query expansion; Feature matching
|
|
|
Antonio Lopez, Gabriel Villalonga, Laura Sellart, German Ros, David Vazquez, Jiaolong Xu, et al. (2017). Training my car to see using virtual worlds. IMAVIS - Image and Vision Computing, 38, 102–118.
Abstract: Computer vision technologies are at the core of different advanced driver assistance systems (ADAS) and will play a key role in oncoming autonomous vehicles too. One of the main challenges for such technologies is to perceive the driving environment, i.e. to detect and track relevant driving information in a reliable manner (e.g. pedestrians in the vehicle route, free space to drive through). Nowadays it is clear that machine learning techniques are essential for developing such a visual perception for driving. In particular, the standard working pipeline consists of collecting data (i.e. on-board images), manually annotating the data (e.g. drawing bounding boxes around pedestrians), learning a discriminative data representation taking advantage of such annotations (e.g. a deformable part-based model, a deep convolutional neural network), and then assessing the reliability of such representation with the acquired data. In the last two decades most of the research efforts focused on representation learning (first, designing descriptors and learning classifiers; later doing it end-to-end). Hence, collecting data and, especially, annotating it, is essential for learning good representations. While this has been the case from the very beginning, it was only after the disruptive appearance of deep convolutional neural networks that it became a serious issue, due to their data-hungry nature. In this context, the problem is that manual data annotation is tiresome, error-prone work. Accordingly, in the late 00’s we initiated a research line consisting of training visual models using photo-realistic computer graphics, especially focusing on assisted and autonomous driving. In this paper, we summarize this work and show how it has become a new tendency with increasing acceptance.
|
|
|
Jiaolong Xu, Peng Wang, Heng Yang, & Antonio Lopez. (2019). Training a Binary Weight Object Detector by Knowledge Transfer for Autonomous Driving. In IEEE International Conference on Robotics and Automation (pp. 2379–2384).
Abstract: Autonomous driving has harsh requirements of small model size and energy efficiency, in order to enable the embedded system to achieve real-time on-board object detection. Recent deep convolutional neural network based object detectors have achieved state-of-the-art accuracy. However, such models are trained with numerous parameters and their high computational costs and large storage prohibit the deployment to memory and computation resource limited systems. Low-precision neural networks are popular techniques for reducing the computation requirements and memory footprint. Among them, the binary weight neural network (BWN) is the extreme case, which quantizes the floating-point weights into just 1 bit. BWNs are difficult to train and suffer from accuracy degradation due to the extremely low-bit representation. To address this problem, we propose a knowledge transfer (KT) method to aid the training of BWN using a full-precision teacher network. We built DarkNet- and MobileNet-based binary weight YOLO-v2 detectors and conducted experiments on the KITTI benchmark for car, pedestrian and cyclist detection. The experimental results show that the proposed method maintains high detection accuracy while reducing the model size of DarkNet-YOLO from 257 MB to 8.8 MB and MobileNet-YOLO from 193 MB to 7.9 MB.
|
|
|
Sergio Escalera, Xavier Baro, Oriol Pujol, Jordi Vitria, & Petia Radeva. (2011). Traffic-Sign Recognition Systems. Springer London.
|
|
|
Xavier Baro, Sergio Escalera, Jordi Vitria, Oriol Pujol, & Petia Radeva. (2009). Traffic Sign Recognition Using Evolutionary Adaboost Detection and Forest-ECOC Classification. TITS - IEEE Transactions on Intelligent Transportation Systems, 10(1), 113–126.
Abstract: The high variability of sign appearance in uncontrolled environments has made the detection and classification of road signs a challenging problem in computer vision. In this paper, we introduce a novel approach for the detection and classification of traffic signs. Detection is based on a boosted detectors cascade, trained with a novel evolutionary version of Adaboost, which allows the use of large feature spaces. Classification is defined as a multiclass categorization problem. A battery of classifiers is trained to split classes in an Error-Correcting Output Code (ECOC) framework. We propose an ECOC design through a forest of optimal tree structures that are embedded in the ECOC matrix. The novel system offers high performance and better accuracy than the state-of-the-art strategies and is potentially better in terms of noise, affine deformation, partial occlusions, and reduced illumination.
|
|
|
Oriol Pujol, Petia Radeva, & Jordi Vitria. (2005). Traffic sign recognition using an adaptive boosting multiclass framework.
|
|
|
Sergio Escalera, Oriol Pujol, & Petia Radeva. (2010). Traffic sign recognition system with β-correction. MVA - Machine Vision and Applications, 21(2), 99–111.
Abstract: Traffic sign classification represents a classical application of multi-object recognition processing in uncontrolled adverse environments. Lack of visibility, illumination changes, and partial occlusions are just a few problems. In this paper, we introduce a novel system for multi-class classification of traffic signs based on error correcting output codes (ECOC). ECOC is based on an ensemble of binary classifiers that are trained on bi-partitions of classes. We classify a wide set of traffic sign types using robust error correcting codings. Moreover, we introduce the novel β-correction decoding strategy that outperforms the state-of-the-art decoding techniques, classifying a high number of classes with great success.
|
|
|
David Geronimo, Joan Serrat, Antonio Lopez, & Ramon Baldrich. (2013). Traffic sign recognition for computer vision project-based learning. T-EDUC - IEEE Transactions on Education, 56(3), 364–371.
Abstract: This paper presents a graduate course project on computer vision. The aim of the project is to detect and recognize traffic signs in video sequences recorded by an on-board vehicle camera. This is a demanding problem, given that traffic sign recognition is one of the most challenging problems for driving assistance systems. Equally, it is motivating for the students given that it is a real-life problem. Furthermore, it gives them the opportunity to appreciate the difficulty of real-world vision problems and to assess the extent to which this problem can be solved by modern computer vision and pattern classification techniques taught in the classroom. The learning objectives of the course are introduced, as are the constraints imposed on its design, such as the diversity of students' background and the amount of time they and their instructors dedicate to the course. The paper also describes the course contents, schedule, and how the project-based learning approach is applied. The outcomes of the course are discussed, including both the students' marks and their personal feedback.
Keywords: traffic signs
|
|
|
Sergio Escalera, Oriol Pujol, & Petia Radeva. (2007). Traffic Sign Classification using Error Correcting Techniques. In 2nd International Conference on Computer Vision Theory and Applications (pp. 281–285).
|
|
|
Jordi Gonzalez, & Thomas B. Moeslund. (2008). Tracking Humans for the Evaluation of their Motion in Image Sequences.
|
|
|
Ricardo Toledo, X. Orriols, X. Binefa, Petia Radeva, Jordi Vitria, & Juan J. Villanueva. (2000). Tracking Elongated Structures using Statistical Snakes.
|
|