Q. Xue, Laura Igual, A. Berenguel, M. Guerrieri, & L. Garrido. (2014). Active Contour Segmentation with Affine Coordinate-Based Parametrization. In 9th International Conference on Computer Vision Theory and Applications (Vol. 1, pp. 5–14).
Abstract: In this paper, we present a new framework for image segmentation based on parametrized active contours. The contour and the points of the image space are parametrized using a set of reduced control points that have to form a closed polygon in two dimensional problems and a closed surface in three dimensional problems. By moving the control points, the active contour evolves. We use mean value coordinates as the parametrization tool for the interface, which allows to parametrize any point of the space, inside or outside the closed polygon
or surface. Region-based energies such as the one proposed by Chan and Vese can be easily implemented in both two and three dimensional segmentation problems. We show the usefulness of our approach with several experiments.
Keywords: Active Contours; Affine Coordinates; Mean Value Coordinates
|
M. Danelljan, Fahad Shahbaz Khan, Michael Felsberg, & Joost Van de Weijer. (2014). Adaptive color attributes for real-time visual tracking. In 27th IEEE Conference on Computer Vision and Pattern Recognition (pp. 1090–1097).
Abstract: Visual tracking is a challenging problem in computer vision. Most state-of-the-art visual trackers either rely on luminance information or use simple color representations for image description. Contrary to visual tracking, for object
recognition and detection, sophisticated color features when combined with luminance have shown to provide excellent performance. Due to the complexity of the tracking problem, the desired color feature should be computationally
efficient, and possess a certain amount of photometric invariance while maintaining high discriminative power.
This paper investigates the contribution of color in a tracking-by-detection framework. Our results suggest that color attributes provides superior performance for visual tracking. We further propose an adaptive low-dimensional
variant of color attributes. Both quantitative and attributebased evaluations are performed on 41 challenging benchmark color sequences. The proposed approach improves the baseline intensity-based tracker by 24% in median distance precision. Furthermore, we show that our approach outperforms
state-of-the-art tracking methods while running at more than 100 frames per second.
|
Dimosthenis Karatzas, Sergi Robles, & Lluis Gomez. (2014). An on-line platform for ground truthing and performance evaluation of text extraction systems. In 11th IAPR International Workshop on Document Analysis and Systems (pp. 242–246).
Abstract: This paper presents a set of on-line software tools for creating ground truth and calculating performance evaluation metrics for text extraction tasks such as localization, segmentation and recognition. The platform supports the definition of comprehensive ground truth information at different text representation levels while it offers centralised management and quality control of the ground truthing effort. It implements a range of state of the art performance evaluation algorithms and offers functionality for the definition of evaluation scenarios, on-line calculation of various performance metrics and visualisation of the results. The
presented platform, which comprises the backbone of the ICDAR 2011 (challenge 1) and 2013 (challenges 1 and 2) Robust Reading competitions, is now made available for public use.
|
Salvatore Tabbone, & Oriol Ramos Terrades. (2014). An Overview of Symbol Recognition. In D. Doermann, & K. Tombre (Eds.), Handbook of Document Image Processing and Recognition (Vol. D, pp. 523–551). Springer London.
Abstract: According to the Cambridge Dictionaries Online, a symbol is a sign, shape, or object that is used to represent something else. Symbol recognition is a subfield of general pattern recognition problems that focuses on identifying, detecting, and recognizing symbols in technical drawings, maps, or miscellaneous documents such as logos and musical scores. This chapter aims at providing the reader an overview of the different existing ways of describing and recognizing symbols and how the field has evolved to attain a certain degree of maturity.
Keywords: Pattern recognition; Shape descriptors; Structural descriptors; Symbolrecognition; Symbol spotting
|
Alicia Fornes, & Gemma Sanchez. (2014). Analysis and Recognition of Music Scores. In D. Doermann, & K. Tombre (Eds.), Handbook of Document Image Processing and Recognition (Vol. E, pp. 749–774). Springer London.
Abstract: The analysis and recognition of music scores has attracted the interest of researchers for decades. Optical Music Recognition (OMR) is a classical research field of Document Image Analysis and Recognition (DIAR), whose aim is to extract information from music scores. Music scores contain both graphical and textual information, and for this reason, techniques are closely related to graphics recognition and text recognition. Since music scores use a particular diagrammatic notation that follow the rules of music theory, many approaches make use of context information to guide the recognition and solve ambiguities. This chapter overviews the main Optical Music Recognition (OMR) approaches. Firstly, the different methods are grouped according to the OMR stages, namely, staff removal, music symbol recognition, and syntactical analysis. Secondly, specific approaches for old and handwritten music scores are reviewed. Finally, online approaches and commercial systems are also commented.
|
Sergio Vera, Debora Gil, & Miguel Angel Gonzalez Ballester. (2014). Anatomical parameterization for volumetric meshing of the liver. In SPIE – Medical Imaging (Vol. 9036).
Abstract: A coordinate system describing the interior of organs is a powerful tool for a systematic localization of injured tissue. If the same coordinate values are assigned to specific anatomical landmarks, the coordinate system allows integration of data across different medical image modalities. Harmonic mappings have been used to produce parametric coordinate systems over the surface of anatomical shapes, given their flexibility to set values
at specific locations through boundary conditions. However, most of the existing implementations in medical imaging restrict to either anatomical surfaces, or the depth coordinate with boundary conditions is given at sites
of limited geometric diversity. In this paper we present a method for anatomical volumetric parameterization that extends current harmonic parameterizations to the interior anatomy using information provided by the
volume medial surface. We have applied the methodology to define a common reference system for the liver shape and functional anatomy. This reference system sets a solid base for creating anatomical models of the patient’s liver, and allows comparing livers from several patients in a common framework of reference.
Keywords: Coordinate System; Anatomy Modeling; Parameterization
|
Pierluigi Casale, Oriol Pujol, & Petia Radeva. (2014). Approximate polytope ensemble for one-class classification. PR - Pattern Recognition, 47(2), 854–864.
Abstract: In this work, a new one-class classification ensemble strategy called approximate polytope ensemble is presented. The main contribution of the paper is threefold. First, the geometrical concept of convex hull is used to define the boundary of the target class defining the problem. Expansions and contractions of this geometrical structure are introduced in order to avoid over-fitting. Second, the decision whether a point belongs to the convex hull model in high dimensional spaces is approximated by means of random projections and an ensemble decision process. Finally, a tiling strategy is proposed in order to model non-convex structures. Experimental results show that the proposed strategy is significantly better than state of the art one-class classification methods on over 200 datasets.
Keywords: One-class classification; Convex hull; High-dimensionality; Random projections; Ensemble learning
|
Joan Marc Llargues Asensio, Juan Peralta, Raul Arrabales, Manuel Gonzalez Bedia, Paulo Cortez, & Antonio Lopez. (2014). Artificial Intelligence Approaches for the Generation and Assessment of Believable Human-Like Behaviour in Virtual Characters. EXSY - Expert Systems With Applications, 41(16), 7281–7290.
Abstract: Having artificial agents to autonomously produce human-like behaviour is one of the most ambitious original goals of Artificial Intelligence (AI) and remains an open problem nowadays. The imitation game originally proposed by Turing constitute a very effective method to prove the indistinguishability of an artificial agent. The behaviour of an agent is said to be indistinguishable from that of a human when observers (the so-called judges in the Turing test) cannot tell apart humans and non-human agents. Different environments, testing protocols, scopes and problem domains can be established to develop limited versions or variants of the original Turing test. In this paper we use a specific version of the Turing test, based on the international BotPrize competition, built in a First-Person Shooter video game, where both human players and non-player characters interact in complex virtual environments. Based on our past experience both in the BotPrize competition and other robotics and computer game AI applications we have developed three new more advanced controllers for believable agents: two based on a combination of the CERA–CRANIUM and SOAR cognitive architectures and other based on ADANN, a system for the automatic evolution and adaptation of artificial neural networks. These two new agents have been put to the test jointly with CCBot3, the winner of BotPrize 2010 competition (Arrabales et al., 2012), and have showed a significant improvement in the humanness ratio. Additionally, we have confronted all these bots to both First-person believability assessment (BotPrize original judging protocol) and Third-person believability assessment, demonstrating that the active involvement of the judge has a great impact in the recognition of human-like behaviour.
Keywords: Turing test; Human-like behaviour; Believability; Non-player characters; Cognitive architectures; Genetic algorithm; Artificial neural networks
|
Albert Gordo, Florent Perronnin, Yunchao Gong, & Svetlana Lazebnik. (2014). Asymmetric Distances for Binary Embeddings. TPAMI - IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(1), 33–47.
Abstract: In large-scale query-by-example retrieval, embedding image signatures in a binary space offers two benefits: data compression and search efficiency. While most embedding algorithms binarize both query and database signatures, it has been noted that this is not strictly a requirement. Indeed, asymmetric schemes which binarize the database signatures but not the query still enjoy the same two benefits but may provide superior accuracy. In this work, we propose two general asymmetric distances which are applicable to a wide variety of embedding techniques including Locality Sensitive Hashing (LSH), Locality Sensitive Binary Codes (LSBC), Spectral Hashing (SH), PCA Embedding (PCAE), PCA Embedding with random rotations (PCAE-RR), and PCA Embedding with iterative quantization (PCAE-ITQ). We experiment on four public benchmarks containing up to 1M images and show that the proposed asymmetric distances consistently lead to large improvements over the symmetric Hamming distance for all binary embedding techniques.
|
David Masip, Michael S. North, Alexander Todorov, & Daniel N. Osherson. (2014). Automated Prediction of Preferences Using Facial Expressions. Plos - PloS one, 9(2), e87434.
Abstract: We introduce a computer vision problem from social cognition, namely, the automated detection of attitudes from a person's spontaneous facial expressions. To illustrate the challenges, we introduce two simple algorithms designed to predict observers’ preferences between images (e.g., of celebrities) based on covert videos of the observers’ faces. The two algorithms are almost as accurate as human judges performing the same task but nonetheless far from perfect. Our approach is to locate facial landmarks, then predict preference on the basis of their temporal dynamics. The database contains 768 videos involving four different kinds of preferences. We make it publically available.
|
Maedeh Aghaei, & Petia Radeva. (2014). Bag-of-Tracklets for Person Tracking in Life-Logging Data. In 17th International Conference of the Catalan Association for Artificial Intelligence (Vol. 269, pp. 35–44).
Abstract: By increasing popularity of wearable cameras, life-logging data analysis is becoming more and more important and useful to derive significant events out of this substantial collection of images. In this study, we introduce a new tracking method applied to visual life-logging, called bag-of-tracklets, which is based on detecting, localizing and tracking of people. Given the low spatial and temporal resolution of the image data, our model generates and groups tracklets in a unsupervised framework and extracts image sequences of person appearance according to a similarity score of the bag-of-tracklets. The model output is a meaningful sequence of events expressing human appearance and tracking them in life-logging data. The achieved results prove the robustness of our model in terms of efficiency and accuracy despite the low spatial and temporal resolution of the data.
|
David Fernandez, Jon Almazan, Nuria Cirera, Alicia Fornes, & Josep Llados. (2014). BH2M: the Barcelona Historical Handwritten Marriages database. In 22nd International Conference on Pattern Recognition (pp. 256–261).
Abstract: This paper presents an image database of historical handwritten marriages records stored in the archives of Barcelona cathedral, and the corresponding meta-data addressed to evaluate the performance of document analysis algorithms. The contribution of this paper is twofold. First, it presents a complete ground truth which covers the whole pipeline of handwriting
recognition research, from layout analysis to recognition and understanding. Second, it is the first dataset in the emerging area of genealogical document analysis, where documents are manuscripts pseudo-structured with specific lexicons and the interest is beyond pure transcriptions but context dependent.
|
Marçal Rusiñol, & Josep Llados. (2014). Boosting the Handwritten Word Spotting Experience by Including the User in the Loop. PR - Pattern Recognition, 47(3), 1063–1072.
Abstract: In this paper, we study the effect of taking the user into account in a query-by-example handwritten word spotting framework. Several off-the-shelf query fusion and relevance feedback strategies have been tested in the handwritten word spotting context. The increase in terms of precision when the user is included in the loop is assessed using two datasets of historical handwritten documents and two baseline word spotting approaches both based on the bag-of-visual-words model. We finally present two alternative ways of presenting the results to the user that might be more attractive and suitable to the user's needs than the classic ranked list.
Keywords: Handwritten word spotting; Query by example; Relevance feedback; Query fusion; Multidimensional scaling
|
Sergio Escalera, Xavier Baro, Jordi Gonzalez, Miguel Angel Bautista, Meysam Madadi, Miguel Reyes, et al. (2014). ChaLearn Looking at People Challenge 2014: Dataset and Results. In ECCV Workshop on ChaLearn Looking at People (Vol. 8925, pp. 459–473). LNCS.
Abstract: This paper summarizes the ChaLearn Looking at People 2014 challenge data and the results obtained by the participants. The competition was split into three independent tracks: human pose recovery from RGB data, action and interaction recognition from RGB data sequences, and multi-modal gesture recognition from RGB-Depth sequences. For all the tracks, the goal was to perform user-independent recognition in sequences of continuous images using the overlapping Jaccard index as the evaluation measure. In this edition of the ChaLearn challenge, two large novel data sets were made publicly available and the Microsoft Codalab platform were used to manage the competition. Outstanding results were achieved in the three challenge tracks, with accuracy results of 0.20, 0.50, and 0.85 for pose recovery, action/interaction recognition, and multi-modal gesture recognition, respectively.
Keywords: Human Pose Recovery; Behavior Analysis; Action and in- teractions; Multi-modal gestures; recognition
|
Marçal Rusiñol, V. Poulain d'Andecy, Dimosthenis Karatzas, & Josep Llados. (2014). Classification of Administrative Document Images by Logo Identification. In Bart Lamiroy, & Jean-Marc Ogier (Eds.), Graphics Recognition. Current Trends and Challenges (Vol. 8746, pp. 49–58). Springer Berlin Heidelberg.
Abstract: This paper is focused on the categorization of administrative document images (such as invoices) based on the recognition of the supplier’s graphical logo. Two different methods are proposed, the first one uses a bag-of-visual-words model whereas the second one tries to locate logo images described by the blurred shape model descriptor within documents by a sliding-window technique. Preliminar results are reported with a dataset of real administrative documents.
Keywords: Administrative Document Classification; Logo Recognition; Logo Spotting
|