|
David Aldavert, Marçal Rusiñol, Ricardo Toledo and Josep Llados. 2015. A Study of Bag-of-Visual-Words Representations for Handwritten Keyword Spotting. IJDAR, 18(3), 223–234.
Abstract: The Bag-of-Visual-Words (BoVW) framework has gained popularity among the document image analysis community, specifically as a representation of handwritten words for recognition or spotting purposes. Although in the computer vision field the BoVW method has been greatly improved, most of the approaches in the document image analysis domain still rely on the basic implementation of the BoVW method, disregarding these latest refinements. In this paper, we present a review of those improvements and their application to the keyword spotting task. We thoroughly evaluate their impact against a baseline system on the well-known George Washington dataset and compare the obtained results against nine state-of-the-art keyword spotting methods. In addition, we also compare both the baseline and improved systems with the methods presented at the Handwritten Keyword Spotting Competition 2014.
Keywords: Bag-of-Visual-Words; Keyword spotting; Handwritten documents; Performance evaluation
|
|
|
Fahad Shahbaz Khan, Muhammad Anwer Rao, Joost Van de Weijer, Andrew Bagdanov, Antonio Lopez and Michael Felsberg. 2013. Coloring Action Recognition in Still Images. IJCV, 105(3), 205–221.
Abstract: In this article we investigate the problem of human action recognition in static images. By action recognition we intend a class of problems which includes both action classification and action detection (i.e. simultaneous localization and classification). Bag-of-words image representations yield promising results for action classification, and deformable part models perform very well for object detection. The representations for action recognition typically use only shape cues and ignore color information. Inspired by the recent success of color in image classification and object detection, we investigate the potential of color for action classification and detection in static images. We perform a comprehensive evaluation of color descriptors and fusion approaches for action recognition. Experiments were conducted on the three datasets most used for benchmarking action recognition in still images: Willow, PASCAL VOC 2010 and Stanford-40. Our experiments demonstrate that incorporating color information considerably improves recognition performance, and that a descriptor based on color names outperforms pure color descriptors. Our experiments demonstrate that late fusion of color and shape information outperforms other approaches on action recognition. Finally, we show that the different color–shape fusion approaches result in complementary information and combining them yields state-of-the-art performance for action classification.
|
|
|
Xavier Boix, Josep M. Gonfaus, Joost Van de Weijer, Andrew Bagdanov, Joan Serrat and Jordi Gonzalez. 2012. Harmony Potentials: Fusing Global and Local Scale for Semantic Image Segmentation. IJCV, 96(1), 83–102.
Abstract: The Hierarchical Conditional Random Field (HCRF) model has been successfully applied to a number of image labeling problems, including image segmentation. However, existing HCRF models of image segmentation do not allow multiple classes to be assigned to a single region, which limits their ability to incorporate contextual information across multiple scales. At higher scales in the image, this representation yields an oversimplified model since multiple classes can be reasonably expected to appear within large regions. This simplified model particularly limits the impact of information at higher scales. Since class-label information at these scales is usually more reliable than at lower, noisier scales, neglecting this information is undesirable. To address these issues, we propose a new consistency potential for image labeling problems, which we call the harmony potential. It can encode any possible combination of labels, penalizing only unlikely combinations of classes. We also propose an effective sampling strategy over this expanded label set that renders tractable the underlying optimization problem. Our approach obtains state-of-the-art results on two challenging, standard benchmark datasets for semantic image segmentation: PASCAL VOC 2010 and MSRC-21.
|
|
|
A. Pujol, Jordi Vitria, Felipe Lumbreras and Juan J. Villanueva. 2001. Topological principal component analysis for face encoding and recognition. PRL, 22(6-7), 769–776.
|
|
|
Jaume Amores and Petia Radeva. 2005. Registration and Retrieval of Highly Elastic Bodies using Contextual Information. PRL, 26(11), 1720–1731.
|
|
|
Jaume Amores and Petia Radeva. 2005. Retrieval of IVUS Images Using Contextual Information and Elastic Matching.
|
|
|
Jaume Amores, N. Sebe and Petia Radeva. 2006. Boosting the distance estimation: Application to the K-Nearest Neighbor Classifier. PRL, 27(3), 201–209.
|
|
|
Jaume Amores, N. Sebe and Petia Radeva. 2007. Context-Based Object-Class Recognition and Retrieval by Generalized Correlograms.
|
|
|
Yu Jie, Jaume Amores, N. Sebe, Petia Radeva and Tian Qi. 2008. Distance Learning for Similarity Estimation.
|
|
|
Daniel Ponsa, Robert Benavente, Felipe Lumbreras, J. Martinez and Xavier Roca. 2003. Quality control of safety belts by machine vision inspection for real-time production.
|
|