Fahad Shahbaz Khan, Joost Van de Weijer, Andrew Bagdanov, & Michael Felsberg. (2014). Scale Coding Bag-of-Words for Action Recognition. In 22nd International Conference on Pattern Recognition (pp. 1514–1519).
Abstract: Recognizing human actions in still images is a challenging problem in computer vision due to the significant amount of scale, illumination and pose variation. Given the bounding box of a person both at training and test time, the task is to classify the action associated with each bounding box in an image.
Most state-of-the-art methods use the bag-of-words paradigm for action recognition. The bag-of-words framework employing a dense multi-scale grid sampling strategy is the de facto standard for feature detection. This results in a scale invariant image representation where all the features at multiple scales are binned in a single histogram. We argue that such a scale invariant strategy is sub-optimal since it ignores the multi-scale information available with each bounding box of a person.
This paper investigates alternative approaches to scale coding for action recognition in still images. We encode multi-scale information explicitly in three different histograms for small, medium and large scale visual-words. Our first approach exploits multi-scale information with respect to the image size. In our second approach, we encode multi-scale information relative to the size of the bounding box of a person instance. In each approach, the multi-scale histograms are then concatenated into a single representation for action classification. We validate our approaches on the Willow dataset which contains seven action categories: interacting with computer, photography, playing music, riding bike, riding horse, running and walking. Our results clearly suggest that the proposed scale coding approaches outperform the conventional scale invariant technique. Moreover, we show that our approach obtains promising results compared to more complex state-of-the-art methods.
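The scale coding idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `scale_coded_histogram`, the cut points in `thresholds`, and the per-band L1 normalization are all assumptions made for illustration. Passing the image size as `reference_size` mimics the first approach, and passing the bounding-box size mimics the second.

```python
from typing import List, Sequence, Tuple


def scale_coded_histogram(
    features: Sequence[Tuple[int, float]],
    vocab_size: int,
    reference_size: float,
    thresholds: Tuple[float, float] = (0.1, 0.3),  # hypothetical cut points
) -> List[float]:
    """Concatenate small-, medium- and large-scale bag-of-words histograms.

    features: (visual_word_index, feature_scale_in_pixels) pairs.
    reference_size: image size or person bounding-box size, in pixels.
    """
    low, high = thresholds
    bands = [[0.0] * vocab_size for _ in range(3)]
    for word, scale in features:
        relative = scale / reference_size
        band = 0 if relative < low else (1 if relative < high else 2)
        bands[band][word] += 1.0

    # Assumption: L1-normalize each band separately before concatenation.
    representation: List[float] = []
    for hist in bands:
        total = sum(hist)
        representation.extend(v / total if total > 0 else 0.0 for v in hist)
    return representation


features = [(0, 8.0), (1, 8.0), (0, 24.0), (2, 64.0)]
by_image = scale_coded_histogram(features, vocab_size=3, reference_size=100.0)
by_box = scale_coded_histogram(features, vocab_size=3, reference_size=40.0)
```

With the same features, the two reference sizes assign visual words to different bands, which is the distinction the abstract draws between image-relative and bounding-box-relative coding.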
|
Shida Beigpour, Christian Riess, Joost Van de Weijer, & Elli Angelopoulou. (2014). Multi-Illuminant Estimation with Conditional Random Fields. TIP - IEEE Transactions on Image Processing, 23(1), 83–95.
Abstract: Most existing color constancy algorithms assume uniform illumination. However, in real-world scenes, this is often not the case. Thus, we propose a novel framework for estimating the colors of multiple illuminants and their spatial distribution in the scene. We formulate this problem as an energy minimization task within a conditional random field over a set of local illuminant estimates. In order to quantitatively evaluate the proposed method, we created a novel data set of two-dominant-illuminant images comprising laboratory, indoor, and outdoor scenes. Unlike prior work, our database includes accurate pixel-wise ground truth illuminant information. The performance of our method is evaluated on multiple data sets. Experimental results show that our framework clearly outperforms single illuminant estimators as well as a recently proposed multi-illuminant estimation approach.
Keywords: color constancy; CRF; multi-illuminant
|
Q. Xue, Laura Igual, A. Berenguel, M. Guerrieri, & L. Garrido. (2014). Active Contour Segmentation with Affine Coordinate-Based Parametrization. In 9th International Conference on Computer Vision Theory and Applications (Vol. 1, pp. 5–14).
Abstract: In this paper, we present a new framework for image segmentation based on parametrized active contours. The contour and the points of the image space are parametrized using a reduced set of control points that have to form a closed polygon in two-dimensional problems and a closed surface in three-dimensional problems. By moving the control points, the active contour evolves. We use mean value coordinates as the parametrization tool for the interface, which allows us to parametrize any point of the space, inside or outside the closed polygon or surface. Region-based energies such as the one proposed by Chan and Vese can be easily implemented in both two- and three-dimensional segmentation problems. We show the usefulness of our approach with several experiments.
Keywords: Active Contours; Affine Coordinates; Mean Value Coordinates
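As a rough illustration of the parametrization tool named in this abstract, the sketch below computes two-dimensional mean value coordinates of a point with respect to a closed polygon, using the standard tangent-half-angle formula. It is a sketch under assumptions, not the paper's code: the function name `mean_value_coordinates` is hypothetical, and the query point is assumed not to lie on the polygon boundary.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def mean_value_coordinates(polygon: Sequence[Point], x: Point) -> List[float]:
    """Mean value coordinates of x with respect to the polygon vertices.

    Assumes x does not lie on a vertex or an edge of the polygon.
    """
    n = len(polygon)
    d = [(vx - x[0], vy - x[1]) for vx, vy in polygon]
    r = [math.hypot(dx, dy) for dx, dy in d]
    if min(r) == 0.0:
        raise ValueError("x coincides with a polygon vertex")

    # t[i] = tan(alpha_i / 2), where alpha_i is the signed angle at x
    # between vertex i and vertex i + 1.
    t = []
    for i in range(n):
        j = (i + 1) % n
        cross = d[i][0] * d[j][1] - d[i][1] * d[j][0]
        dot = d[i][0] * d[j][0] + d[i][1] * d[j][1]
        t.append(math.tan(math.atan2(cross, dot) / 2.0))

    # w_i = (tan(alpha_{i-1} / 2) + tan(alpha_i / 2)) / r_i, then normalize.
    w = [(t[i - 1] + t[i]) / r[i] for i in range(n)]
    total = sum(w)
    return [wi / total for wi in w]


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
point = (0.25, 0.5)
lam = mean_value_coordinates(square, point)
# The weighted vertex sum reproduces the query point.
rebuilt = (
    sum(l * v[0] for l, v in zip(lam, square)),
    sum(l * v[1] for l, v in zip(lam, square)),
)
```

Because each point is a fixed weighted combination of the control points, moving a control point moves every parametrized point with it, which is how the contour evolves in the description above.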
|
David Masip, Michael S. North, Alexander Todorov, & Daniel N. Osherson. (2014). Automated Prediction of Preferences Using Facial Expressions. Plos - PloS one, 9(2), e87434.
Abstract: We introduce a computer vision problem from social cognition, namely, the automated detection of attitudes from a person's spontaneous facial expressions. To illustrate the challenges, we introduce two simple algorithms designed to predict observers’ preferences between images (e.g., of celebrities) based on covert videos of the observers’ faces. The two algorithms are almost as accurate as human judges performing the same task but nonetheless far from perfect. Our approach is to locate facial landmarks, then predict preference on the basis of their temporal dynamics. The database contains 768 videos involving four different kinds of preferences. We make it publicly available.
|
Alejandro Gonzalez Alzate, Sebastian Ramos, David Vazquez, Antonio Lopez, & Jaume Amores. (2015). Spatiotemporal Stacked Sequential Learning for Pedestrian Detection. In Pattern Recognition and Image Analysis, Proceedings of 7th Iberian Conference, ibPRIA 2015 (pp. 3–12).
Abstract: Pedestrian classifiers decide which image windows contain a pedestrian. In practice, such classifiers provide a relatively high response at neighboring windows overlapping a pedestrian, while the responses around potential false positives are expected to be lower. Analogous reasoning applies to image sequences. If there is a pedestrian located within a frame, the same pedestrian is expected to appear close to the same location in neighboring frames. Therefore, such a location is likely to receive high classification scores during several frames, while false positives are expected to be more spurious. In this paper we propose to exploit such correlations for improving the accuracy of base pedestrian classifiers. In particular, we propose to use two-stage classifiers which rely not only on the image descriptors required by the base classifiers but also on the response of such base classifiers in a given spatiotemporal neighborhood. More specifically, we train pedestrian classifiers using a stacked sequential learning (SSL) paradigm. We use a new pedestrian dataset we have acquired from a car to evaluate our proposal at different frame rates. We also test on a well-known dataset: Caltech. The obtained results show that our SSL proposal boosts detection accuracy significantly with a minimal impact on the computational cost. Interestingly, SSL improves accuracy most in the most dangerous situations, i.e. when a pedestrian is close to the camera.
Keywords: SSL; Pedestrian Detection
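The two-stage idea in this abstract, where a second classifier sees both the original descriptor and the base classifier's scores in a spatiotemporal neighborhood, can be sketched as a feature-extension step. This is an illustrative simplification rather than the authors' pipeline: the function name `extend_with_context`, the one-dimensional window index, the neighborhood radii `dt` and `di`, and the zero padding are all assumptions.

```python
from typing import List, Sequence


def extend_with_context(
    descriptors: Sequence[Sequence[Sequence[float]]],
    scores: Sequence[Sequence[float]],
    t: int,
    i: int,
    dt: int = 1,   # hypothetical temporal radius, in frames
    di: int = 1,   # hypothetical spatial radius, in windows
    pad: float = 0.0,
) -> List[float]:
    """Append base-classifier scores around (frame t, window i) to its descriptor."""
    context = []
    for tt in range(t - dt, t + dt + 1):
        for ii in range(i - di, i + di + 1):
            inside = 0 <= tt < len(scores) and 0 <= ii < len(scores[tt])
            context.append(scores[tt][ii] if inside else pad)
    return list(descriptors[t][i]) + context


# Toy data: 3 frames x 3 windows, with a 2-value descriptor per window.
scores = [[0.1, 0.2, 0.3], [0.4, 0.9, 0.5], [0.2, 0.8, 0.1]]
descriptors = [[[float(t), float(i)] for i in range(3)] for t in range(3)]
center = extend_with_context(descriptors, scores, t=1, i=1)
corner = extend_with_context(descriptors, scores, t=0, i=0)
```

A second-stage classifier would then be trained on these extended vectors instead of the raw descriptors.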
|
Sergio Vera, Debora Gil, & Miguel Angel Gonzalez Ballester. (2014). Anatomical parameterization for volumetric meshing of the liver. In SPIE – Medical Imaging (Vol. 9036).
Abstract: A coordinate system describing the interior of organs is a powerful tool for a systematic localization of injured tissue. If the same coordinate values are assigned to specific anatomical landmarks, the coordinate system allows integration of data across different medical image modalities. Harmonic mappings have been used to produce parametric coordinate systems over the surface of anatomical shapes, given their flexibility to set values at specific locations through boundary conditions. However, most of the existing implementations in medical imaging are restricted either to anatomical surfaces or to a depth coordinate whose boundary conditions are given at sites of limited geometric diversity. In this paper we present a method for anatomical volumetric parameterization that extends current harmonic parameterizations to the interior anatomy using information provided by the volume medial surface. We have applied the methodology to define a common reference system for the liver shape and functional anatomy. This reference system sets a solid base for creating anatomical models of the patient’s liver, and allows comparing livers from several patients in a common framework of reference.
Keywords: Coordinate System; Anatomy Modeling; Parameterization
|
Enric Marti, Antoni Gurgui, Debora Gil, Aura Hernandez-Sabate, Jaume Rocarias, & Ferran Poveda. (2014). ABP on line: Seguimiento, estregas y evaluación en aprendizaje basado en proyectos.
|
Carles Sanchez, Oriol Ramos Terrades, Patricia Marquez, Enric Marti, Jaume Rocarias, & Debora Gil. (2014). Evaluación automática de prácticas en Moodle para el aprendizaje autónomo en Ingenierías.
|
David Fernandez, Josep Llados, & Alicia Fornes. (2014). A graph-based approach for segmenting touching lines in historical handwritten documents. IJDAR - International Journal on Document Analysis and Recognition, 17(3), 293–312.
Abstract: Text line segmentation in handwritten documents is an important task in the recognition of historical documents. Handwritten document images contain text lines with multiple orientations, touching and overlapping characters between consecutive text lines and different document structures, making line segmentation a difficult task. In this paper, we present a new approach for handwritten text line segmentation solving the problems of touching components, curvilinear text lines and horizontally overlapping components. The proposed algorithm formulates line segmentation as finding the central path in the area between two consecutive lines. This is solved as a graph traversal problem. A graph is constructed using the skeleton of the image. Then, a path-finding algorithm is used to find the optimum path between text lines. The proposed algorithm has been evaluated on a comprehensive dataset consisting of five databases: ICDAR2009, ICDAR2013, UMD, the George Washington and the Barcelona Marriages Database. The proposed method outperforms the state-of-the-art considering the different types and difficulties of the benchmarking data.
Keywords: Text line segmentation; Handwritten documents; Document image processing; Historical document analysis
|
David Fernandez, Pau Riba, Alicia Fornes, & Josep Llados. (2014). On the Influence of Key Point Encoding for Handwritten Word Spotting. In 14th International Conference on Frontiers in Handwriting Recognition (pp. 476–481).
Abstract: In this paper we evaluate the influence of the selection of key points and the associated features on the performance of word spotting processes. In general, features can be extracted from a number of characteristic points like corners, contours, skeletons, maxima, minima, crossings, etc. A number of descriptors exist in the literature using different interest point detectors. However, the intrinsic variability of handwriting strongly affects performance if the interest points are not stable enough. In this paper, we analyze the performance of different descriptors for local interest points. As a benchmarking dataset we have used the Barcelona Marriage Database, which contains handwritten records of marriages over five centuries.
Keywords: Local descriptors; Interest points; Handwritten documents; Word spotting; Historical document analysis
|
David Fernandez, Jon Almazan, Nuria Cirera, Alicia Fornes, & Josep Llados. (2014). BH2M: the Barcelona Historical Handwritten Marriages database. In 22nd International Conference on Pattern Recognition (pp. 256–261).
Abstract: This paper presents an image database of historical handwritten marriage records stored in the archives of Barcelona cathedral, and the corresponding meta-data intended for evaluating the performance of document analysis algorithms. The contribution of this paper is twofold. First, it presents a complete ground truth which covers the whole pipeline of handwriting recognition research, from layout analysis to recognition and understanding. Second, it is the first dataset in the emerging area of genealogical document analysis, where documents are pseudo-structured manuscripts with specific lexicons and the interest goes beyond pure transcription to context-dependent understanding.
|
David Fernandez, R. Manmatha, Josep Llados, & Alicia Fornes. (2014). Sequential Word Spotting in Historical Handwritten Documents. In 11th IAPR International Workshop on Document Analysis and Systems (pp. 101–105).
Abstract: In this work we present a handwritten word spotting approach that takes advantage of the a priori known order of appearance of the query words. Given an ordered sequence of query word instances, the proposed approach performs a sequence alignment with the words in the target collection. Although the alignment is quite sparse, i.e. the number of words in the database is higher than that in the query set, the overall performance is noticeably higher than that of isolated word spotting. As an application dataset, we use a collection of handwritten marriage licenses, taking advantage of the ordered index pages of family names.
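To illustrate how a known query order can change spotting results, the sketch below aligns an ordered list of queries to a longer list of target words with a simple dynamic program that forces the matched positions to increase. The abstract does not specify the alignment algorithm, so this formulation, the function name `align_in_order`, and the toy cost matrix are assumptions for illustration only.

```python
from typing import List, Sequence, Tuple


def align_in_order(cost: Sequence[Sequence[float]]) -> Tuple[float, List[int]]:
    """Match query i to target assignment[i], with strictly increasing targets.

    cost[i][j] is a dissimilarity between query i and target word j.
    Assumes there are at least as many targets as queries.
    """
    m, n = len(cost), len(cost[0])
    inf = float("inf")
    # best[i][j]: minimum cost of matching the first i queries
    # within the first j target words.
    best = [[inf] * (n + 1) for _ in range(m + 1)]
    take = [[False] * (n + 1) for _ in range(m + 1)]
    for j in range(n + 1):
        best[0][j] = 0.0
    for i in range(1, m + 1):
        for j in range(i, n + 1):
            skip = best[i][j - 1]
            match = best[i - 1][j - 1] + cost[i - 1][j - 1]
            if match <= skip:
                best[i][j], take[i][j] = match, True
            else:
                best[i][j] = skip

    # Backtrack to recover which target each query was matched to.
    assignment: List[int] = []
    i, j = m, n
    while i > 0:
        if take[i][j]:
            assignment.append(j - 1)
            i -= 1
        j -= 1
    assignment.reverse()
    return best[m][n], assignment


# Query 1 looks most like target 0 in isolation, but the order constraint
# places it after the match of query 0.
cost = [
    [0.9, 0.1, 0.8, 0.7],
    [0.2, 0.9, 0.9, 0.3],
]
total, assignment = align_in_order(cost)
```

In this toy case an isolated nearest-neighbor search would send the second query to target 0, whereas the ordered alignment selects targets 1 and 3.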
|
Pau Riba, Jon Almazan, Alicia Fornes, David Fernandez, Ernest Valveny, & Josep Llados. (2014). e-Crowds: a mobile platform for browsing and searching in historical demography-related manuscripts. In 14th International Conference on Frontiers in Handwriting Recognition (pp. 228–233).
Abstract: This paper presents a prototype system running on portable devices for browsing and word searching through historical handwritten document collections. The platform adapts the paradigm of eBook reading, where the narrative is not necessarily sequential, but centered on the user actions. The novelty is to replace digitally born books by digitized historical manuscripts of marriage licenses, so document analysis tasks are required in the browser. With an active reading paradigm, the user can cast queries of people's names, so he/she can implicitly follow genealogical links. In addition, the system allows combined searches: the user can refine a search by adding more words to search. As a second contribution, the retrieval functionality involves as a core technology a word spotting module with a unified approach, which allows combined query searches, and also two input modalities: query-by-example, and query-by-string.
|
Carlo Gatta, & Francesco Ciompi. (2014). Stacked Sequential Scale-Space Taylor Context. TPAMI - IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(8), 1694–1700.
Abstract: We analyze sequential image labeling methods that sample the posterior label field in order to gather contextual information. We propose an effective method that extracts local Taylor coefficients from the posterior at different scales. Results show that our proposal outperforms state-of-the-art methods on MSRC-21, CAMVID, eTRIMS8 and KAIST2 data sets.
|
Pedro Martins, Paulo Carvalho, & Carlo Gatta. (2014). Context-aware features and robust image representations. JVCIR - Journal of Visual Communication and Image Representation, 25(2), 339–348.
Abstract: Local image features are often used to efficiently represent image content. The limited number of types of features that a local feature extractor responds to might be insufficient to provide a robust image representation. To overcome this limitation, we propose a context-aware feature extraction formulated under an information theoretic framework. The algorithm does not respond to a specific type of features; the idea is to retrieve complementary features which are relevant within the image context. We empirically validate the method by investigating the repeatability, the completeness, and the complementarity of context-aware features on standard benchmarks. In a comparison with strictly local features, we show that our context-aware features produce more robust image representations. Furthermore, we study the complementarity between strictly local features and context-aware ones to produce an even more robust representation.
|