Cesar de Souza, Adrien Gaidon, Yohann Cabon, & Antonio Lopez. (2017). Procedural Generation of Videos to Train Deep Action Recognition Networks. In 30th IEEE Conference on Computer Vision and Pattern Recognition (pp. 2594–2604).
Abstract: Deep learning for human action recognition in videos is making significant progress, but is slowed down by its dependency on expensive manual labeling of large video collections. In this work, we investigate the generation of synthetic training data for action recognition, as it has recently shown promising results for a variety of other computer vision tasks. We propose an interpretable parametric generative model of human action videos that relies on procedural generation and other computer graphics techniques of modern game engines. We generate a diverse, realistic, and physically plausible dataset of human action videos, called PHAV for "Procedural Human Action Videos". It contains a total of 39,982 videos, with more than 1,000 examples for each action of 35 categories. Our approach is not limited to existing motion capture sequences, and we procedurally define 14 synthetic actions. We introduce a deep multi-task representation learning architecture to mix synthetic and real videos, even if the action categories differ. Our experiments on the UCF101 and HMDB51 benchmarks suggest that combining our large set of synthetic videos with small real-world datasets can boost recognition performance, significantly outperforming fine-tuning state-of-the-art unsupervised generative models of videos.
|
Fernando Barrera, Felipe Lumbreras, & Angel Sappa. (2010). Multimodal Template Matching based on Gradient and Mutual Information using Scale-Space. In 17th IEEE International Conference on Image Processing (pp. 2749–2752).
Abstract: This paper presents the combined use of gradient and mutual information for matching infrared and intensity templates. We propose to join: (i) feature matching in a multiresolution context and (ii) information propagation through scale-space representations. Our method consists in combining mutual information with a gradient-based shape descriptor, and propagating them following a coarse-to-fine strategy. The main contributions of this work are: to offer a theoretical formulation towards multimodal stereo matching; to show that gradient and mutual information can be reinforced while they are propagated between consecutive levels; and to show that they are valid cost functions in multimodal template matching. Comparisons are presented showing the improvements and viability of the proposed approach.
|
Mohammad Rouhani, & Angel Sappa. (2010). A Fast Accurate Implicit Polynomial Fitting Approach. In 17th IEEE International Conference on Image Processing (pp. 1429–1432).
Abstract: This paper presents a novel hybrid approach that combines two kinds of state-of-the-art fitting algorithms: algebraic-based and geometric-based. It consists of two steps: first, the 3L algorithm is used as an initialization; then, the obtained result is improved through a geometric approach. The adopted geometric approach is based on a distance estimation that avoids a costly search for the real orthogonal distance. Experimental results are presented as well as quantitative comparisons.
|
Jaume Amores, David Geronimo, & Antonio Lopez. (2010). Multiple instance and active learning for weakly-supervised object-class segmentation. In 3rd IEEE International Conference on Machine Vision.
Abstract: In object-class segmentation, one of the most tedious tasks is to manually segment many object examples in order to learn a model of the object category. Yet, there has been little research on reducing the degree of manual annotation for object-class segmentation. In this work we explore alternative strategies which do not require full manual segmentation of the object in the training set. In particular, we study the use of bounding boxes as a coarser and much cheaper form of segmentation and we perform a comparative study of several Multiple-Instance Learning techniques that make it possible to obtain a model with this type of weak annotation. We show that some of these methods, when used with coarse segmentations, can be competitive with methods that require full manual segmentation of the objects. Furthermore, we show how to use active learning combined with this weakly supervised strategy. As we show, this strategy makes it possible to reduce the amount of annotation and optimize the number of examples that require full manual segmentation in the training set.
Keywords: Multiple Instance Learning; Active Learning; Object-class segmentation.
|
Alicia Fornes, Josep Llados, & Gemma Sanchez. (2005). Primitive Segmentation in Old Handwritten Music Scores.
|
Anton Cervantes, Gemma Sanchez, Josep Llados, Agnes Borras, & A. Rodriguez. (2005). Biometric Recognition Based on Line Shape Descriptors. In Sixth IAPR International Workshop on Graphics Recognition (GREC 2005) (pp. 335–344).
|
Joan Mas, Gemma Sanchez, & Josep Llados. (2005). An Incremental Parser to Recognize Diagram Symbols and Gestures represented by Adjacency Grammars.
|
N. Zakaria, Jean-Marc Ogier, & Josep Llados. (2005). On-line Graphics Recognition based on Invariant Spatio-Sequential Descriptor: Fuzzy Matrix.
|
Ignasi Rius, J. Varona, Jordi Gonzalez, & Juan J. Villanueva. (2006). Action Spaces for Efficient Bayesian Tracking of Human Motion.
|
W. Liu, & Josep Llados. (2006). Graphics Recognition. Ten Years Review and Future Perspectives (Vol. 3926). LNCS.
|
Marçal Rusiñol, & Josep Llados. (2005). Symbol Spotting in Technical Drawings Using Vectorial Signatures.
|
Zhong Jin, Franck Davoine, Zhen Lou, & Jing-Yu Yang. (2006). A novel PCA-based Bayes classifier and face analysis. In International Conference on Advances in Biometrics (ICB’06), LNCS 3832: 144–150.
|
Michael Villamizar, A. Sanfeliu, & Juan Andrade. (2006). Computation of Rotation Local Invariant Features using the Integral Image for Real Time Object Detection.
|
Sergio Escalera, Oriol Pujol, & Petia Radeva. (2006). Boosted Landmarks of Contextual Descriptors and Forest-ECOC: a novel framework to detect and classify objects in cluttered scenes.
|
Sergio Escalera, Oriol Pujol, & Petia Radeva. (2006). ECOC-ONE: A novel coding and decoding strategy.
|