Publicacions CVC -- Query Results

Zhong Jin, Zhen Lou, Jing-Yu Yang, & Quan-sen Sun. (2005). Face detection using template matching and skin color information. http://refbase.cvc.uab.es/show.php?record=627
Zhong Jin, Jing-Yu Yang, & Zhen Lou. (2005). A luminance-conditional distribution model of skin color information. http://refbase.cvc.uab.es/show.php?record=628
Zhong Jin, Franck Davoine, & Zhen Lou. (2003). Facial expression analysis by using KPCA. http://refbase.cvc.uab.es/show.php?record=431
Zhong Jin, Franck Davoine, & Zhen Lou. (2004). An Effective EM Algorithm for PCA Mixture Model. http://refbase.cvc.uab.es/show.php?record=482
Zhong Jin, & Franck Davoine. (2004). Orthogonal ICA Representation Of Images. http://refbase.cvc.uab.es/show.php?record=499
Y. Patel, Lluis Gomez, Raul Gomez, Marçal Rusiñol, Dimosthenis Karatzas, & C.V. Jawahar. (2018). TextTopicNet - Self-Supervised Learning of Visual Features Through Embedding Images on Semantic Text Spaces. Abstract: The immense success of deep learning based methods in computer vision heavily relies on large scale training datasets. These richly annotated datasets help the network learn discriminative visual features. Collecting and annotating such datasets requires a tremendous amount of human effort, and annotations are limited to a popular set of classes. As an alternative, learning visual features by designing auxiliary tasks which make use of freely available self-supervision has become increasingly popular in the computer vision community. In this paper, we put forward an idea to take advantage of multi-modal context to provide self-supervision for the training of computer vision algorithms. We show that adequate visual features can be learned efficiently by training a CNN to predict the semantic textual context in which a particular image is more probable to appear as an illustration. More specifically, we use popular text embedding techniques to provide the self-supervision for the training of a deep CNN. http://refbase.cvc.uab.es/show.php?record=3177
Xose M. Pardo, Petia Radeva, & Juan J. Villanueva. (1999). Self-Training Statistic Snake for Image Segmentation and Tracking. http://refbase.cvc.uab.es/show.php?record=26
Xavier Roca, X. Binefa, & Jordi Vitria. (1998). A New Autofocus Algorithm for Cytological Tissue in a Microscopy Environment. http://refbase.cvc.uab.es/show.php?record=16
Xavier Roca, Jordi Vitria, Maria Vanrell, & Juan J. Villanueva. (1999). Visual behaviours for binocular navigation with autonomous systems. http://refbase.cvc.uab.es/show.php?record=13
Xavier Roca, Jordi Vitria, Maria Vanrell, & Juan J. Villanueva. (2000). Visual behaviours for binocular navigation with autonomous systems. http://refbase.cvc.uab.es/show.php?record=245
Xavier Otazu, & Maria Vanrell. (2004). Building Perceived Colour Images. http://refbase.cvc.uab.es/show.php?record=450
Xavier Otazu, & Maria Vanrell. (2005). A surround-induction function to unify assimilation and contrast in a computational model of color appearance. http://refbase.cvc.uab.es/show.php?record=568
Xavier Otazu, & J. Nuñez. (2001). Algoritmo de Clasificacion no Supervisada Basado en Wavelets. http://refbase.cvc.uab.es/show.php?record=147
Xavier Baro, & Jordi Vitria. (2005). Feature Selection with Non-Parametric Mutual Information for Adaboost Learning. http://refbase.cvc.uab.es/show.php?record=582
Xavier Baro, David Masip, Elena Planas, & Julia Minguillon. (2013). PeLP: Plataforma para el Aprendizaje de Lenguajes de Programación. http://refbase.cvc.uab.es/show.php?record=2237
