|
Francisco Javier Orozco, Ognjen Rudovic, Jordi Gonzalez, & Maja Pantic. (2013). Hierarchical On-line Appearance-Based Tracking for 3D Head Pose, Eyebrows, Lips, Eyelids and Irises. IMAVIS - Image and Vision Computing, 31(4), 322–340.
Abstract: In this paper, we propose an On-line Appearance-Based Tracker (OABT) for simultaneous tracking of 3D head pose, lips, eyebrows, eyelids and irises in monocular video sequences. In contrast to previously proposed tracking approaches, which deal with face and gaze tracking separately, our OABT can also be used for eyelid and iris tracking, as well as 3D head pose, lips and eyebrows facial actions tracking. Furthermore, our approach applies an on-line learning of changes in the appearance of the tracked target. Hence, the prior training of appearance models, which usually requires a large amount of labeled facial images, is avoided. Moreover, the proposed method is built upon a hierarchical combination of three OABTs, which are optimized using a Levenberg–Marquardt Algorithm (LMA) enhanced with line-search procedures. This, in turn, makes the proposed method robust to changes in lighting conditions, occlusions and translucent textures, as evidenced by our experiments. Finally, the proposed method achieves head and facial actions tracking in real-time.
Keywords: On-line appearance models; Levenberg–Marquardt algorithm; Line-search optimization; 3D face tracking; Facial action tracking; Eyelid tracking; Iris tracking
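The abstract above names a Levenberg–Marquardt algorithm enhanced with line-search procedures but does not spell out the cost function or the enhancements. The following is only a generic sketch of that optimization pattern on a two-parameter least-squares problem: the function name `lm_fit`, the damping schedule, and the backtracking rule are assumptions made for illustration, not the tracker's actual formulation.

```python
def lm_fit(residuals, jacobian, params, iters=50, damping=1e-3):
    """Fit two parameters with damped Gauss-Newton steps and backtracking.

    residuals(p) returns a list of residuals; jacobian(p) returns one
    [d r / d p0, d r / d p1] row per residual. Illustrative sketch only.
    """
    def cost(p):
        return sum(r * r for r in residuals(p))

    p = list(params)
    for _ in range(iters):
        r, J = residuals(p), jacobian(p)
        # Damped normal equations: (J^T J + damping * I) d = -J^T r (2x2 case).
        a = sum(j[0] * j[0] for j in J) + damping
        b = sum(j[0] * j[1] for j in J)
        c = sum(j[1] * j[1] for j in J) + damping
        g0 = sum(j[0] * ri for j, ri in zip(J, r))
        g1 = sum(j[1] * ri for j, ri in zip(J, r))
        det = a * c - b * b
        d = [(-c * g0 + b * g1) / det, (b * g0 - a * g1) / det]
        # Backtracking line search: halve the step until the cost decreases.
        step, current = 1.0, cost(p)
        while step > 1e-8:
            trial = [p[0] + step * d[0], p[1] + step * d[1]]
            if cost(trial) < current:
                p = trial
                damping = max(damping / 10.0, 1e-12)  # trust the model more
                break
            step /= 2.0
        else:
            damping *= 10.0  # no improving step found: damp more strongly
    return p


# Toy usage: recover slope 2 and intercept 1 from noise-free samples.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
fit = lm_fit(
    lambda p: [p[0] * x + p[1] - y for x, y in zip(xs, ys)],
    lambda p: [[x, 1.0] for x in xs],
    [0.0, 0.0],
)
```

In an appearance-based tracker the residuals would presumably compare the warped image against the on-line appearance model, but that mapping is not given in the abstract and is left out here.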
|
|
|
Ivan Huerta, Michael Holte, Thomas B. Moeslund, & Jordi Gonzalez. (2015). Chromatic shadow detection and tracking for moving foreground segmentation. IMAVIS - Image and Vision Computing, 41, 42–53.
Abstract: Advanced segmentation techniques in the surveillance domain deal with shadows to avoid distortions when detecting moving objects. Most approaches for shadow detection are still typically restricted to penumbra shadows and cannot cope well with umbra shadows. Consequently, umbra shadow regions are usually detected as part of moving objects, thus affecting the performance of the final detection. In this paper we address the detection of both penumbra and umbra shadow regions. First, a novel bottom-up approach is presented based on gradient and colour models, which successfully discriminates between chromatic moving cast shadow regions and those regions detected as moving objects. In essence, those regions corresponding to potential shadows are detected based on edge partitioning and colour statistics. Subsequently (i) temporal similarities between textures and (ii) spatial similarities between chrominance angle and brightness distortions are analysed for each potential shadow region for detecting the umbra shadow regions. Our second contribution refines even further the segmentation results: a tracking-based top-down approach increases the performance of our bottom-up chromatic shadow detection algorithm by properly correcting non-detected shadows. To do so, a combination of motion filters in a data association framework exploits the temporal consistency between objects and shadows to increase the shadow detection rate. Experimental results exceed current state-of-the-art in shadow accuracy for multiple well-known surveillance image databases which contain different shadowed materials and illumination conditions.
Keywords: Detecting moving objects; Chromatic shadow detection; Temporal local gradient; Spatial and Temporal brightness and angle distortions; Shadow tracking
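The abstract above refers to chrominance angle and brightness distortions between a pixel and its background model. A minimal sketch of one common way to compute such quantities for RGB triples follows. The function names, the projection-based brightness ratio, and all thresholds are assumptions chosen for illustration; the paper's full pipeline (edge partitioning, temporal texture similarity, tracking) is not reproduced.

```python
import math


def distortions(pixel, background):
    """Return (brightness ratio, chrominance angle in radians) for RGB triples."""
    dot = sum(p * b for p, b in zip(pixel, background))
    bg_sq = sum(b * b for b in background)
    px_sq = sum(p * p for p in pixel)
    if bg_sq == 0 or px_sq == 0:
        return 0.0, 0.0  # black pixel or background: the angle is undefined
    # Scale factor that best maps the background colour onto the pixel.
    brightness = dot / bg_sq
    # Angle between the two colour vectors, clamped against rounding error.
    cos_angle = dot / math.sqrt(px_sq * bg_sq)
    angle = math.acos(max(-1.0, min(1.0, cos_angle)))
    return brightness, angle


def is_shadow_candidate(pixel, background, low=0.4, high=0.95, max_angle=0.05):
    """Darker than the background but with nearly the same chromaticity.

    The thresholds are hypothetical defaults, not values from the paper.
    """
    brightness, angle = distortions(pixel, background)
    return low <= brightness <= high and angle <= max_angle
```

A pixel that is a uniformly darkened copy of the background, such as `(50, 40, 30)` against `(100, 80, 60)`, passes this test, while a pixel whose colour direction differs strongly, such as `(100, 20, 20)`, does not.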
|
|
|
Pau Rodriguez, Miguel Angel Bautista, Sergio Escalera, & Jordi Gonzalez. (2018). Beyond One-hot Encoding: lower dimensional target embedding. IMAVIS - Image and Vision Computing, 75, 21–31.
Abstract: Target encoding plays a central role when learning Convolutional Neural Networks. In this realm, one-hot encoding is the most prevalent strategy due to its simplicity. However, this widespread encoding scheme assumes a flat label space, thus ignoring rich relationships existing among labels that can be exploited during training. In large-scale datasets, data does not span the full label space, but instead lies in a low-dimensional output manifold. Following this observation, we embed the targets into a low-dimensional space, drastically improving convergence speed while preserving accuracy. Our contribution is twofold: (i) we show that random projections of the label space are a valid tool to find such lower dimensional embeddings, dramatically boosting convergence rates at zero computational cost; and (ii) we propose a normalized eigenrepresentation of the class manifold that encodes the targets with minimal information loss, improving the accuracy of random projections encoding while enjoying the same convergence rates. Experiments on CIFAR-100, CUB200-2011, ImageNet, and MIT Places demonstrate that the proposed approach drastically improves convergence speed while reaching very competitive accuracy rates.
Keywords: Error correcting output codes; Output embeddings; Deep learning; Computer vision
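The first contribution in the abstract above, random projections of the label space, can be illustrated with a short sketch. It assumes Gaussian random codes normalized to unit length and nearest-code decoding by dot product; the function names and these specific choices are illustrative assumptions, since the abstract does not state the projection, loss, or decoding rule the authors used.

```python
import math
import random


def random_codes(num_classes, embed_dim, seed=0):
    """Draw one unit-norm Gaussian code per class (embed_dim < num_classes)."""
    rng = random.Random(seed)
    codes = []
    for _ in range(num_classes):
        v = [rng.gauss(0.0, 1.0) for _ in range(embed_dim)]
        norm = math.sqrt(sum(x * x for x in v))
        codes.append([x / norm for x in v])
    return codes


def encode(label, codes):
    """Replace a class index (the one-hot position) with its dense code."""
    return codes[label]


def decode(vector, codes):
    """Return the class whose code has the largest dot product with vector."""
    sims = [sum(a * b for a, b in zip(vector, c)) for c in codes]
    return max(range(len(codes)), key=sims.__getitem__)


# Toy usage: 100 classes embedded in 16 dimensions instead of 100.
codes = random_codes(100, 16)
target = encode(42, codes)       # regression target for a sample of class 42
predicted = decode(target, codes)
```

Under this sketch a network would regress the 16-dimensional target rather than a 100-way one-hot vector, and the prediction would be mapped back to a class with `decode`.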
|
|
|
Pau Rodriguez, Jordi Gonzalez, Josep M. Gonfaus, & Xavier Roca. (2019). Integrating Vision and Language in Social Networks for Identifying Visual Patterns of Personality Traits. IJSSH - International Journal of Social Science and Humanity, 6–12.
Abstract: Social media, as a major platform for communication and information exchange, is a rich repository of the opinions and sentiments of 2.3 billion users about a vast spectrum of topics. In this sense, user text interactions are widely used to understand the reasons behind certain social users' demands and culturally driven interests. However, the knowledge embedded in the 1.8 billion pictures which are uploaded daily in public profiles has just started to be exploited. Following this trend on visual-based social analysis, we present a novel methodology based on neural networks to build a combined image-and-text based personality trait model, trained with images posted together with words found highly correlated to specific personality traits. So, the key contribution in this work is to explore whether OCEAN personality trait modeling can be addressed based on images, here called MindPics, appearing with certain tags with psychological insights. We found that there is a correlation between posted images and the personality estimated from their accompanying texts. Thus, the experimental results are consistent with previous cyber-psychology results based on texts, suggesting that images could also be used for personality estimation: classification results on some personality traits show that specific and characteristic visual patterns emerge, in essence representing abstract concepts. These results open new avenues of research for further refining the proposed personality model under the supervision of psychology experts, and for eventually replacing current textual personality questionnaires with image-based ones.
|
|
|
Wenjuan Gong, W. Zhang, Jordi Gonzalez, Y. Ren, & Z. Li. (2015). Enhanced Asymmetric Bilinear Model for Face Recognition. IJDSN - International Journal of Distributed Sensor Networks, Article ID 218514.
Abstract: Bilinear models have been successfully applied to separate two factors, for example, pose variances and different identities in face recognition problems. The asymmetric model is a type of bilinear model which models a system in the most concise way. However, few works have explored the application of the asymmetric bilinear model to face recognition under illumination changes. In this work, we propose an enhanced asymmetric model for illumination-robust face recognition. Instead of initializing the factor probabilities randomly, we initialize them with the nearest neighbor method and optimize them for the test data. In addition, we update the factor model to be identified. We validate the proposed method on a designed data sample and the extended Yale B dataset. The experimental results show that the enhanced asymmetric models give promising results and good recognition accuracies.
|
|