Mikhail Mozerov, Ignasi Rius, Xavier Roca, & Jordi Gonzalez. (2006). 3D Human Motion Sequences Synchronization Using Dense Matching Algorithm. In 28th Annual Symposium of the German Association for Pattern Recognition, LNCS 4174, pp. 485–494. ISBN 978-3-540-44412-1.
Ignasi Rius, Javier Varona, Xavier Roca, & Jordi Gonzalez. (2006). Posture Constraints for Bayesian Human Motion Tracking. In IV Conference on Articulated Motion and Deformable Objects (AMDO 2006), LNCS 4069, pp. 414–423.
Francisco Javier Orozco, Jordi Gonzalez, Ignasi Rius, & Xavier Roca. (2007). Hierarchical Eyelid and Face Tracking. In 3rd Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA 2007), J. Marti et al. (Eds.), LNCS 4477, pp. 499–506.
Ivan Huerta, Dani Rowe, Mikhail Mozerov, & Jordi Gonzalez. (2007). Improving Background Subtraction based on a Casuistry of Colour-Motion Segmentation Problems. In 3rd Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA 2007), J. Marti et al. (Eds.), LNCS 4478, pp. 475–482.
Pau Baiget, Carles Fernandez, Xavier Roca, & Jordi Gonzalez. (2007). Automatic Learning of Conceptual Knowledge for the Interpretation of Human Behavior in Video Sequences. In 3rd Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA 2007), J. Marti et al. (Eds.), LNCS 4477, pp. 507–514.
Ivan Huerta, Ariel Amato, Jordi Gonzalez, & Juan J. Villanueva. (2008). Fusing Edge Cues to Handle Colour Problems in Image Segmentation. In Articulated Motion and Deformable Objects, 5th International Conference (AMDO 2008), LNCS 5098, pp. 279–288.
Pau Baiget, Xavier Roca, & Jordi Gonzalez. (2008). Autonomous Virtual Agents for Performance Evaluation of Tracking Algorithms. In Articulated Motion and Deformable Objects, 5th International Conference (AMDO 2008), LNCS 5098, pp. 299–308.
Bhaskar Chakraborty, Marco Pedersoli, & Jordi Gonzalez. (2008). View-Invariant Human Action Detection using Component-Wise HMM of Body Parts. In Articulated Motion and Deformable Objects, 5th International Conference (AMDO 2008), LNCS 5098, pp. 208–217.
Carles Fernandez, Pau Baiget, Xavier Roca, & Jordi Gonzalez. (2009). Exploiting Natural Language Generation in Scene Interpretation. In Human-Centric Interfaces for Ambient Intelligence (Vol. 4, pp. 71–93). Elsevier Science and Tech.
Nataliya Shapovalova, Carles Fernandez, Xavier Roca, & Jordi Gonzalez. (2011). Semantics of Human Behavior in Image Sequences. In Albert Ali Salah & Theo Gevers (Eds.), Computer Analysis of Human Behavior (pp. 151–182). Springer London.
Abstract: Human behavior is contextualized, and understanding the scene in which an action takes place is crucial for assigning proper semantics to that behavior. In this chapter we present a novel approach to scene understanding, with emphasis on the particular case of Human Event Understanding. We introduce a new taxonomy to organize the different semantic levels of the proposed Human Event Understanding framework. This framework contributes to the scene understanding domain by (i) extracting behavioral patterns from the integrative analysis of spatial, temporal, and contextual evidence, and (ii) integrating bottom-up and top-down approaches to Human Event Understanding. We explore how information about interactions between humans and their environment influences the performance of activity recognition, and how this can be extrapolated to the temporal domain in order to draw higher-level inferences from human events observed in sequences of images.
Murad Al Haj, Carles Fernandez, Zhanwu Xiong, Ivan Huerta, Jordi Gonzalez, & Xavier Roca. (2011). Beyond the Static Camera: Issues and Trends in Active Vision. In Th.B. Moeslund, A. Hilton, V. Krüger, & L. Sigal (Eds.), Visual Analysis of Humans: Looking at People (pp. 11–30). Springer London.
Abstract: Maximizing both area coverage and per-target resolution is highly desirable in many applications of computer vision. However, with a limited number of cameras viewing a scene, the two objectives are contradictory. This chapter is dedicated to active vision systems, which try to achieve a trade-off between these two aims, and examines the use of high-level reasoning in such scenarios. The chapter starts by introducing different approaches to active camera configurations. A single active camera system for tracking a moving object is then developed, offering the reader first-hand understanding of the issues involved. Another section discusses practical considerations in building an active vision platform, taking as an example a multi-camera system developed for a European project. The last section of the chapter reflects upon future trends in using semantic factors to drive smartly coordinated active systems.
Pau Baiget, Carles Fernandez, Xavier Roca, & Jordi Gonzalez. (2012). Trajectory-Based Abnormality Categorization for Learning Route Patterns in Surveillance. In Detection and Identification of Rare Audiovisual Cues, Studies in Computational Intelligence (Vol. 384, pp. 87–95). Springer Berlin Heidelberg.
Abstract: The recognition of abnormal behaviors in video sequences has emerged as a hot topic in video understanding research. In particular, an important challenge lies in detecting abnormality automatically, yet there is no convention about the types of anomalies that should be derived from training data. In surveillance, anomalies are typically detected when new observations differ substantially from previously learned behavior models, which represent normality. This paper focuses on properly defining anomalies within trajectory analysis: we propose a hierarchical representation comprising Soft, Intermediate, and Hard Anomalies, which are identified from the extent and nature of the deviation from learned models. To this end, a novel Gaussian Mixture Model representation of learned route patterns creates a probabilistic map of the image plane, which is applied to detect and classify anomalies in real time. Our method overcomes limitations of similar existing approaches and performs correctly even when the tracking is affected by different sources of noise. The reliability of our approach is demonstrated experimentally.
Carles Fernandez, Jordi Gonzalez, Joao Manuel R. S. Tavares, & Xavier Roca. (2013). Towards Ontological Cognitive System. In Topics in Medical Image Processing and Computational Vision (Vol. 8, pp. 87–99). Springer Netherlands.
Abstract: The increasing ubiquity of digital information in our daily lives has positioned video as a favored information vehicle and has given rise to an astonishing volume of social media and surveillance footage. This raises a series of technological demands for automatic video understanding and management, which, together with the attentional limitations of human operators, have motivated the research community to guide its steps towards a better attainment of such capabilities. As a result, current trends in cognitive vision promise to recognize complex events and self-adapt to different environments, while managing and integrating several types of knowledge. Future directions suggest reinforcing the multi-modal fusion of information sources and the communication with end-users.
Marc Castello, Jordi Gonzalez, Ariel Amato, Pau Baiget, Carles Fernandez, Josep M. Gonfaus, et al. (2013). Exploiting Multimodal Interaction Techniques for Video-Surveillance. In Multimodal Interaction in Image and Video Applications Intelligent Systems Reference Library (Vol. 48, pp. 135–151). Springer Berlin Heidelberg.
Abstract: In this chapter we present an example of a video surveillance application that exploits Multimodal Interactive (MI) technologies. The main objective of the so-called VID-Hum prototype was to develop a cognitive artificial system for both the detection and description of a particular set of human behaviors arising from real-world events. The main procedure of the prototype described in this chapter entails: (i) adaptation, since the system adapts itself to the most common behaviors (qualitative data) inferred from tracking (quantitative data), and is thus able to recognize abnormal behaviors; (ii) feedback, since an advanced interface based on Natural Language understanding allows end-users to communicate with the prototype by means of conceptual sentences; and (iii) multimodality, since a virtual avatar has been designed to describe what is happening in the scene, based on the textual interpretations generated by the prototype. Thus, the MI methodology has provided an adequate framework for all these cooperating processes.
Ariel Amato, Ivan Huerta, Mikhail Mozerov, Xavier Roca, & Jordi Gonzalez. (2014). Moving Cast Shadows Detection Methods for Video Surveillance Applications. In Augmented Vision and Reality (Vol. 6, pp. 23–47). Springer Berlin Heidelberg.
Abstract: Moving cast shadows are a major performance concern for a broad range of vision-based surveillance applications because they greatly complicate the object classification task. Several shadow detection methods have been reported in the literature in recent years. They are mainly divided into two domains: one usually works with static images, whereas the second uses image sequences, namely video content. Although both cases can be analyzed analogously, they differ in their fields of application. In the first case, shadow detection methods can be exploited to obtain additional geometric and semantic cues about the shape and position of the casting object ('shape from shadows') as well as the localization of the light source. In the second case, the main purpose is usually change detection, scene matching, or surveillance (typically in a background subtraction context). Shadows can in fact negatively modify the shape and color of the target object and therefore affect the performance of scene analysis and interpretation in many applications. This chapter mainly reviews shadow detection methods, as well as their taxonomies, related to the second case, thus aiming at those shadows which are associated with moving objects (moving shadows).