PT Journal
AU Carles Fernandez; Pau Baiget; Xavier Roca; Jordi Gonzalez
TI Determining the Best Suited Semantic Events for Cognitive Surveillance
SO Expert Systems with Applications
JI EXSY
PY 2011
BP 4068–4079
VL 38
IS 4
DI 10.1016/j.eswa.2010.09.070
DE Cognitive surveillance; Event modeling; Content-based video retrieval; Ontologies; Advanced user interfaces
AB State-of-the-art systems on cognitive surveillance identify and describe complex events in selected domains, thus providing end-users with tools to easily access the contents of massive video footage. Nevertheless, as the complexity of events increases in semantics and the types of indoor/outdoor scenarios diversify, it becomes difficult to assess which events describe better the scene, and how to model them at a pixel level to fulfill natural language requests. We present an ontology-based methodology that guides the identification, step-by-step modeling, and generalization of the most relevant events to a specific domain. Our approach considers three steps: (1) end-users provide textual evidence from surveilled video sequences; (2) transcriptions are analyzed top-down to build the knowledge bases for event description; and (3) the obtained models are used to generalize event detection to different image sequences from the surveillance domain. This framework produces user-oriented knowledge that improves on existing advanced interfaces for video indexing and retrieval, by determining the best suited events for video understanding according to end-users. We have conducted experiments with outdoor and indoor scenes showing thefts, chases, and vandalism, demonstrating the feasibility and generalization of this proposal.
ER