Carles Sanchez, Antonio Esteban Lansaque, Agnes Borras, Marta Diez-Ferrer, Antoni Rosell, & Debora Gil. (2017). Towards a Videobronchoscopy Localization System from Airway Centre Tracking. In 12th International Conference on Computer Vision Theory and Applications (pp. 352–359).
Abstract: Bronchoscopists use fluoroscopy to guide flexible bronchoscopy to the lesion to be biopsied without any kind of incision. Since fluoroscopy is an X-ray-based imaging technique, subjects exposed to it face an increased risk of developmental problems and cancer, so minimizing radiation is crucial. Alternative guiding systems such as electromagnetic navigation require specific equipment, increase the cost of the clinical procedure and still require fluoroscopy. In this paper we propose an image-based guiding system built on the extraction of airway centres from intra-operative videos. These anatomical landmarks are matched to the airway centreline extracted from a pre-planned CT to indicate the best path to the nodule. We present a feasibility study of our navigation system using simulated bronchoscopic videos and a multi-expert validation of landmark extraction in 3 intra-operative ultrathin explorations.
Keywords: Video-bronchoscopy; Lung cancer diagnosis; Airway lumen detection; Region tracking; Guided bronchoscopy navigation
|
Carles Sanchez. (2011). Tracheal ring detection in bronchoscopy (F. Javier Sanchez & Debora Gil, Eds.) (Vol. 168). Master's thesis.
Abstract: Endoscopy is the process in which a camera is introduced inside the human body.
Given that endoscopy provides realistic images (in contrast to other modalities) and allows non-invasive, minimally interventional procedures (which can aid in diagnosis and surgical interventions), its use has spread during the last decades.
In this project we focus on bronchoscopic procedures, during which the camera is introduced through the trachea in order to diagnose the patient. The diagnostic interventions focus on: degree of stenosis (reduction in tracheal area), prostheses, or early diagnosis of tumors. In the first case, assessment of the luminal area and calculation of the diameters of the tracheal rings are required. A main limitation is that the whole process is done by hand,
which means that the doctor takes all the measurements and decisions just by looking at the screen. As far as we know, there is no computational framework to help doctors in the diagnosis.
This project consists of analysing bronchoscopic videos in order to extract useful information for the diagnosis of the degree of stenosis. In particular, we focus on segmentation of the tracheal rings. As a result of this project, several strategies for detecting tracheal rings have been implemented in order to compare their performance.
Keywords: Bronchoscopy, tracheal ring, segmentation
|
Carles Sanchez. (2014). Tracheal Structure Characterization using Geometric and Appearance Models for Efficient Assessment of Stenosis in Videobronchoscopy (F. Javier Sanchez, Debora Gil, & Jorge Bernal, Eds.). Ph.D. thesis, Ediciones Graficas Rey.
Abstract: Recent advances in endoscopic devices have increased their use for minimally invasive diagnostic and intervention procedures. Among all endoscopic modalities, bronchoscopy is one of the most frequent, with around 261 million procedures per year. Although bronchoscopy is widely used across clinical facilities, it presents some drawbacks, the most prevalent being the reliance on visual inspection for the assessment of anatomical measurements. In particular, inaccuracies in the estimation of the degree of stenosis (the percentage of obstructed airway) decrease its diagnostic yield and might lead to erroneous treatments. An objective computation of tracheal stenosis in bronchoscopy videos would constitute a breakthrough for this non-invasive technique and a reduction in treatment cost.
This thesis takes the first steps towards reliable on-line extraction of anatomical information from videobronchoscopy for the computation of objective measures. In particular, we focus on the computation of the degree of stenosis, which is obtained by comparing the area delimited by a healthy tracheal ring and the stenosed lumen. Reliable extraction of airway structures in interventional videobronchoscopy is a challenging task, mainly due to the large variety of acquisition conditions (positions and illumination), devices (different digitalizations) and, in videos acquired in the operating room, the unpredictable presence of surgical devices (such as probe ends). This thesis contributes to on-line stenosis assessment in several ways. We propose a parametric strategy for the extraction of lumen and tracheal ring regions based on the characterization of their geometry and appearance, which guides a deformable model. The geometric and appearance characterization is based on a physical model describing the way bronchoscopy images are obtained and includes local and global descriptions. In order to ensure systematic applicability, we present a statistical framework to select the optimal parameters of our method. Experiments performed on the first public annotated database show that the performance of our method is comparable to that of clinicians, and its computation time allows for an on-line implementation in the operating room.
|
Carles Onielfa, Carles Casacuberta, & Sergio Escalera. (2022). Influence in Social Networks Through Visual Analysis of Image Memes. In Artificial Intelligence Research and Development (Vol. 356, pp. 71–80).
Abstract: Memes evolve and mutate through their diffusion in social media. They have the potential to propagate ideas and, by extension, products. Many studies have focused on memes, but none so far, to our knowledge, on the users that post them, their relationships, and the reach of their influence. In this article, we define a meme influence graph together with suitable metrics to visualize and quantify influence between users who post memes, and we also describe a process to implement our definitions using a new approach to meme detection based on text-to-image area ratio and contrast. After applying our method to a set of users of the social media platform Instagram, we conclude that our metrics add information to already existing user characteristics.
|
Carles Fernandez, Xavier Roca, & Jordi Gonzalez. (2008). Providing Automatic Multilingual Text Generation to Artificial Cognitive Systems.
|
Carles Fernandez, Pau Baiget, Xavier Roca, & Jordi Gonzalez. (2007). Semantic Annotation of Complex Human Scenes for Multimedia Surveillance. In AI*IA 2007: Artificial Intelligence and Human-Oriented Computing, 10th Congress of the Italian Association for Artificial Intelligence (Vol. 4733, pp. 698–709). LNCS.
|
Carles Fernandez, Pau Baiget, Xavier Roca, & Jordi Gonzalez. (2007). Natural Language Descriptions of Human Behavior from Video Sequences. In Advances in Artificial Intelligence, 30th Annual Conference on Artificial Intelligence (Vol. 4667, pp. 279–292). LNCS.
|
Carles Fernandez, Pau Baiget, Xavier Roca, & Jordi Gonzalez. (2008). Interpretation of Complex Situations in a Semantic-based Surveillance Framework. Signal Processing: Image Communication, Special Issue on Semantic Analysis for Interactive Multimedia Services, 554–569.
Abstract: The integration of cognitive capabilities in computer vision systems requires both high semantic expressiveness and the capacity to deal with the high computational costs of analysing large amounts of data. This contribution describes a cognitive vision system conceived to automatically provide high-level interpretations of complex real-time situations in outdoor and indoor scenarios, and to eventually maintain communication with casual end users in multiple languages. The main contributions are: (i) the design of an integrative multilevel architecture for cognitive surveillance purposes; (ii) the proposal of a coherent taxonomy of knowledge to guide the process of interpretation, which leads to the conception of a situation-based ontology; (iii) the use of situational analysis for content detection and a progressive interpretation of semantically rich scenes, by managing incomplete or uncertain knowledge; and (iv) the use of such an ontological background to enable multilingual capabilities and advanced end-user interfaces. Experimental results are provided to show the feasibility of the proposed approach.
Keywords: Cognitive vision system; Situation analysis; Applied ontologies
|
Carles Fernandez, Pau Baiget, Xavier Roca, & Jordi Gonzalez. (2009). Exploiting Natural Language Generation in Scene Interpretation. In Human-Centric Interfaces for Ambient Intelligence (Vol. 4, pp. 71–93). Elsevier Science and Tech.
|
Carles Fernandez, Pau Baiget, Xavier Roca, & Jordi Gonzalez. (2011). Determining the Best Suited Semantic Events for Cognitive Surveillance. EXSY - Expert Systems with Applications, 38(4), 4068–4079.
Abstract: State-of-the-art systems on cognitive surveillance identify and describe complex events in selected domains, thus providing end-users with tools to easily access the contents of massive video footage. Nevertheless, as the complexity of events increases in semantics and the types of indoor/outdoor scenarios diversify, it becomes difficult to assess which events better describe the scene, and how to model them at the pixel level to fulfill natural language requests. We present an ontology-based methodology that guides the identification, step-by-step modeling, and generalization of the most relevant events to a specific domain. Our approach considers three steps: (1) end-users provide textual evidence from surveilled video sequences; (2) transcriptions are analyzed top-down to build the knowledge bases for event description; and (3) the obtained models are used to generalize event detection to different image sequences from the surveillance domain. This framework produces user-oriented knowledge that improves on existing advanced interfaces for video indexing and retrieval, by determining the best suited events for video understanding according to end-users. We have conducted experiments with outdoor and indoor scenes showing thefts, chases, and vandalism, demonstrating the feasibility and generalization of this proposal.
Keywords: Cognitive surveillance; Event modeling; Content-based video retrieval; Ontologies; Advanced user interfaces
|
Carles Fernandez, Pau Baiget, Xavier Roca, & Jordi Gonzalez. (2011). Augmenting Video Surveillance Footage with Virtual Agents for Incremental Event Evaluation. PRL - Pattern Recognition Letters, 32(6), 878–889.
Abstract: The fields of segmentation, tracking and behavior analysis demand challenging video resources to test, in a scalable manner, complex scenarios such as crowded environments or scenes with rich semantics. Nevertheless, existing public databases cannot scale the number of appearing agents, which would be useful to study long-term occlusions and crowds. Moreover, creating these resources is expensive and often too particularized to specific needs. We propose an augmented reality framework to increase the complexity of image sequences in terms of occlusions and crowds, in a scalable and controllable manner. Existing datasets can be extended with augmented sequences containing virtual agents. Such sequences are automatically annotated, thus facilitating evaluation in terms of segmentation, tracking, and behavior recognition. In order to easily specify the desired contents, we propose a natural language interface to convert input sentences into virtual agent behaviors. Experimental tests and validation in indoor, street, and soccer environments are provided to show the feasibility of the proposed approach in terms of robustness, scalability, and semantics.
|
Carles Fernandez, Pau Baiget, & Jordi Gonzalez. (2008). Cognitive-Guided Semantic Exploitation in Video Surveillance Interfaces. In First International Workshop on Tracking Humans for the Evaluation of their Motion in Image Sequences, BMVC 2008 (pp. 53–60).
|
Carles Fernandez, Jordi Gonzalez, & Xavier Roca. (2010). Automatic Learning of Background Semantics in Generic Surveilled Scenes. In 11th European Conference on Computer Vision (Vol. 6313, pp. 678–692). LNCS. Springer Berlin Heidelberg.
Abstract: Advanced surveillance systems for behavior recognition in outdoor traffic scenes depend strongly on the particular configuration of the scenario. Scene-independent trajectory analysis techniques statistically infer semantics in locations where motion occurs, and such inferences are typically limited to abnormality. Thus, it is interesting to design contributions that automatically categorize more specific semantic regions. State-of-the-art approaches for unsupervised scene labeling exploit trajectory data to segment areas like sources, sinks, or waiting zones. Our method, in addition, incorporates scene-independent knowledge to assign more meaningful labels like crosswalks, sidewalks, or parking spaces. First, a spatiotemporal scene model is obtained from trajectory analysis. Subsequently, a so-called GI-MRF inference process reinforces spatial coherence, and incorporates taxonomy-guided smoothness constraints. Our method achieves automatic and effective labeling of conceptual regions in urban scenarios, and is robust to tracking errors. Experimental validation on 5 surveillance databases has been conducted to assess the generality and accuracy of the segmentations. The resulting scene models are used for model-based behavior analysis.
|
Carles Fernandez, Jordi Gonzalez, Joao Manuel R. S. Tavares, & Xavier Roca. (2013). Towards Ontological Cognitive System. In Topics in Medical Image Processing and Computational Vision (Vol. 8, pp. 87–99). Springer Netherlands.
Abstract: The increasing ubiquitousness of digital information in our daily lives has positioned video as a favored information vehicle, and given rise to an astonishing generation of social media and surveillance footage. This raises a series of technological demands for automatic video understanding and management, which together with the compromising attentional limitations of human operators, have motivated the research community to guide its steps towards a better attainment of such capabilities. As a result, current trends on cognitive vision promise to recognize complex events and self-adapt to different environments, while managing and integrating several types of knowledge. Future directions suggest to reinforce the multi-modal fusion of information sources and the communication with end-users.
|
Carles Fernandez, & Jordi Gonzalez. (2007). Ontology for Semantic Integration in a Cognitive Surveillance System. In Semantic Multimedia, 2nd International Conference on Semantics and Digital Media Technologies (Vol. 4816, pp. 263–263). LNCS.
|