Yuhua Luo, Francisco Jose Perales, & Juan J. Villanueva. (1992). An automatic Rotoscopy System for Human Motion Based on a Biomedical Graphical Model. Computer & Graphics, 16(4), 355–362.
|
B. Gotschy, Matthias S. Keil, H. Klos, & I. Rystau. (1994). Transition from static to dynamic Jahn-Teller distortion in (P(C6 H5)4)2 C60|. Solid State Communications, 92(12), 935–938.
|
Antonio Clavelli, Dimosthenis Karatzas, Josep Llados, Mario Ferraro, & Giuseppe Boccignone. (2014). Modelling task-dependent eye guidance to objects in pictures. CoCom - Cognitive Computation, 6(3), 558–584.
Abstract: 5Y Impact Factor: 1.14 / 3rd (Computer Science, Artificial Intelligence)
We introduce a model of attentional eye guidance based on the rationale that the deployment of gaze is to be considered in the context of a general action-perception loop relying on two strictly intertwined processes: sensory processing, depending on current gaze position, identifies sources of information that are most valuable under the given task; motor processing links such information with the oculomotor act by sampling the next gaze position and thus performing the gaze shift. In such a framework, the choice of where to look next is task-dependent and oriented to classes of objects embedded within pictures of complex scenes. The dependence on task is taken into account by exploiting the value and the payoff of gazing at certain image patches or proto-objects that provide a sparse representation of the scene objects. The different levels of the action-perception loop are represented in probabilistic form and eventually give rise to a stochastic process that generates the gaze sequence. This way the model also accounts for statistical properties of gaze shifts such as individual scan path variability. Results of the simulations are compared either with experimental data derived from publicly available datasets and from our own experiments.
Keywords: Visual attention; Gaze guidance; Value; Payoff; Stochastic fixation prediction
|
Antonio Lopez, Joan Serrat, Cristina Cañero, Felipe Lumbreras, & T. Graf. (2010). Robust lane markings detection and road geometry computation. IJAT - International Journal of Automotive Technology, 11(3), 395–407.
Abstract: Detection of lane markings based on a camera sensor can be a low-cost solution to lane departure and curve-over-speed warnings. A number of methods and implementations have been reported in the literature. However, reliable detection is still an issue because of cast shadows, worn and occluded markings, variable ambient lighting conditions, for example. We focus on increasing detection reliability in two ways. First, we employed an image feature other than the commonly used edges: ridges, which we claim addresses this problem better. Second, we adapted RANSAC, a generic robust estimation method, to fit a parametric model of a pair of lane lines to the image features, based on both ridgeness and ridge orientation. In addition, the model was fitted for the left and right lane lines simultaneously to enforce a consistent result. Four measures of interest for driver assistance applications were directly computed from the fitted parametric model at each frame: lane width, lane curvature, and vehicle yaw angle and lateral offset with regard the lane medial axis. We qualitatively assessed our method in video sequences captured on several road types and under very different lighting conditions. We also quantitatively assessed it on synthetic but realistic video sequences for which road geometry and vehicle trajectory ground truth are known.
Keywords: lane markings
|
Hamdi Dibeklioglu, M.O. Hortas, I. Kosunen, P. Zuzánek, Albert Ali Salah, & Theo Gevers. (2011). Design and implementation of an affect-responsive interactive photo frame. JMUI - Journal on Multimodal User Interfaces, 81–95.
Abstract: This paper describes an affect-responsive interactive photo-frame application that offers its user a different experience with every use. It relies on visual analysis of activity levels and facial expressions of its users to select responses from a database of short video segments. This ever-growing database is automatically prepared by an offline analysis of user-uploaded videos. The resulting system matches its user’s affect along dimensions of valence and arousal, and gradually adapts its response to each specific user. In an extended mode, two such systems are coupled and feed each other with visual content. The strengths and weaknesses of the system are assessed through a usability study, where a Wizard-of-Oz response logic is contrasted with the fully automatic system that uses affective and activity-based features, either alone, or in tandem.
|
Jorge Bernal, Aymeric Histace, Marc Masana, Quentin Angermann, Cristina Sanchez Montes, Cristina Rodriguez de Miguel, et al. (2019). GTCreator: a flexible annotation tool for image-based datasets. IJCAR - International Journal of Computer Assisted Radiology and Surgery, 14(2), 191–201.
Abstract: Abstract Purpose: Methodology evaluation for decision support systems for health is a time consuming-task. To assess performance of polyp detection
methods in colonoscopy videos, clinicians have to deal with the annotation
of thousands of images. Current existing tools could be improved in terms of
exibility and ease of use. Methods:We introduce GTCreator, a exible annotation tool for providing image and text annotations to image-based datasets.
It keeps the main basic functionalities of other similar tools while extending
other capabilities such as allowing multiple annotators to work simultaneously
on the same task or enhanced dataset browsing and easy annotation transfer aiming to speed up annotation processes in large datasets. Results: The
comparison with other similar tools shows that GTCreator allows to obtain
fast and precise annotation of image datasets, being the only one which offers
full annotation editing and browsing capabilites. Conclusions: Our proposed
annotation tool has been proven to be efficient for large image dataset annota-
tion, as well as showing potential of use in other stages of method evaluation
such as experimental setup or results analysis.
Keywords: Annotation tool; Validation Framework; Benchmark; Colonoscopy; Evaluation
|
Debora Gil, Ruth Aris, Agnes Borras, Esmitt Ramirez, Rafael Sebastian, & Mariano Vazquez. (2019). Influence of fiber connectivity in simulations of cardiac biomechanics. IJCAR - International Journal of Computer Assisted Radiology and Surgery, 14(1), 63–72.
Abstract: PURPOSE:
Personalized computational simulations of the heart could open up new improved approaches to diagnosis and surgery assistance systems. While it is fully recognized that myocardial fiber orientation is central for the construction of realistic computational models of cardiac electromechanics, the role of its overall architecture and connectivity remains unclear. Morphological studies show that the distribution of cardiac muscular fibers at the basal ring connects epicardium and endocardium. However, computational models simplify their distribution and disregard the basal loop. This work explores the influence in computational simulations of fiber distribution at different short-axis cuts.
METHODS:
We have used a highly parallelized computational solver to test different fiber models of ventricular muscular connectivity. We have considered two rule-based mathematical models and an own-designed method preserving basal connectivity as observed in experimental data. Simulated cardiac functional scores (rotation, torsion and longitudinal shortening) were compared to experimental healthy ranges using generalized models (rotation) and Mahalanobis distances (shortening, torsion).
RESULTS:
The probability of rotation was significantly lower for ruled-based models [95% CI (0.13, 0.20)] in comparison with experimental data [95% CI (0.23, 0.31)]. The Mahalanobis distance for experimental data was in the edge of the region enclosing 99% of the healthy population.
CONCLUSIONS:
Cardiac electromechanical simulations of the heart with fibers extracted from experimental data produce functional scores closer to healthy ranges than rule-based models disregarding architecture connectivity.
Keywords: Cardiac electromechanical simulations; Diffusion tensor imaging; Fiber connectivity
|
Carles Sanchez, Jorge Bernal, F. Javier Sanchez, Antoni Rosell, Marta Diez-Ferrer, & Debora Gil. (2015). Towards On-line Quantification of Tracheal Stenosis from Videobronchoscopy. IJCAR - International Journal of Computer Assisted Radiology and Surgery, 10(6), 935–945.
|
Sergio Escalera, Oriol Pujol, J. Mauri, & Petia Radeva. (2009). Intravascular Ultrasound Tissue Characterization with Sub-class Error-Correcting Output Codes. Journal of Signal Processing Systems, 55(1-3), 35–47.
Abstract: Intravascular ultrasound (IVUS) represents a powerful imaging technique to explore coronary vessels and to study their morphology and histologic properties. In this paper, we characterize different tissues based on radial frequency, texture-based, and combined features. To deal with the classification of multiple tissues, we require the use of robust multi-class learning techniques. In this sense, error-correcting output codes (ECOC) show to robustly combine binary classifiers to solve multi-class problems. In this context, we propose a strategy to model multi-class classification tasks using sub-classes information in the ECOC framework. The new strategy splits the classes into different sub-sets according to the applied base classifier. Complex IVUS data sets containing overlapping data are learnt by splitting the original set of classes into sub-classes, and embedding the binary problems in a problem-dependent ECOC design. The method automatically characterizes different tissues, showing performance improvements over the state-of-the-art ECOC techniques for different base classifiers. Furthermore, the combination of RF and texture-based features also shows improvements over the state-of-the-art approaches.
|
Meysam Madadi, Hugo Bertiche, & Sergio Escalera. (2021). Deep unsupervised 3D human body reconstruction from a sparse set of landmarks. IJCV - International Journal of Computer Vision, 129, 2499–2512.
Abstract: In this paper we propose the first deep unsupervised approach in human body reconstruction to estimate body surface from a sparse set of landmarks, so called DeepMurf. We apply a denoising autoencoder to estimate missing landmarks. Then we apply an attention model to estimate body joints from landmarks. Finally, a cascading network is applied to regress parameters of a statistical generative model that reconstructs body. Our set of proposed loss functions allows us to train the network in an unsupervised way. Results on four public datasets show that our approach accurately reconstructs the human body from real world mocap data.
|
Cesar de Souza, Adrien Gaidon, Yohann Cabon, Naila Murray, & Antonio Lopez. (2020). Generating Human Action Videos by Coupling 3D Game Engines and Probabilistic Graphical Models. IJCV - International Journal of Computer Vision, 128, 1505–1536.
Abstract: Deep video action recognition models have been highly successful in recent years but require large quantities of manually-annotated data, which are expensive and laborious to obtain. In this work, we investigate the generation of synthetic training data for video action recognition, as synthetic data have been successfully used to supervise models for a variety of other computer vision tasks. We propose an interpretable parametric generative model of human action videos that relies on procedural generation, physics models and other components of modern game engines. With this model we generate a diverse, realistic, and physically plausible dataset of human action videos, called PHAV for “Procedural Human Action Videos”. PHAV contains a total of 39,982 videos, with more than 1000 examples for each of 35 action categories. Our video generation approach is not limited to existing motion capture sequences: 14 of these 35 categories are procedurally-defined synthetic actions. In addition, each video is represented with 6 different data modalities, including RGB, optical flow and pixel-level semantic labels. These modalities are generated almost simultaneously using the Multiple Render Targets feature of modern GPUs. In order to leverage PHAV, we introduce a deep multi-task (i.e. that considers action classes from multiple datasets) representation learning architecture that is able to simultaneously learn from synthetic and real video datasets, even when their action categories differ. Our experiments on the UCF-101 and HMDB-51 benchmarks suggest that combining our large set of synthetic videos with small real-world datasets can boost recognition performance. Our approach also significantly outperforms video representations produced by fine-tuning state-of-the-art unsupervised generative models of videos.
Keywords: Procedural generation; Human action recognition; Synthetic data; Physics
|
Cristina Palmero, Jordi Esquirol, Vanessa Bayo, Miquel Angel Cos, Pouya Ahmadmonfared, Joan Salabert, et al. (2017). Automatic Sleep System Recommendation by Multi-modal RBG-Depth-Pressure Anthropometric Analysis. IJCV - International Journal of Computer Vision, 122(2), 212–227.
Abstract: This paper presents a novel system for automatic sleep system recommendation using RGB, depth and pressure information. It consists of a validated clinical knowledge-based model that, along with a set of prescription variables extracted automatically, obtains a personalized bed design recommendation. The automatic process starts by performing multi-part human body RGB-D segmentation combining GrabCut, 3D Shape Context descriptor and Thin Plate Splines, to then extract a set of anthropometric landmark points by applying orthogonal plates to the segmented human body. The extracted variables are introduced to the computerized clinical model to calculate body circumferences, weight, morphotype and Body Mass Index categorization. Furthermore, pressure image analysis is performed to extract pressure values and at-risk points, which are also introduced to the model to eventually obtain the final prescription of mattress, topper, and pillow. We validate the complete system in a set of 200 subjects, showing accurate category classification and high correlation results with respect to manual measures.
Keywords: Sleep system recommendation; RGB-Depth data Pressure imaging; Anthropometric landmark extraction; Multi-part human body segmentation
|
Cristina Palmero, Albert Clapes, Chris Bahnsen, Andreas Møgelmose, Thomas B. Moeslund, & Sergio Escalera. (2016). Multi-modal RGB-Depth-Thermal Human Body Segmentation. IJCV - International Journal of Computer Vision, 118(2), 217–239.
Abstract: This work addresses the problem of human body segmentation from multi-modal visual cues as a first stage of automatic human behavior analysis. We propose a novel RGB–depth–thermal dataset along with a multi-modal segmentation baseline. The several modalities are registered using a calibration device and a registration algorithm. Our baseline extracts regions of interest using background subtraction, defines a partitioning of the foreground regions into cells, computes a set of image features on those cells using different state-of-the-art feature extractions, and models the distribution of the descriptors per cell using probabilistic models. A supervised learning algorithm then fuses the output likelihoods over cells in a stacked feature vector representation. The baseline, using Gaussian mixture models for the probabilistic modeling and Random Forest for the stacked learning, is superior to other state-of-the-art methods, obtaining an overlap above 75 % on the novel dataset when compared to the manually annotated ground-truth of human segmentations.
Keywords: Human body segmentation; RGB ; Depth Thermal
|
Jiaolong Xu, Sebastian Ramos, David Vazquez, & Antonio Lopez. (2016). Hierarchical Adaptive Structural SVM for Domain Adaptation. IJCV - International Journal of Computer Vision, 119(2), 159–178.
Abstract: A key topic in classification is the accuracy loss produced when the data distribution in the training (source) domain differs from that in the testing (target) domain. This is being recognized as a very relevant problem for many
computer vision tasks such as image classification, object detection, and object category recognition. In this paper, we present a novel domain adaptation method that leverages multiple target domains (or sub-domains) in a hierarchical adaptation tree. The core idea is to exploit the commonalities and differences of the jointly considered target domains.
Given the relevance of structural SVM (SSVM) classifiers, we apply our idea to the adaptive SSVM (A-SSVM), which only requires the target domain samples together with the existing source-domain classifier for performing the desired adaptation. Altogether, we term our proposal as hierarchical A-SSVM (HA-SSVM).
As proof of concept we use HA-SSVM for pedestrian detection, object category recognition and face recognition. In the former we apply HA-SSVM to the deformable partbased model (DPM) while in the rest HA-SSVM is applied to multi-category classifiers. We will show how HA-SSVM is effective in increasing the detection/recognition accuracy with respect to adaptation strategies that ignore the structure of the target data. Since, the sub-domains of the target data are not always known a priori, we shown how HA-SSVM can incorporate sub-domain discovery for object category recognition.
Keywords: Domain Adaptation; Pedestrian Detection
|
Antonio Hernandez, Sergio Escalera, & Stan Sclaroff. (2016). Poselet-basedContextual Rescoring for Human Pose Estimation via Pictorial Structures. IJCV - International Journal of Computer Vision, 118(1), 49–64.
Abstract: In this paper we propose a contextual rescoring method for predicting the position of body parts in a human pose estimation framework. A set of poselets is incorporated in the model, and their detections are used to extract spatial and score-related features relative to other body part hypotheses. A method is proposed for the automatic discovery of a compact subset of poselets that covers the different poses in a set of validation images while maximizing precision. A rescoring mechanism is defined as a set-based boosting classifier that computes a new score for each body joint detection, given its relationship to detections of other body joints and mid-level parts in the image. This new score is incorporated in the pictorial structure model as an additional unary potential, following the recent work of Pishchulin et al. Experiments on two benchmarks show comparable results to Pishchulin et al. while reducing the size of the mid-level representation by an order of magnitude, reducing the execution time by 68 % accordingly.
Keywords: Contextual rescoring; Poselets; Human pose estimation
|