|
Cristina Palmero, Albert Clapes, Chris Bahnsen, Andreas Møgelmose, Thomas B. Moeslund, & Sergio Escalera. (2016). Multi-modal RGB-Depth-Thermal Human Body Segmentation. IJCV - International Journal of Computer Vision, 118(2), 217–239.
Abstract: This work addresses the problem of human body segmentation from multi-modal visual cues as a first stage of automatic human behavior analysis. We propose a novel RGB–depth–thermal dataset along with a multi-modal segmentation baseline. The several modalities are registered using a calibration device and a registration algorithm. Our baseline extracts regions of interest using background subtraction, defines a partitioning of the foreground regions into cells, computes a set of image features on those cells using different state-of-the-art feature extractions, and models the distribution of the descriptors per cell using probabilistic models. A supervised learning algorithm then fuses the output likelihoods over cells in a stacked feature vector representation. The baseline, using Gaussian mixture models for the probabilistic modeling and Random Forest for the stacked learning, is superior to other state-of-the-art methods, obtaining an overlap above 75 % on the novel dataset when compared to the manually annotated ground-truth of human segmentations.
Keywords: Human body segmentation; RGB; Depth; Thermal
|
|
|
Alicia Fornes, Asma Bensalah, Cristina Carmona-Duarte, Jialuo Chen, Miguel A. Ferrer, Andreas Fischer, et al. (2022). The RPM3D Project: 3D Kinematics for Remote Patient Monitoring. In Intertwining Graphonomics with Human Movements. 20th International Conference of the International Graphonomics Society, IGS 2022 (Vol. 13424, pp. 217–226). LNCS.
Abstract: This project explores the feasibility of remote patient monitoring based on the analysis of 3D movements captured with smartwatches. We base our analysis on the Kinematic Theory of Rapid Human Movement. We have validated our research in a real case scenario for stroke rehabilitation at the Guttmann Institute (https://www.guttmann.com/en/) (neurorehabilitation hospital), showing promising results. Our work could have a great impact in remote healthcare applications, improving the medical efficiency and reducing the healthcare costs. Future steps include more clinical validation, developing multi-modal analysis architectures (analysing data from sensors, images, audio, etc.), and exploring the application of our technology to monitor other neurodegenerative diseases.
Keywords: Healthcare applications; Kinematic Theory of Rapid Human Movements; Human activity recognition; Stroke rehabilitation; 3D kinematics
|
|
|
David Roche, Debora Gil, & Jesus Giraldo. (2011). An inference model for analyzing termination conditions of Evolutionary Algorithms. In 14th Congrès Català en Intel·ligencia Artificial (pp. 216–225).
Abstract: In real-world problems, it is mandatory to design a termination condition for Evolutionary Algorithms (EAs) ensuring stabilization close to the unknown optimum. Distribution-based quantities are good candidates as long as suitable parameters are used. A main limitation for application to real-world problems is that such parameters strongly depend on the topology of the objective function, as well as on the EA paradigm used.
We claim that the termination problem would be fully solved if we had a model measuring to what extent a distribution-based quantity asymptotically behaves like the solution accuracy. We present a regression-prediction model that relates any two given quantities and reports if they can be statistically swapped as termination conditions. Our framework is applied to two issues. First, exploring if the parameters involved in the computation of distribution-based quantities influence their asymptotic behavior. Second, to what extent existing distribution-based quantities can be asymptotically exchanged for the accuracy of the EA solution.
Keywords: Evolutionary Computation; Convergence; Termination Conditions; Statistical Inference
|
|
|
Jaume Gibert, Ernest Valveny, & Horst Bunke. (2011). Vocabulary Selection for Graph of Words Embedding. In J. Vitria, J. M. R. Sanches, & M. Hernández (Eds.), 5th Iberian Conference on Pattern Recognition and Image Analysis (Vol. 6669, pp. 216–223). LNCS. Berlin: Springer.
Abstract: The Graph of Words Embedding consists in mapping every graph in a given dataset to a feature vector by counting unary and binary relations between node attributes of the graph. It has been shown to perform well for graphs with discrete label alphabets. In this paper we extend the methodology to graphs with n-dimensional continuous attributes by selecting node representatives. We propose three different discretization procedures for the attribute space and experimentally evaluate the dependence on both the selector and the number of node representatives. In the context of graph classification, the experimental results reveal that on two out of three public databases the proposed extension achieves superior performance over a standard reference system.
|
|
|
Marçal Rusiñol, & Josep Llados. (2010). Efficient Logo Retrieval Through Hashing Shape Context Descriptors. In 9th IAPR International Workshop on Document Analysis Systems (pp. 215–222).
Abstract: In this paper, we present an approach towards the retrieval of words from graphical document images. In graphical documents, due to the presence of multi-oriented characters in non-structured layouts, word indexing is a challenging task. The proposed approach uses recognition results of individual components to form character pairs with the neighboring components. An indexing scheme is designed to store the spatial description of components and to access them efficiently. Given a query text word (ASCII/Unicode format), the character pairs present in it are searched in the document. Next, the retrieved character pairs are linked sequentially to form character strings. Dynamic programming is applied to find different instances of query words. A string edit distance is used here to match the query word as the objective function. Recognition of multi-scale and multi-oriented character components is done using a Support Vector Machine classifier. To consider multi-oriented character strings, the features used in the SVM are invariant to character orientation. Experimental results show that the method is efficient at locating a query word from multi-oriented text in graphical documents.
|
|
|
Oscar Amoros, Sergio Escalera, & Anna Puig. (2011). Adaboost GPU-based Classifier for Direct Volume Rendering. In International Conference on Computer Graphics Theory and Applications (pp. 215–219).
Abstract: In volume visualization, the voxel visibility and materials are set through interactive editing of the Transfer Function. In this paper, we present a two-level GPU-based labeling method that computes at rendering time a set of labeled structures using the Adaboost machine learning classifier. In a pre-processing step, Adaboost trains a binary classifier from a pre-labeled dataset and, in each sample, takes into account a set of features. This binary classifier is a weighted combination of weak classifiers, which can be expressed as simple decision functions estimated on single feature values. Then, at the testing stage, each weak classifier is independently applied on the features of a set of unlabeled samples. We propose an alternative representation of these classifiers that allows a GPU-based parallelized testing stage embedded into the visualization pipeline. The empirical results confirm the OpenCL-based classification of biomedical datasets as a tough problem where an opportunity for further research emerges.
|
|
|
Bogdan Raducanu, & Jordi Vitria. (2007). Incremental Subspace Learning for Cognitive Visual Processes. In Advances in Brain, Vision and Artificial Intelligence, 2nd International Symposium (Vol. 4729, pp. 214–223). LNCS.
|
|
|
Mohamed Ramzy Ibrahim, Robert Benavente, Daniel Ponsa, & Felipe Lumbreras. (2023). Unveiling the Influence of Image Super-Resolution on Aerial Scene Classification. In Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications (Vol. 14469, pp. 214–228). LNCS.
Abstract: Deep learning has made significant advances in recent years, and as a result, it is now in a stage where it can achieve outstanding results in tasks requiring visual understanding of scenes. However, its performance tends to decline when dealing with low-quality images. The advent of super-resolution (SR) techniques has started to have an impact on the field of remote sensing by enabling the restoration of fine details and enhancing image quality, which could help to increase performance in other vision tasks. However, in previous works, contradictory results for scene visual understanding were achieved when SR techniques were applied. In this paper, we present an experimental study on the impact of SR on enhancing aerial scene classification. Through the analysis of different state-of-the-art SR algorithms, including traditional methods and deep learning-based approaches, we unveil the transformative potential of SR in overcoming the limitations of low-resolution (LR) aerial imagery. By enhancing spatial resolution, more fine details are captured, opening the door for an improvement in scene understanding. We also discuss the effect of different image scales on the quality of SR and its effect on aerial scene classification. Our experimental work demonstrates the significant impact of SR on enhancing aerial scene classification compared to LR images, opening new avenues for improved remote sensing applications.
|
|
|
Debora Gil, Oriol Rodriguez-Leor, Petia Radeva, & Aura Hernandez-Sabate. (2007). Assessing Artery Motion Compensation in IVUS. In Computer Analysis Of Images And Patterns (Vol. 4673, pp. 213–220). Lecture Notes in Computer Science. Heidelberg: Springerlink.
Abstract: Cardiac dynamics suppression is a main issue for visual improvement and computation of tissue mechanical properties in IntraVascular UltraSound (IVUS). Although in recent times several motion compensation techniques have arisen, there is a lack of objective evaluation of motion reduction in in vivo pullbacks. We consider that the assessment protocol deserves special attention so that clinical applicability is as reliable as possible. Our work focuses on defining a quality measure and a validation protocol assessing IVUS motion compensation. On the grounds of continuum mechanics laws we introduce a novel score measuring motion reduction in in vivo sequences. Synthetic experiments validate the proposed score as a measure of motion parameter accuracy, while results in in vivo pullbacks show its reliability in clinical cases.
Keywords: validation standards; quality measures; IVUS motion compensation; conservation laws; Fourier development
|
|
|
Cristina Palmero, Jordi Esquirol, Vanessa Bayo, Miquel Angel Cos, Pouya Ahmadmonfared, Joan Salabert, et al. (2017). Automatic Sleep System Recommendation by Multi-modal RBG-Depth-Pressure Anthropometric Analysis. IJCV - International Journal of Computer Vision, 122(2), 212–227.
Abstract: This paper presents a novel system for automatic sleep system recommendation using RGB, depth and pressure information. It consists of a validated clinical knowledge-based model that, along with a set of prescription variables extracted automatically, obtains a personalized bed design recommendation. The automatic process starts by performing multi-part human body RGB-D segmentation combining GrabCut, 3D Shape Context descriptor and Thin Plate Splines, to then extract a set of anthropometric landmark points by applying orthogonal plates to the segmented human body. The extracted variables are introduced to the computerized clinical model to calculate body circumferences, weight, morphotype and Body Mass Index categorization. Furthermore, pressure image analysis is performed to extract pressure values and at-risk points, which are also introduced to the model to eventually obtain the final prescription of mattress, topper, and pillow. We validate the complete system in a set of 200 subjects, showing accurate category classification and high correlation results with respect to manual measures.
Keywords: Sleep system recommendation; RGB-Depth data; Pressure imaging; Anthropometric landmark extraction; Multi-part human body segmentation
|
|
|
Fernando Vilariño, & Petia Radeva. (2003). Cardiac Segmentation with Discriminant Active Contours. (pp. 211–217). IOS Press.
Abstract: Dynamic tracking of heart moving is one relevant target in medical imaging and can be helpful for analyzing heart dynamics in the study of several cardiac diseases. For this aim, a previous segmentation problem of such structures is stated, based on certain relevant features (like edges or intensity levels, textures, etc.). Classical active models have been used, but they fail when overlapping structures or not well-defined contours are present. Automatic feature learning systems may be a powerful tool. Discriminant active contours present optimal results in this kind of problem. They are a kind of deformable models that converge to an optimal object segmentation that dynamically adapts to the object contour. The feature space is designed from a filter bank in order to guarantee the search and learning of the set of relevant features for optimal classification on each part of the object. Tracking of target evolution is obtained through the whole set of images, using information from the actual and previous stages. Feedback systems are implemented to guarantee the minimum well-separable classification set in each segmentation step. Our implementation has been proved with several series of Magnetic Resonance with improved results in segmentation in comparison to previous methods.
|
|
|
Ariel Amato, Felipe Lumbreras, & Angel Sappa. (2014). A General-purpose Crowdsourcing Platform for Mobile Devices. In 9th International Conference on Computer Vision Theory and Applications (Vol. 3, pp. 211–215).
Abstract: This paper presents details of a general purpose micro-task on-demand platform based on the crowdsourcing philosophy. This platform was specifically developed for mobile devices in order to exploit the strengths of such devices; namely: i) massivity, ii) ubiquity and iii) embedded sensors. The combined use of mobile platforms and the crowdsourcing model makes it possible to tackle tasks ranging from the simplest to the most complex. User experience is the highlighted feature of this platform (for both task-proposer and task-solver). Tools suited to each specific task are provided to the task-solver so that the job can be performed in a simpler, faster and more appealing way. Moreover, a task can be easily submitted by just selecting predefined templates, which cover a wide range of possible applications. Examples of its usage in computer vision and computer games are provided, illustrating the potential of the platform.
Keywords: Crowdsourcing Platform; Mobile Crowdsourcing
|
|
|
David Masip, Alexander Todorov, & Jordi Vitria. (2012). The Role of Facial Regions in Evaluating Social Dime. In Rita Cucchiara V. M. Andrea Fusiello (Ed.), 12th European Conference on Computer Vision – Workshops and Demonstrations (Vol. 7584, pp. 210–219). LNCS. Springer Berlin Heidelberg.
Abstract: Facial trait judgments are an important information cue for people. Recent works in the Psychology field have established the basis of face evaluation, defining a set of traits that we evaluate from faces (e.g. dominance, trustworthiness, aggressiveness, attractiveness, threatening or intelligence among others). We rapidly infer information from others' faces: usually after a short period of time (< 1000 ms) we perceive a certain degree of dominance or trustworthiness of another person from the face. Although these perceptions are not necessarily accurate, they influence many important social outcomes (such as the results of elections or court decisions). This topic has also attracted the attention of Computer Vision scientists, and recently a computational model to automatically predict trait evaluations from faces has been proposed. These systems try to mimic human perception by applying machine learning classifiers to a set of labeled data. In this paper we perform an experimental study on the specific facial features that trigger the social inferences. Using previous results from the literature, we propose to use simple similarity maps to evaluate which regions of the face most influence the trait inferences. The correlation analysis is performed using only appearance, and the results from the experiments suggest that each trait is correlated with specific facial characteristics.
Keywords: Workshops and Demonstrations
|
|
|
Bhaskar Chakraborty, Marco Pedersoli, & Jordi Gonzalez. (2008). View-Invariant Human Action Detection using Component-Wise HMM of Body Parts. In Articulated Motion and Deformable Objects, 5th International Conference (Vol. 5098, pp. 208–217). LNCS.
|
|
|
Olivier Penacchio, Laura Dempere-Marco, & Xavier Otazu. (2012). Switching off brightness induction through induction-reversed images. In Perception (Vol. 41, p. 208).
Abstract: Brightness induction is the modulation of the perceived intensity of an area by the luminance of surrounding areas. Although V1 is traditionally regarded as an area mostly responsive to retinal information, neurophysiological evidence suggests that it may explicitly represent brightness information. In this work, we investigate possible neural mechanisms underlying brightness induction. To this end, we consider the model by Z Li (1999 Computation and Neural Systems 10 187-212) which is constrained by neurophysiological data and focuses on the part of V1 responsible for contextual influences. This model, which has proven to account for phenomena such as contour detection and preattentive segmentation, shares with brightness induction the relevant effect of contextual influences. Importantly, the input to our network model derives from a complete multiscale and multiorientation wavelet decomposition, which makes it possible to recover an image reflecting the perceived luminance and successfully accounts for well known psychophysical effects for both static and dynamic contexts. By further considering inverse problem techniques we define induction-reversed images: given a target image, we build an image whose perceived luminance matches the actual luminance of the original stimulus, thus effectively canceling out brightness induction effects. We suggest that induction-reversed images may help remove undesired perceptual effects and can find potential applications in fields such as radiological image interpretation.
|
|