|
Pejman Rasti, Salma Samiei, Mary Agoyi, Sergio Escalera, & Gholamreza Anbarjafari. (2016). Robust non-blind color video watermarking using QR decomposition and entropy analysis. JVCIR - Journal of Visual Communication and Image Representation, 38, 838–847.
Abstract: Issues such as content identification, document and image security, audience measurement, ownership and copyright among others can be settled by the use of digital watermarking. Many recent video watermarking methods show drops in visual quality of the sequences. The present work addresses the aforementioned issue by introducing a robust and imperceptible non-blind color video frame watermarking algorithm. The method divides frames into moving and non-moving parts. The non-moving part of each color channel is processed separately using a block-based watermarking scheme. Blocks with an entropy lower than the average entropy of all blocks are subject to a further process for embedding the watermark image. Finally a watermarked frame is generated by adding moving parts to it. Several signal processing attacks are applied to each watermarked frame in order to perform experiments and are compared with some recent algorithms. Experimental results show that the proposed scheme is imperceptible and robust against common signal processing attacks.
Keywords: Video watermarking; QR decomposition; Discrete Wavelet Transformation; Chirp Z-transform; Singular value decomposition; Orthogonal–triangular decomposition
|
|
|
Cristina Palmero, Albert Clapes, Chris Bahnsen, Andreas Møgelmose, Thomas B. Moeslund, & Sergio Escalera. (2016). Multi-modal RGB-Depth-Thermal Human Body Segmentation. IJCV - International Journal of Computer Vision, 118(2), 217–239.
Abstract: This work addresses the problem of human body segmentation from multi-modal visual cues as a first stage of automatic human behavior analysis. We propose a novel RGB–depth–thermal dataset along with a multi-modal segmentation baseline. The several modalities are registered using a calibration device and a registration algorithm. Our baseline extracts regions of interest using background subtraction, defines a partitioning of the foreground regions into cells, computes a set of image features on those cells using different state-of-the-art feature extractions, and models the distribution of the descriptors per cell using probabilistic models. A supervised learning algorithm then fuses the output likelihoods over cells in a stacked feature vector representation. The baseline, using Gaussian mixture models for the probabilistic modeling and Random Forest for the stacked learning, is superior to other state-of-the-art methods, obtaining an overlap above 75 % on the novel dataset when compared to the manually annotated ground-truth of human segmentations.
Keywords: Human body segmentation; RGB ; Depth Thermal
|
|
|
Gerard Canal, Sergio Escalera, & Cecilio Angulo. (2016). A Real-time Human-Robot Interaction system based on gestures for assistive scenarios. CVIU - Computer Vision and Image Understanding, 149, 65–77.
Abstract: Natural and intuitive human interaction with robotic systems is a key point to develop robots assisting people in an easy and effective way. In this paper, a Human Robot Interaction (HRI) system able to recognize gestures usually employed in human non-verbal communication is introduced, and an in-depth study of its usability is performed. The system deals with dynamic gestures such as waving or nodding which are recognized using a Dynamic Time Warping approach based on gesture specific features computed from depth maps. A static gesture consisting in pointing at an object is also recognized. The pointed location is then estimated in order to detect candidate objects the user may refer to. When the pointed object is unclear for the robot, a disambiguation procedure by means of either a verbal or gestural dialogue is performed. This skill would lead to the robot picking an object in behalf of the user, which could present difficulties to do it by itself. The overall system — which is composed by a NAO and Wifibot robots, a KinectTM v2 sensor and two laptops — is firstly evaluated in a structured lab setup. Then, a broad set of user tests has been completed, which allows to assess correct performance in terms of recognition rates, easiness of use and response times.
Keywords: Gesture recognition; Human Robot Interaction; Dynamic Time Warping; Pointing location estimation
|
|
|
Isabelle Guyon, Imad Chaabane, Hugo Jair Escalante, Sergio Escalera, Damir Jajetic, James Robert Lloyd, et al. (2016). A brief Review of the ChaLearn AutoML Challenge: Any-time Any-dataset Learning without Human Intervention. In AutoML Workshop (pp. 1–8).
Abstract: The ChaLearn AutoML Challenge team conducted a large scale evaluation of fully automatic, black-box learning machines for feature-based classification and regression problems. The test bed was composed of 30 data sets from a wide variety of application domains and ranged across different types of complexity. Over six rounds, participants succeeded in delivering AutoML software capable of being trained and tested without human intervention. Although improvements can still be made to close the gap between human-tweaked and AutoML models, this competition contributes to the development of fully automated environments by challenging practitioners to solve problems under specific constraints and sharing their approaches; the platform will remain available for post-challenge submissions at http://codalab.org/AutoML.
Keywords: AutoML Challenge; machine learning; model selection; meta-learning; repre- sentation learning; active learning
|
|
|
Mohammad Ali Bagheri, Qigang Gao, & Sergio Escalera. (2016). Action Recognition by Pairwise Proximity Function Support Vector Machines with Dynamic Time Warping Kernels. In 29th Canadian Conference on Artificial Intelligence (Vol. 9673, pp. 3–14). Springer International Publishing.
Abstract: In the context of human action recognition using skeleton data, the 3D trajectories of joint points may be considered as multi-dimensional time series. The traditional recognition technique in the literature is based on time series dis(similarity) measures (such as Dynamic Time Warping). For these general dis(similarity) measures, k-nearest neighbor algorithms are a natural choice. However, k-NN classifiers are known to be sensitive to noise and outliers. In this paper, a new class of Support Vector Machine that is applicable to trajectory classification, such as action recognition, is developed by incorporating an efficient time-series distances measure into the kernel function. More specifically, the derivative of Dynamic Time Warping (DTW) distance measure is employed as the SVM kernel. In addition, the pairwise proximity learning strategy is utilized in order to make use of non-positive semi-definite (PSD) kernels in the SVM formulation. The recognition results of the proposed technique on two action recognition datasets demonstrates the ourperformance of our methodology compared to the state-of-the-art methods. Remarkably, we obtained 89 % accuracy on the well-known MSRAction3D dataset using only 3D trajectories of body joints obtained by Kinect
|
|
|
Jun Wan, Yibing Zhao, Shuai Zhou, Isabelle Guyon, & Sergio Escalera. (2016). ChaLearn Looking at People RGB-D Isolated and Continuous Datasets for Gesture Recognition. In 29th IEEE Conference on Computer Vision and Pattern Recognition Worshops.
Abstract: In this paper, we present two large video multi-modal datasets for RGB and RGB-D gesture recognition: the ChaLearn LAP RGB-D Isolated Gesture Dataset (IsoGD)and the Continuous Gesture Dataset (ConGD). Both datasets are derived from the ChaLearn Gesture Dataset
(CGD) that has a total of more than 50000 gestures for the “one-shot-learning” competition. To increase the potential of the old dataset, we designed new well curated datasets composed of 249 gesture labels, and including 47933 gestures manually labeled the begin and end frames in sequences.Using these datasets we will open two competitions
on the CodaLab platform so that researchers can test and compare their methods for “user independent” gesture recognition. The first challenge is designed for gesture spotting
and recognition in continuous sequences of gestures while the second one is designed for gesture classification from segmented data. The baseline method based on the bag of visual words model is also presented.
|
|
|
Florin Popescu, Stephane Ayache, Sergio Escalera, Xavier Baro, Cecile Capponi, Patrick Panciatici, et al. (2016). From geospatial observations of ocean currents to causal predictors of spatio-economic activity using computer vision and machine learning. In European Geosciences Union General Assembly (Vol. 18).
Abstract: The big data transformation currently revolutionizing science and industry forges novel possibilities in multimodal analysis scarcely imaginable only a decade ago. One of the important economic and industrial problems that stand to benefit from the recent expansion of data availability and computational prowess is the prediction of electricity demand and renewable energy generation. Both are correlates of human activity: spatiotemporal energy consumption patterns in society are a factor of both demand (weather dependent) and supply, which determine cost – a relation expected to strengthen along with increasing renewable energy dependence. One of the main drivers of European weather patterns is the activity of the Atlantic Ocean and in particular its dominant Northern Hemisphere current: the Gulf Stream. We choose this particular current as a test case in part due to larger amount of relevant data and scientific literature available for refinement of analysis techniques.
This data richness is due not only to its economic importance but also to its size being clearly visible in radar and infrared satellite imagery, which makes it easier to detect using Computer Vision (CV). The power of CV techniques makes basic analysis thus developed scalable to other smaller and less known, but still influential, currents, which are not just curves on a map, but complex, evolving, moving branching trees in 3D projected onto a 2D image.
We investigate means of extracting, from several image modalities (including recently available Copernicus radar and earlier Infrared satellites), a parameterized presentation of the state of the Gulf Stream and its environment that is useful as feature space representation in a machine learning context, in this case with the EC’s H2020-sponsored ‘See.4C’ project, in the context of which data scientists may find novel predictors of spatiotemporal energy flow. Although automated extractors of Gulf Stream position exist, they differ in methodology and result. We shall attempt to extract more complex feature representation including branching points, eddies and parameterized changes in transport and velocity. Other related predictive features will be similarly developed, such as inference of deep water flux long the current path and wider spatial scale features such as Hough transform, surface turbulence indicators and temperature gradient indexes along with multi-time scale analysis of ocean height and temperature dynamics. The geospatial imaging and ML community may therefore benefit from a baseline of open-source techniques useful and expandable to other related prediction and/or scientific analysis tasks.
|
|
|
Mohammad Ali Bagheri, Qigang Gao, & Sergio Escalera. (2016). Support Vector Machines with Time Series Distance Kernels for Action Classification. In IEEE Winter Conference on Applications of Computer Vision (pp. 1–7).
Abstract: Despite the outperformance of Support Vector Machine (SVM) on many practical classification problems, the algorithm is not directly applicable to multi-dimensional trajectories having different lengths. In this paper, a new class of SVM that is applicable to trajectory classification, such as action recognition, is developed by incorporating two efficient time-series distances measures into the kernel function.
Dynamic Time Warping and Longest Common Subsequence distance measures along with their derivatives are
employed as the SVM kernel. In addition, the pairwise proximity learning strategy is utilized in order to make use of non-positive semi-definite kernels in the SVM formulation. The proposed method is employed for a challenging classification problem: action recognition by depth cameras using only skeleton data; and evaluated on three benchmark action datasets. Experimental results demonstrate the outperformance of our methodology compared to the state-ofthe-art on the considered datasets.
|
|
|
Debora Gil, Sergio Vera, Agnes Borras, Albert Andaluz, & Miguel Angel Gonzalez Ballester. (2017). Anatomical Medial Surfaces with Efficient Resolution of Branches Singularities. MIA - Medical Image Analysis, 35, 390–402.
Abstract: Medial surfaces are powerful tools for shape description, but their use has been limited due to the sensibility existing methods to branching artifacts. Medial branching artifacts are associated to perturbations of the object boundary rather than to geometric features. Such instability is a main obstacle for a condent application in shape recognition and description. Medial branches correspond to singularities of the medial surface and, thus, they are problematic for existing morphological and energy-based algorithms. In this paper, we use algebraic geometry concepts in an energy-based approach to compute a medial surface presenting a stable branching topology. We also present an ecient GPU-CPU implementation using standard image processing tools. We show the method computational eciency and quality on a custom made synthetic database. Finally, we present some results on a medical imaging application for localization of abdominal pathologies.
Keywords: Medial Representations; Shape Recognition; Medial Branching Stability ; Singular Points
|
|
|
Daniel Hernandez, Alejandro Chacon, Antonio Espinosa, David Vazquez, Juan Carlos Moure, & Antonio Lopez. (2016). Stereo Matching using SGM on the GPU.
Abstract: Dense, robust and real-time computation of depth information from stereo-camera systems is a computationally demanding requirement for robotics, advanced driver assistance systems (ADAS) and autonomous vehicles. Semi-Global Matching (SGM) is a widely used algorithm that propagates consistency constraints along several paths across the image. This work presents a real-time system producing reliable disparity estimation results on the new embedded energy efficient GPU devices. Our design runs on a Tegra X1 at 42 frames per second (fps) for an image size of 640x480, 128 disparity levels, and using 4 path directions for the SGM method.
Keywords: CUDA; Stereo; Autonomous Vehicle
|
|
|
Joan M. Nuñez, Jorge Bernal, F. Javier Sanchez, & Fernando Vilariño. (2015). Growing Algorithm for Intersection Detection (GRAID) in branching patterns. MVAP - Machine Vision and Applications, 26(2), 387–400.
Abstract: Analysis of branching structures represents a very important task in fields such as medical diagnosis, road detection or biometrics. Detecting intersection landmarks Becomes crucial when capturing the structure of a branching pattern. We present a very simple geometrical model to describe intersections in branching structures based on two conditions: Bounded Tangency condition (BT) and Shortest Branch (SB) condition. The proposed model precisely sets a geometrical characterization of intersections and allows us to introduce a new unsupervised operator for intersection extraction. We propose an implementation that handles the consequences of digital domain operation that,unlike existing approaches, is not restricted to a particular scale and does not require the computation of the thinned pattern. The new proposal, as well as other existing approaches in the bibliography, are evaluated in a common framework for the first time. The performance analysis is based on two manually segmented image data sets: DRIVE retinal image database and COLON-VESSEL data set, a newly created data set of vascular content in colonoscopy frames. We have created an intersection landmark ground truth for each data set besides comparing our method in the only existing ground truth. Quantitative results confirm that we are able to outperform state-of-the-art performancelevels with the advantage that neither training nor parameter tuning is needed.
Keywords: Bifurcation ; Crossroad; Intersection ;Retina ; Vessel
|
|
|
Gloria Fernandez Esparrach, Jorge Bernal, Maria Lopez Ceron, Henry Cordova, Cristina Sanchez Montes, Cristina Rodriguez de Miguel, et al. (2016). Exploring the clinical potential of an automatic colonic polyp detection method based on the creation of energy maps. END - Endoscopy, 48(9), 837–842.
Abstract: Background and aims: Polyp miss-rate is a drawback of colonoscopy that increases significantly in small polyps. We explored the efficacy of an automatic computer vision method for polyp detection.
Methods: Our method relies on a model that defines polyp boundaries as valleys of image intensity. Valley information is integrated into energy maps which represent the likelihood of polyp presence.
Results: In 24 videos containing polyps from routine colonoscopies, all polyps were detected in at least one frame. Mean values of the maximum of energy map were higher in frames with polyps than without (p<0.001). Performance improved in high quality frames (AUC= 0.79, 95%CI: 0.70-0.87 vs 0.75, 95%CI: 0.66-0.83). Using 3.75 as maximum threshold value, sensitivity and specificity for detection of polyps were 70.4% (95%CI: 60.3-80.8) and 72.4% (95%CI: 61.6-84.6), respectively.
Conclusion: Energy maps showed a good performance for colonic polyp detection. This indicates a potential applicability in clinical practice.
|
|
|
Gloria Fernandez Esparrach, Jorge Bernal, Cristina Rodriguez de Miguel, Debora Gil, Fernando Vilariño, Henry Cordova, et al. (2016). Utilidad de la visión por computador para la localización de pólipos pequeños y planos. In XIX Reunión Nacional de la Asociación Española de Gastroenterología, Gastroenterology Hepatology (Vol. 39, 94).
|
|
|
Jordina Torrents-Barrena, Aida Valls, Petia Radeva, Meritxell Arenas, & Domenec Puig. (2015). Automatic Recognition of Molecular Subtypes of Breast Cancer in X-Ray images using Segmentation-based Fractal Texture Analysis. In Artificial Intelligence Research and Development (Vol. 277, pp. 247–256). Frontiers in Artificial Intelligence and Applications. IOS Press.
Abstract: Breast cancer disease has recently been classified into four subtypes regarding the molecular properties of the affected tumor region. For each patient, an accurate diagnosis of the specific type is vital to decide the most appropriate therapy in order to enhance life prospects. Nowadays, advanced therapeutic diagnosis research is focused on gene selection methods, which are not robust enough. Hence, we hypothesize that computer vision algorithms can offer benefits to address the problem of discriminating among them through X-Ray images. In this paper, we propose a novel approach driven by texture feature descriptors and machine learning techniques. First, we segment the tumour part through an active contour technique and then, we perform a complete fractal analysis to collect qualitative information of the region of interest in the feature extraction stage. Finally, several supervised and unsupervised classifiers are used to perform multiclass classification of the aforementioned data. The experimental results presented in this paper support that it is possible to establish a relation between each tumor subtype and the extracted features of the patterns revealed on mammograms.
|
|
|
E. Tavalera, Mariella Dimiccoli, Marc Bolaños, Maedeh Aghaei, & Petia Radeva. (2015). Regularized Clustering for Egocentric Video Segmentation. In Pattern Recognition and Image Analysis (pp. 327–336). LNCS. Springer International Publishing.
Abstract: In this paper, we present a new method for egocentric video temporal segmentation based on integrating a statistical mean change detector and agglomerative clustering(AC) within an energyminimization framework. Given the tendency of most AC methods to oversegment video sequences when clustering their frames, we combine the clustering with a concept drift detection technique (ADWIN) that has rigorous guarantee of performances. ADWIN serves as a statistical upper bound for the clustering-based video segmentation. We integrate techniques in an energy-minimization framework that serves disambiguate the decision of both techniques and to complete the segmentation taking into account the temporal continuity of video frames We present experiments over egocentric sets of more than 13.000 images acquired with different wearable cameras, showing that our method outperforms state-of-the-art clustering methods.
Keywords: Temporal video segmentation ; Egocentric videos ; Clustering
|
|