|
Albert Clapes, Julio C. S. Jacques Junior, Carla Morral, & Sergio Escalera. (2020). ChaLearn LAP 2020 Challenge on Identity-preserved Human Detection: Dataset and Results. In 15th IEEE International Conference on Automatic Face and Gesture Recognition (pp. 801–808).
Abstract: This paper summarizes the ChaLearn Looking at People 2020 Challenge on Identity-preserved Human Detection (IPHD). For this purpose, we released a large novel dataset containing more than 112K pairs of spatiotemporally aligned depth and thermal frames (and 175K instances of humans) sampled from 780 sequences. The sequences contain hundreds of non-identifiable people appearing in a mix of in-the-wild and scripted scenarios recorded in public and private places. The competition was divided into three tracks depending on the modalities exploited for the detection: (1) depth, (2) thermal, and (3) depth-thermal fusion. Color was also captured but only used to facilitate the groundtruth annotation. Still, the temporal synchronization of three sensory devices is challenging, so bad temporal matches across modalities can occur. Hence, the labels provided should be considered “weak”, although test frames were carefully selected to minimize this effect and ensure the fairest comparison of the participants’ results. Despite this added difficulty, the results obtained by the participants demonstrate that current fully-supervised methods can cope with it and achieve outstanding detection performance when measured in terms of AP@0.50.
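Note: AP@0.50 counts a detection as correct when its intersection-over-union (IoU) with a groundtruth box is at least 0.50. The sketch below illustrates only that matching criterion. The box format (x1, y1, x2, y2) and the function names are assumptions made for illustration, not the challenge's evaluation code.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(detection, groundtruth, threshold=0.5):
    """True when the detection overlaps the groundtruth enough to count at AP@threshold."""
    return iou(detection, groundtruth) >= threshold
```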
|
|
|
Josep Famadas, Meysam Madadi, Cristina Palmero, & Sergio Escalera. (2020). Generative Video Face Reenactment by AUs and Gaze Regularization. In 15th IEEE International Conference on Automatic Face and Gesture Recognition (pp. 444–451).
Abstract: In this work, we propose an encoder-decoder-like architecture to perform face reenactment in image sequences. Our goal is to transfer the training subject identity to a given test subject. We regularize face reenactment by facial action unit intensity and 3D gaze vector regression. This way, we enforce the network to transfer subtle facial expressions and eye dynamics, providing a more lifelike result. The proposed encoder-decoder receives as input the previous sequence frame stacked to the current frame image of facial landmarks. Thus, the generated frames benefit from appearance and geometry, while keeping temporal coherence for the generated sequence. At test stage, a new target subject with the facial performance of the source subject and the appearance of the training subject is reenacted. Principal component analysis is applied to project the test subject geometry to the closest training subject geometry before reenactment. Evaluation of our proposal shows faster convergence, and more accurate and realistic results in comparison to other architectures without action units and gaze regularization.
|
|
|
Albert Rial-Farras, Meysam Madadi, & Sergio Escalera. (2021). UV-based reconstruction of 3D garments from a single RGB image. In 16th IEEE International Conference on Automatic Face and Gesture Recognition (pp. 1–8).
Abstract: Garments are highly detailed and dynamic objects made up of particles that interact with each other and with other objects, making the task of 2D to 3D garment reconstruction extremely challenging. Therefore, having a lightweight 3D representation capable of modelling fine details is of great importance. This work presents a deep learning framework based on Generative Adversarial Networks (GANs) to reconstruct 3D garment models from a single RGB image. It has the peculiarity of using UV maps to represent 3D data, a lightweight representation capable of dealing with high-resolution details and wrinkles. With this model and kind of 3D representation, we achieve state-of-the-art results on the CLOTH3D++ dataset, generating good quality and realistic garment reconstructions regardless of the garment topology and shape, human pose, occlusions and lighting.
|
|
|
Hugo Bertiche, Meysam Madadi, & Sergio Escalera. (2021). Deep Parametric Surfaces for 3D Outfit Reconstruction from Single View Image. In 16th IEEE International Conference on Automatic Face and Gesture Recognition (pp. 1–8).
Abstract: We present a methodology to retrieve analytical surfaces parametrized as a neural network. Previous works on 3D reconstruction yield point clouds, voxelized objects or meshes. Instead, our approach yields 2-manifolds in Euclidean space through deep learning. To this end, we implement a novel formulation for fully connected layers as parametrized manifolds that allows continuous predictions with differential geometry. Based on this property we propose a novel smoothness loss. Results on the CLOTH3D++ dataset show the possibility to infer different topologies and the benefits of the smoothness term based on differential geometry.
|
|
|
Rain Eric Haamer, Kaustubh Kulkarni, Nasrin Imanpour, Mohammad Ahsanul Haque, Egils Avots, Michelle Breisch, et al. (2018). Changes in Facial Expression as Biometric: A Database and Benchmarks of Identification. In 8th International Workshop on Human Behavior Understanding.
Abstract: Facial dynamics can be considered as unique signatures for discrimination between people. This has become an important topic since many devices can be unlocked using face recognition or verification. In this work, we evaluate the efficacy of the transition frames of video in emotion as compared to the peak emotion frames for identification. For experiments with transition frames we extract features from each frame of the video from a fine-tuned VGG-Face Convolutional Neural Network (CNN) and geometric features from facial landmark points. To model the temporal context of the transition frames we train a Long Short-Term Memory (LSTM) on the geometric and the CNN features. Furthermore, we employ two fusion strategies: first, an early fusion, in which the geometric and the CNN features are stacked and fed to the LSTM. Second, a late fusion, in which the predictions of the LSTMs, trained independently on the two features, are stacked and used with a Support Vector Machine (SVM). Experimental results show that the late fusion strategy gives the best results and the transition frames give better identification results as compared to the peak emotion frames.
|
|
|
Ciprian Corneanu, Meysam Madadi, Sergio Escalera, & Aleix Martinez. (2020). Explainable Early Stopping for Action Unit Recognition. In Faces and Gestures in E-health and welfare workshop (pp. 693–699).
Abstract: A common technique to avoid overfitting when training deep neural networks (DNN) is to monitor the performance in a dedicated validation data partition and to stop training as soon as it saturates. This only focuses on what the model does, while completely ignoring what happens inside it. In this work, we open the “black-box” of DNN in order to perform early stopping. We propose to use a novel theoretical framework that analyses meso-scale patterns in the topology of the functional graph of a network while it trains. Based on it, we decide when it transitions from learning towards overfitting in a more explainable way. We exemplify the benefits of this approach on a state-of-the-art custom DNN that jointly learns local representations and label structure employing an ensemble of dedicated subnetworks. We show that it is practically equivalent in performance to early stopping with patience, the standard early stopping algorithm in the literature. This proves beneficial for AU recognition performance and provides new insights into how learning of AUs occurs in DNNs.
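Note: the baseline named in this abstract, early stopping with patience, can be sketched as follows. This is a minimal illustration of the standard baseline only, not the topology-based criterion the paper proposes. The function name and the use of a precomputed list of validation losses are assumptions.

```python
def early_stop_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch) for early stopping with patience.

    Training stops at the first epoch that is `patience` epochs past the
    best validation loss seen so far. If that never happens, the last
    epoch is returned as the stopping point.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best validation loss: remember it and reset the wait.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` consecutive epochs.
            return epoch, best_epoch
    return len(val_losses) - 1, best_epoch
```

For example, with validation losses [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.9] and a patience of 3, the best loss occurs at epoch 2 and training stops at epoch 5.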
|
|
|
Anna Esposito, Terry Amorese, Nelson Maldonato, Alessandro Vinciarelli, Maria Ines Torres, Sergio Escalera, et al. (2020). Seniors’ ability to decode differently aged facial emotional expressions. In Faces and Gestures in E-health and welfare workshop (pp. 716–722).
|
|
|
Anna Esposito, Italia Cirillo, Antonietta Esposito, Leopoldina Fortunati, Gian Luca Foresti, Sergio Escalera, et al. (2020). Impairments in decoding facial and vocal emotional expressions in high functioning autistic adults and adolescents. In Faces and Gestures in E-health and welfare workshop (pp. 667–674).
|
|
|
Oscar Camara, Estanislao Oubel, Gemma Piella, Simone Balocco, Mathieu De Craene, & Alejandro F. Frangi. (2009). Multi-sequence Registration of Cine, Tagged and Delay-Enhancement MRI with Shift Correction and Steerable Pyramid-Based Detagging. In 5th International Conference on Functional Imaging and Modeling of the Heart (Vol. 5528, pp. 330–338). LNCS. Springer Berlin Heidelberg.
Abstract: In this work, we present a registration framework for cardiac cine MRI (cMRI), tagged (tMRI) and delay-enhancement MRI (deMRI), where the two main obstacles to an accurate alignment between these images have been taken into account: the presence of tags in tMRI and respiration artifacts in all sequences. A steerable pyramid image decomposition has been used for detagging purposes since it is suitable to extract high-order oriented structures by directional adaptive filtering. Shift correction of cMRI is achieved by first maximizing the similarity between the Long Axis and Short Axis cMRI. Subsequently, these shift-corrected images are used as target images in a rigid registration procedure with their corresponding tMRI/deMRI in order to correct their shift. The proposed registration framework has been evaluated by 840 registration tests, considerably improving the alignment of the MR images (mean RMS error of 2.04 mm vs. 5.44 mm).
|
|
|
Debora Gil, Aura Hernandez-Sabate, Antoni Carol, Oriol Rodriguez, & Petia Radeva. (2005). A Deterministic-Statistic Adventitia Detection in IVUS Images. In 3rd International Workshop on Functional Imaging and Modeling of the Heart (pp. 65–74).
Abstract: Plaque analysis in IVUS planes needs accurate intima and adventitia models. The large variety of adventitia descriptors hinders its detection and motivates using a classification strategy for selecting points on the structure. Whatever the set of descriptors used, the selection stage suffers from fake responses due to noise and incomplete true curves. In order to smooth background noise while strengthening responses, we apply a restricted anisotropic filter that homogenizes grey levels along the image's significant structures. Candidate points are extracted by means of a simple semi-supervised adaptive classification of the filtered image response to edge and calcium detectors. The final model is obtained by interpolating the former line segments with an anisotropic contour closing technique based on functional extension principles.
Keywords: Electron microscopy; Unbending; 2D crystal; Interpolation; Approximation
|
|
|
Laura Lopez-Fuentes, Sebastia Massanet, & Manuel Gonzalez-Hidalgo. (2017). Image vignetting reduction via a maximization of fuzzy entropy. In IEEE International Conference on Fuzzy Systems.
Abstract: In many computer vision applications, vignetting is an undesirable effect which must be removed in a pre-processing step. Recently, an algorithm for image vignetting correction has been presented by means of a minimization of log-intensity entropy. This method relies on an increase of the entropy of the image when it is affected by vignetting. In this paper, we propose a novel algorithm to reduce image vignetting via a maximization of the fuzzy entropy of the image. Fuzzy entropy quantifies the fuzziness degree of a fuzzy set and its value is also modified by the presence of vignetting. The experimental results show that this novel algorithm outperforms in most cases the algorithm based on the minimization of log-intensity entropy, both from the qualitative and the quantitative point of view.
|
|
|
Aura Hernandez-Sabate, Lluis Albarracin, Daniel Calvo, & Nuria Gorgorio. (2016). EyeMath: Identifying Mathematics Problem Solving Processes in a RTS Video Game. In 5th International Conference Games and Learning Alliance (Vol. 10056, pp. 50–59). LNCS.
Abstract: Photorealistic virtual environments are crucial for developing and testing automated driving systems in a safe way during trials. As commercially available simulators are expensive and bulky, this paper presents a low-cost, extendable, and easy-to-use (LEE) virtual environment with the aim to highlight its utility for level 3 driving automation. In particular, an experiment is performed using the presented simulator to explore the influence of different variables regarding control transfer of the car after the system was driving autonomously in a highway scenario. The results show that the speed of the car at the time when the system needs to transfer the control to the human driver is critical.
Keywords: Simulation environment; Automated Driving; Driver-Vehicle interaction
|
|
|
Miquel Ferrer, Dimosthenis Karatzas, Ernest Valveny, & Horst Bunke. (2009). A Recursive Embedding Approach to Median Graph Computation. In 7th IAPR – TC–15 Workshop on Graph–Based Representations in Pattern Recognition (Vol. 5534, pp. 113–123). LNCS. Springer Berlin Heidelberg.
Abstract: The median graph has been shown to be a good choice to infer a representative of a set of graphs. It has been successfully applied to graph-based classification and clustering. Nevertheless, its computation is extremely complex. Several approaches have been presented up to now based on different strategies. In this paper we present a new approximate recursive algorithm for median graph computation based on graph embedding into vector spaces. Preliminary experiments on three databases show that this new approach is able to obtain better medians than the existing approaches.
|
|
|
Andreas Fischer, Ching Y. Suen, Volkmar Frinken, Kaspar Riesen, & Horst Bunke. (2013). A Fast Matching Algorithm for Graph-Based Handwriting Recognition. In 9th IAPR – TC15 Workshop on Graph-based Representation in Pattern Recognition (Vol. 7877, pp. 194–203). LNCS. Springer Berlin Heidelberg.
Abstract: The recognition of unconstrained handwriting images is usually based on vectorial representation and statistical classification. Despite their high representational power, graphs are rarely used in this field due to a lack of efficient graph-based recognition methods. Recently, graph similarity features have been proposed to bridge the gap between structural representation and statistical classification by means of vector space embedding. This approach has shown a high performance in terms of accuracy but had shortcomings in terms of computational speed. The time complexity of the Hungarian algorithm that is used to approximate the edit distance between two handwriting graphs is demanding for a real-world scenario. In this paper, we propose a faster graph matching algorithm which is derived from the Hausdorff distance. On the historical Parzival database it is demonstrated that the proposed method achieves a speedup factor of 12.9 without significant loss in recognition accuracy.
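Note: the abstract states that the proposed matching is derived from the Hausdorff distance. The sketch below computes only the classical Hausdorff distance between two point sets, for example the node coordinates of two graphs. It is not the paper's graph matching algorithm, and the function names are assumptions. It assumes both point sets are non-empty.

```python
import math


def directed_hausdorff(points_a, points_b):
    """Largest distance from any point in A to its nearest neighbour in B."""
    return max(min(math.dist(a, b) for b in points_b) for a in points_a)


def hausdorff(points_a, points_b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(
        directed_hausdorff(points_a, points_b),
        directed_hausdorff(points_b, points_a),
    )
```

Each point is compared against every point of the other set, so the cost is quadratic in the number of points, which is lower than the cubic cost of the Hungarian algorithm mentioned in the abstract.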
|
|
|
Jaume Gibert, Ernest Valveny, & Horst Bunke. (2011). Dimensionality Reduction for Graph of Words Embedding. In Xiaoyi Jiang, Miquel Ferrer, & Andrea Torsello (Eds.), 8th IAPR-TC-15 International Workshop. Graph-Based Representations in Pattern Recognition (Vol. 6658, pp. 22–31). LNCS.
Abstract: The Graph of Words Embedding consists in mapping every graph of a given dataset to a feature vector by counting unary and binary relations between node attributes of the graph. While it shows good properties in classification problems, it suffers from high dimensionality and sparsity. These two issues are addressed in this article. Two well-known techniques for dimensionality reduction, kernel principal component analysis (kPCA) and independent component analysis (ICA), are applied to the embedded graphs. We discuss their performance compared to the classification of the original vectors on three different public databases of graphs.
|
|