Wenwen Fu, Zhihong An, Wendong Huang, Haoran Sun, Wenjuan Gong, & Jordi Gonzalez. (2023). A Spatio-Temporal Spotting Network with Sliding Windows for Micro-Expression Detection. ELEC - Electronics, 12(18), 3947.
Abstract: Micro-expressions reveal underlying emotions and are widely applied in political psychology, lie detection, law enforcement and medical care. Micro-expression spotting aims to detect the temporal locations of facial expressions from video sequences and is a crucial task in micro-expression recognition. In this study, the problem of micro-expression spotting is formulated as micro-expression classification per frame. We propose an effective spotting model with sliding windows called the spatio-temporal spotting network. The method involves a sliding window detection mechanism, combines the spatial features from the local key frames and the global temporal features and performs micro-expression spotting. The experiments are conducted on the CAS(ME)2 database and the SAMM Long Videos database, and the results demonstrate that the proposed method outperforms the state-of-the-art method by 30.58% for the CAS(ME)2 and 23.98% for the SAMM Long Videos according to overall F-scores.
Keywords: micro-expression spotting; sliding window; key frame extraction
|
Anders Skaarup Johansen, Kamal Nasrollahi, Sergio Escalera, & Thomas B. Moeslund. (2023). Who Cares about the Weather? Inferring Weather Conditions for Weather-Aware Object Detection in Thermal Images. AS - Applied Sciences, 13(18).
Abstract: Deployments of real-world object detection systems often experience a degradation in performance over time due to concept drift. Systems that leverage thermal cameras are especially susceptible because the respective thermal signatures of objects and their surroundings are highly sensitive to environmental changes. In this study, two types of weather-aware latent conditioning methods are investigated. The proposed method aims to guide two object detectors, (YOLOv5 and Deformable DETR) to become weather-aware. This is achieved by leveraging an auxiliary branch that predicts weather-related information while conditioning intermediate layers of the object detector. While the conditioning methods proposed do not directly improve the accuracy of baseline detectors, it can be observed that conditioned networks manage to extract a weather-related signal from the thermal images, thus resulting in a decreased miss rate at the cost of increased false positives. The extracted signal appears noisy and is thus challenging to regress accurately. This is most likely a result of the qualitative nature of the thermal sensor; thus, further work is needed to identify an ideal method for optimizing the conditioning branch, as well as to further improve the accuracy of the system.
Keywords: thermal; object detection; concept drift; conditioning; weather recognition
|
Joost Van de Weijer, Robert Benavente, Maria Vanrell, Cordelia Schmid, Ramon Baldrich, Jacob Verbeek, et al. (2012). Color Naming. In Theo Gevers, Arjan Gijsenij, Joost Van de Weijer, & Jan-Mark Geusebroek (Eds.), Color in Computer Vision: Fundamentals and Applications (pp. 287–317). John Wiley & Sons, Ltd.
|
J.S. Cope, P.Remagnino, S.Mannan, Katerine Diaz, Francesc J. Ferri, & P.Wilkin. (2013). Reverse Engineering Expert Visual Observations: From Fixations To The Learning Of Spatial Filters With A Neural-Gas Algorithm. EXWA - Expert Systems with Applications, 40(17), 6707–6712.
Abstract: Human beings can become experts in performing specific vision tasks, for example, doctors analysing medical images, or botanists studying leaves. With sufficient knowledge and experience, people can become very efficient at such tasks. When attempting to perform these tasks with a machine vision system, it would be highly beneficial to be able to replicate the process which the expert undergoes. Advances in eye-tracking technology can provide data to allow us to discover the manner in which an expert studies an image. This paper presents a first step towards utilizing these data for computer vision purposes. A growing-neural-gas algorithm is used to learn a set of Gabor filters which give high responses to image regions which a human expert fixated on. These filters can then be used to identify regions in other images which are likely to be useful for a given vision task. The algorithm is evaluated by learning filters for locating specific areas of plant leaves.
Keywords: Neural gas; Expert vision; Eye-tracking; Fixations
|
Cristina Cañero, & Petia Radeva. (2003). Vesselness enhancement diffusion. PRL - Pattern Recognition Letters, 24(16), 3141–3151.
|
Joan Marc Llargues Asensio, Juan Peralta, Raul Arrabales, Manuel Gonzalez Bedia, Paulo Cortez, & Antonio Lopez. (2014). Artificial Intelligence Approaches for the Generation and Assessment of Believable Human-Like Behaviour in Virtual Characters. EXSY - Expert Systems With Applications, 41(16), 7281–7290.
Abstract: Having artificial agents to autonomously produce human-like behaviour is one of the most ambitious original goals of Artificial Intelligence (AI) and remains an open problem nowadays. The imitation game originally proposed by Turing constitute a very effective method to prove the indistinguishability of an artificial agent. The behaviour of an agent is said to be indistinguishable from that of a human when observers (the so-called judges in the Turing test) cannot tell apart humans and non-human agents. Different environments, testing protocols, scopes and problem domains can be established to develop limited versions or variants of the original Turing test. In this paper we use a specific version of the Turing test, based on the international BotPrize competition, built in a First-Person Shooter video game, where both human players and non-player characters interact in complex virtual environments. Based on our past experience both in the BotPrize competition and other robotics and computer game AI applications we have developed three new more advanced controllers for believable agents: two based on a combination of the CERA–CRANIUM and SOAR cognitive architectures and other based on ADANN, a system for the automatic evolution and adaptation of artificial neural networks. These two new agents have been put to the test jointly with CCBot3, the winner of BotPrize 2010 competition (Arrabales et al., 2012), and have showed a significant improvement in the humanness ratio. Additionally, we have confronted all these bots to both First-person believability assessment (BotPrize original judging protocol) and Third-person believability assessment, demonstrating that the active involvement of the judge has a great impact in the recognition of human-like behaviour.
Keywords: Turing test; Human-like behaviour; Believability; Non-player characters; Cognitive architectures; Genetic algorithm; Artificial neural networks
|
Angel Morera, Angel Sanchez, A. Belen Moreno, Angel Sappa, & Jose F. Velez. (2020). SSD vs. YOLO for Detection of Outdoor Urban Advertising Panels under Multiple Variabilities. SENS - Sensors, 20(16), 4587.
Abstract: This work compares Single Shot MultiBox Detector (SSD) and You Only Look Once (YOLO) deep neural networks for the outdoor advertisement panel detection problem by handling multiple and combined variabilities in the scenes. Publicity panel detection in images offers important advantages both in the real world as well as in the virtual one. For example, applications like Google Street View can be used for Internet publicity and when detecting these ads panels in images, it could be possible to replace the publicity appearing inside the panels by another from a funding company. In our experiments, both SSD and YOLO detectors have produced acceptable results under variable sizes of panels, illumination conditions, viewing perspectives, partial occlusion of panels, complex background and multiple panels in scenes. Due to the difficulty of finding annotated images for the considered problem, we created our own dataset for conducting the experiments. The major strength of the SSD model was the almost elimination of False Positive (FP) cases, situation that is preferable when the publicity contained inside the panel is analyzed after detecting them. On the other side, YOLO produced better panel localization results detecting a higher number of True Positive (TP) panels with a higher accuracy. Finally, a comparison of the two analyzed object detection models with different types of semantic segmentation networks and using the same evaluation metrics is also included.
|
M. Bressan, & Jordi Vitria. (2003). Nonparametric Discriminant Analysis and Nearest Neighbor Classification. PRL - Pattern Recognition Letters, 24(15), 2743–2749.
|
Antonio Lopez, Felipe Lumbreras, & Joan Serrat. (1997). Efficient computation of local creaseness. CVC, Bellaterra (Spain).
|
Fadi Dornaika, & Angel Sappa. (2007). Rigid and Non-rigid Face Motion Tracking by Aligning Texture Maps and Stereo 3D Models. PRL - Pattern Recognition Letters, 28(15), 2116–2126.
|
Sergio Escalera, Alicia Fornes, O. Pujol, Petia Radeva, Gemma Sanchez, & Josep Llados. (2009). Blurred Shape Model for Binary and Grey-level Symbol Recognition. PRL - Pattern Recognition Letters, 30(15), 1424–1433.
Abstract: Many symbol recognition problems require the use of robust descriptors in order to obtain rich information of the data. However, the research of a good descriptor is still an open issue due to the high variability of symbols appearance. Rotation, partial occlusions, elastic deformations, intra-class and inter-class variations, or high variability among symbols due to different writing styles, are just a few problems. In this paper, we introduce a symbol shape description to deal with the changes in appearance that these types of symbols suffer. The shape of the symbol is aligned based on principal components to make the recognition invariant to rotation and reflection. Then, we present the Blurred Shape Model descriptor (BSM), where new features encode the probability of appearance of each pixel that outlines the symbols shape. Moreover, we include the new descriptor in a system to deal with multi-class symbol categorization problems. Adaboost is used to train the binary classifiers, learning the BSM features that better split symbol classes. Then, the binary problems are embedded in an Error-Correcting Output Codes framework (ECOC) to deal with the multi-class case. The methodology is evaluated on different synthetic and real data sets. State-of-the-art descriptors and classifiers are compared, showing the robustness and better performance of the present scheme to classify symbols with high variability of appearance.
|
Ernest Valveny, & Enric Marti. (2003). A model for image generation and symbol recognition through the deformation of lineal shapes. PRL - Pattern Recognition Letters, 24(15), 2857–2867.
Abstract: We describe a general framework for the recognition of distorted images of lineal shapes, which relies on three items: a model to represent lineal shapes and their deformations, a model for the generation of distorted binary images and the combination of both models in a common probabilistic framework, where the generation of deformations is related to an internal energy, and the generation of binary images to an external energy. Then, recognition consists in the minimization of a global energy function, performed by using the EM algorithm. This general framework has been applied to the recognition of hand-drawn lineal symbols in graphic documents.
|
Jaume Gibert, Ernest Valveny, & Horst Bunke. (2012). Feature Selection on Node Statistics Based Embedding of Graphs. PRL - Pattern Recognition Letters, 33(15), 1980–1990.
Abstract: Representing a graph with a feature vector is a common way of making statistical machine learning algorithms applicable to the domain of graphs. Such a transition from graphs to vectors is known as graphembedding. A key issue in graphembedding is to select a proper set of features in order to make the vectorial representation of graphs as strong and discriminative as possible. In this article, we propose features that are constructed out of frequencies of node label representatives. We first build a large set of features and then select the most discriminative ones according to different ranking criteria and feature transformation algorithms. On different classification tasks, we experimentally show that only a small significant subset of these features is needed to achieve the same classification rates as competing to state-of-the-art methods.
Keywords: Structural pattern recognition; Graph embedding; Feature ranking; PCA; Graph classification
|
Jordi Vitria, & J. Llacer. (1996). Reconstructing 3D light microscopic images using the EM algorithm. Pattern Recognition Letters, 1491–1498.
|
David Guillamet, Jordi Vitria, & B. Shiele. (2003). Introducing a weighted non-negative matrix factorization for image classification. PRL - Pattern Recognition Letters, 24(14), 2447–2454.
|