|
W.Win, B.Bao, Q.Xu, Luis Herranz, & Shuqiang Jiang. (2019). Editorial Note: Efficient Multimedia Processing Methods and Applications (Vol. 78).
|
|
|
Jose Garcia-Rodriguez, Isabelle Guyon, Sergio Escalera, Alexandra Psarrou, Andrew Lewis, & Miguel Cazorla. (2017). Editorial: Special Issue on Computational Intelligence for Vision and Robotics. Neural Computing and Applications - Neural Computing and Applications, 28(5), 853–854.
|
|
|
Aura Hernandez-Sabate, Jose Elias Yauri, Pau Folch, Daniel Alvarez, & Debora Gil. (2024). EEG Dataset Collection for Mental Workload Predictions in Flight-Deck Environment. SENS - Sensors, 24(4), 1174.
Abstract: High mental workload reduces human performance and the ability to correctly carry out complex tasks. In particular, aircraft pilots enduring high mental workloads are at high risk of failure, even with catastrophic outcomes. Despite progress, there is still a lack of knowledge about the interrelationship between mental workload and brain functionality, and there is still limited data on flight-deck scenarios. Although recent emerging deep-learning (DL) methods using physiological data have presented new ways to find new physiological markers to detect and assess cognitive states, they demand large amounts of properly annotated datasets to achieve good performance. We present a new dataset of electroencephalogram (EEG) recordings specifically collected for the recognition of different levels of mental workload. The data were recorded from three experiments, where participants were induced to different levels of workload through tasks of increasing cognition demand. The first involved playing the N-back test, which combines memory recall with arithmetical skills. The second was playing Heat-the-Chair, a serious game specifically designed to emphasize and monitor subjects under controlled concurrent tasks. The third was flying in an Airbus320 simulator and solving several critical situations. The design of the dataset has been validated on three different levels: (1) correlation of the theoretical difficulty of each scenario to the self-perceived difficulty and performance of subjects; (2) significant difference in EEG temporal patterns across the theoretical difficulties and (3) usefulness for the training and evaluation of AI models.
|
|
|
Saad Minhas, Aura Hernandez-Sabate, Shoaib Ehsan, & Klaus McDonald Maier. (2022). Effects of Non-Driving Related Tasks during Self-Driving mode. TITS - IEEE Transactions on Intelligent Transportation Systems, 23(2), 1391–1399.
Abstract: Perception reaction time and mental workload have proven to be crucial in manual driving. Moreover, in highly automated cars, where most of the research is focusing on Level 4 Autonomous driving, take-over performance is also a key factor when taking road safety into account. This study aims to investigate how the immersion in non-driving related tasks affects the take-over performance of drivers in given scenarios. The paper also highlights the use of virtual simulators to gather efficient data that can be crucial in easing the transition between manual and autonomous driving scenarios. The use of Computer Aided Simulations is of absolute importance in this day and age since the automotive industry is rapidly moving towards Autonomous technology. An experiment comprising of 40 subjects was performed to examine the reaction times of driver and the influence of other variables in the success of take-over performance in highly automated driving under different circumstances within a highway virtual environment. The results reflect the relationship between reaction times under different scenarios that the drivers might face under the circumstances stated above as well as the importance of variables such as velocity in the success on regaining car control after automated driving. The implications of the results acquired are important for understanding the criteria needed for designing Human Machine Interfaces specifically aimed towards automated driving conditions. Understanding the need to keep drivers in the loop during automation, whilst allowing drivers to safely engage in other non-driving related tasks is an important research area which can be aided by the proposed study.
|
|
|
David Aldavert. (2021). Efficient and Scalable Handwritten Word Spotting on Historical Documents using Bag of Visual Words (Marçal Rusiñol, & Josep Llados, Eds.). Ph.D. thesis, Ediciones Graficas Rey, .
Abstract: Word spotting can be defined as the pattern recognition tasked aimed at locating and retrieving a specific keyword within a document image collection without explicitly transcribing the whole corpus. Its use is particularly interesting when applied in scenarios where Optical Character Recognition performs poorly or can not be used at all. This thesis focuses on such a scenario, word spotting on historical handwritten documents that have been written by a single author or by multiple authors with a similar calligraphy.
This problem requires a visual signature that is robust to image artifacts, flexible to accommodate script variations and efficient to retrieve information in a rapid manner. For this, we have developed a set of word spotting methods that on their foundation use the well known Bag-of-Visual-Words (BoVW) representation. This representation has gained popularity among the document image analysis community to characterize handwritten words
in an unsupervised manner. However, most approaches on this field rely on a basic BoVW configuration and disregard complex encoding and spatial representations. We determine which BoVW configurations provide the best performance boost to a spotting system.
Then, we extend the segmentation-based word spotting, where word candidates are given a priori, to segmentation-free spotting. The proposed approach seeds the document images with overlapping word location candidates and characterizes them with a BoVW signature. Retrieval is achieved comparing the query and candidate signatures and returning the locations that provide a higher consensus. This is a simple but powerful approach that requires a more compact signature than in a segmentation-based scenario. We first
project the BoVW signature into a reduced semantic topics space and then compress it further using Product Quantizers. The resulting signature only requires a few dozen bytes, allowing us to index thousands of pages on a common desktop computer. The final system still yields a performance comparable to the state-of-the-art despite all the information loss during the compression phases.
Afterwards, we also study how to combine different modalities of information in order to create a query-by-X spotting system where, words are indexed using an information modality and queries are retrieved using another. We consider three different information modalities: visual, textual and audio. Our proposal is to create a latent feature space where features which are semantically related are projected onto the same topics. Creating thus a new feature space where information from different modalities can be compared. Later, we consider the codebook generation and descriptor encoding problem. The codebooks used to encode the BoVW signatures are usually created using an unsupervised clustering algorithm and, they require to test multiple parameters to determine which configuration is best for a certain document collection. We propose a semantic clustering algorithm which allows to estimate the best parameter from data. Since gather annotated data is costly, we use synthetically generated word images. The resulting codebook is database agnostic, i. e. a codebook that yields a good performance on document collections that use the same script. We also propose the use of an additional codebook to approximate descriptors and reduce the descriptor encoding
complexity to sub-linear.
Finally, we focus on the problem of signatures dimensionality. We propose a new symbol probability signature where each bin represents the probability that a certain symbol is present a certain location of the word image. This signature is extremely compact and combined with compression techniques can represent word images with just a few bytes per signature.
|
|
|
Adriana Romero, Simeon Petkov, Carlo Gatta, M.Sabate, & Petia Radeva. (2012). Efficient automatic segmentation of vessels. In 16th Conference on Medical Image Understanding and Analysis.
|
|
|
Angel Sappa. (2005). Efficient Closed Contour Extraction from Range Image Edge Points.
|
|
|
S.Grau, Anna Puig, Sergio Escalera, Maria Salamo, & Oscar Amoros. (2013). Efficient complementary viewpoint selection in volume rendering. In 21st WSCG Conference on Computer Graphics,.
Abstract: A major goal of visualization is to appropriately express knowledge of scientific data. Generally, gathering visual information contained in the volume data often requires a lot of expertise from the final user to setup the parameters of the visualization. One way of alleviating this problem is to provide the position of inner structures with different viewpoint locations to enhance the perception and construction of the mental image. To this end, traditional illustrations use two or three different views of the regions of interest. Similarly, with the aim of assisting the users to easily place a good viewpoint location, this paper proposes an automatic and interactive method that locates different complementary viewpoints from a reference camera in volume datasets. Specifically, the proposed method combines the quantity of information each camera provides for each structure and the shape similarity of the projections of the remaining viewpoints based on Dynamic Time Warping. The selected complementary viewpoints allow a better understanding of the focused structure in several applications. Thus, the user interactively receives feedback based on several viewpoints that helps him to understand the visual information. A live-user evaluation on different data sets show a good convergence to useful complementary viewpoints.
Keywords: Dual camera; Visualization; Interactive Interfaces; Dynamic Time Warping.
|
|
|
A. Pujol, Juan J. Villanueva, & Jose Luis Alba. (2001). Efficient Computation of Face Shape Similarity Using Distance Transform Eigendecomposition and Valleys..
|
|
|
Antonio Lopez, Felipe Lumbreras, & Joan Serrat. (1997). Efficient computation of local creaseness. CVC, Bellaterra (Spain).
|
|
|
J. Martinez, & F. Thomas. (2002). Efficient Computation of Local Geometric Moments. IEEE Transactions on Image Porcessing, 11(9): 1102–1111 (IF: 2.553).
|
|
|
David Dueñas, Mostafa Kamal, & Petia Radeva. (2023). Efficient Deep Learning Ensemble for Skin Lesion Classification. In Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (pp. 303–314).
Abstract: Vision Transformers (ViTs) are deep learning techniques that have been gaining in popularity in recent years.
In this work, we study the performance of ViTs and Convolutional Neural Networks (CNNs) on skin lesions classification tasks, specifically melanoma diagnosis. We show that regardless of the performance of both architectures, an ensemble of them can improve their generalization. We also present an adaptation to the Gram-OOD* method (detecting Out-of-distribution (OOD) using Gram matrices) for skin lesion images. Moreover, the integration of super-convergence was critical to success in building models with strict computing and training time constraints. We evaluated our ensemble of ViTs and CNNs, demonstrating that generalization is enhanced by placing first in the 2019 and third in the 2020 ISIC Challenge Live Leaderboards
(available at https://challenge.isic-archive.com/leaderboards/live/).
|
|
|
Marco Pedersoli, Jordi Gonzalez, Andrew Bagdanov, & Xavier Roca. (2011). Efficient Discriminative Multiresolution Cascade for Real-Time Human Detection Applications. PRL - Pattern Recognition Letters, 32(13), 1581–1587.
Abstract: Human detection is fundamental in many machine vision applications, like video surveillance, driving assistance, action recognition and scene understanding. However in most of these applications real-time performance is necessary and this is not achieved yet by current detection methods.
This paper presents a new method for human detection based on a multiresolution cascade of Histograms of Oriented Gradients (HOG) that can highly reduce the computational cost of detection search without affecting accuracy. The method consists of a cascade of sliding window detectors. Each detector is a linear Support Vector Machine (SVM) composed of HOG features at different resolutions, from coarse at the first level to fine at the last one.
In contrast to previous methods, our approach uses a non-uniform stride of the sliding window that is defined by the feature resolution and allows the detection to be incrementally refined as going from coarse-to-fine resolution. In this way, the speed-up of the cascade is not only due to the fewer number of features computed at the first levels of the cascade, but also to the reduced number of windows that need to be evaluated at the coarse resolution. Experimental results show that our method reaches a detection rate comparable with the state-of-the-art of detectors based on HOG features, while at the same time the detection search is up to 23 times faster.
|
|
|
Angel Sappa, & Mohammad Rouhani. (2009). Efficient Distance Estimation for Fitting Implicit Quadric Surfaces. In 16th IEEE International Conference on Image Processing (3521–3524).
Abstract: This paper presents a novel approach for estimating the shortest Euclidean distance from a given point to the corresponding implicit quadric fitting surface. It first estimates the orthogonal orientation to the surface from the given point; then the shortest distance is directly estimated by intersecting the implicit surface with a line passing through the given point according to the estimated orthogonal orientation. The proposed orthogonal distance estimation is easily obtained without increasing computational complexity; hence it can be used in error minimization surface fitting frameworks. Comparisons of the proposed metric with previous approaches are provided to show both improvements in CPU time as well as in the accuracy of the obtained results. Surfaces fitted by using the proposed geometric distance estimation and state of the art metrics are presented to show the viability of the proposed approach.
|
|
|
Jon Almazan, Albert Gordo, Alicia Fornes, & Ernest Valveny. (2012). Efficient Exemplar Word Spotting. In 23rd British Machine Vision Conference (67.pp. 1–67.11).
Abstract: In this paper we propose an unsupervised segmentation-free method for word spotting in document images.
Documents are represented with a grid of HOG descriptors, and a sliding window approach is used to locate the document regions that are most similar to the query. We use the exemplar SVM framework to produce a better representation of the query in an unsupervised way. Finally, the document descriptors are precomputed and compressed with Product Quantization. This offers two advantages: first, a large number of documents can be kept in RAM memory at the same time. Second, the sliding window becomes significantly faster since distances between quantized HOG descriptors can be precomputed. Our results significantly outperform other segmentation-free methods in the literature, both in accuracy and in speed and memory usage.
|
|