|
Anders Skaarup Johansen, Kamal Nasrollahi, Sergio Escalera, & Thomas B. Moeslund. (2023). Who Cares about the Weather? Inferring Weather Conditions for Weather-Aware Object Detection in Thermal Images. AS - Applied Sciences, 13(18).
Abstract: Deployments of real-world object detection systems often experience a degradation in performance over time due to concept drift. Systems that leverage thermal cameras are especially susceptible because the respective thermal signatures of objects and their surroundings are highly sensitive to environmental changes. In this study, two types of weather-aware latent conditioning methods are investigated. The proposed method aims to guide two object detectors (YOLOv5 and Deformable DETR) to become weather-aware. This is achieved by leveraging an auxiliary branch that predicts weather-related information while conditioning intermediate layers of the object detector. While the conditioning methods proposed do not directly improve the accuracy of baseline detectors, it can be observed that conditioned networks manage to extract a weather-related signal from the thermal images, thus resulting in a decreased miss rate at the cost of increased false positives. The extracted signal appears noisy and is thus challenging to regress accurately. This is most likely a result of the qualitative nature of the thermal sensor; thus, further work is needed to identify an ideal method for optimizing the conditioning branch, as well as to further improve the accuracy of the system.
Keywords: thermal; object detection; concept drift; conditioning; weather recognition
|
|
|
Katerine Diaz, Aura Hernandez-Sabate, & Antonio Lopez. (2016). A reduced feature set for driver head pose estimation. ASOC - Applied Soft Computing, 45, 98–107.
Abstract: Evaluation of driving performance is of utmost importance in order to reduce the road accident rate. Since driving ability includes visual-spatial and operational attention, among others, head pose estimation of the driver is a crucial indicator of driving performance. This paper proposes a new automatic method for coarse and fine estimation of the driver's head yaw angle. We rely on a set of geometric features computed from just three representative facial keypoints, namely the centers of the eyes and the nose tip. With these geometric features, our method combines two manifold embedding methods and a linear regression one. In addition, the method has a confidence mechanism to decide if the classification of a sample is not reliable. The approach has been tested using the CMU-PIE dataset and our own driver dataset. Despite the very few facial keypoints required, the results are comparable to the state-of-the-art techniques. The low computational cost of the method and its robustness make it feasible to integrate it into mass consumer devices as a real-time application.
Keywords: Head pose estimation; driving performance evaluation; subspace based methods; linear regression
|
|
|
Jaume Amores. (2013). Multiple Instance Classification: review, taxonomy and comparative study. AI - Artificial Intelligence, 201, 81–105.
Abstract: Multiple Instance Learning (MIL) has become an important topic in the pattern recognition community, and many solutions to this problem have been proposed until now. Despite this fact, there is a lack of comparative studies that shed light on the characteristics and behavior of the different methods. In this work we provide such an analysis focused on the classification task (i.e., leaving out other learning tasks such as regression). In order to perform our study, we implemented fourteen methods grouped into three different families. We analyze the performance of the approaches across a variety of well-known databases, and we also study their behavior in synthetic scenarios in order to highlight their characteristics. As a result of this analysis, we conclude that methods that extract global bag-level information show a clearly superior performance in general. In this sense, the analysis permits us to understand why some types of methods are more successful than others, and it permits us to establish guidelines in the design of new MIL methods.
Keywords: Multi-instance learning; Codebook; Bag-of-Words
|
|
|
A. Martinez, & Jordi Vitria. (1995). A Development Platform for Autonomous Agents. ASI–AA–95 – Practice and Future of Autonomous Agents.
|
|
|
Antonio Lopez, Ernest Valveny, & Juan J. Villanueva. (2005). Real-time quality control of surgical material packaging by artificial vision. Assembly Automation, 25(3).
|
|
|
Wenlong Deng, Yongli Mou, Takahiro Kashiwa, Sergio Escalera, Kohei Nagai, Kotaro Nakayama, et al. (2020). Vision based Pixel-level Bridge Structural Damage Detection Using a Link ASPP Network. AC - Automation in Construction, 110, 102973.
Abstract: Structural Health Monitoring (SHM) has greatly benefited from computer vision. Recently, deep learning approaches have been widely used to accurately estimate the state of deterioration of infrastructure. In this work, we focus on the problem of bridge surface structural damage detection, such as delamination and rebar exposure. It is well known that the quality of a deep learning model is highly dependent on the quality of the training dataset. Bridge damage detection, our application domain, has the following main challenges: (i) labeling the damages requires knowledgeable civil engineering professionals, which makes it difficult to collect a large annotated dataset; (ii) the damage area could be very small, whereas the background area is large, which creates an unbalanced training environment; (iii) due to the difficulty of exactly determining the extent of the damage, there is often a variation among different labelers who perform pixel-wise labeling. In this paper, we propose a novel model for bridge structural damage detection to address the first two challenges. This paper follows the idea of an atrous spatial pyramid pooling (ASPP) module that is designed as a novel network for bridge damage detection. Further, we introduce the weight balanced Intersection over Union (IoU) loss function to achieve accurate segmentation on a highly unbalanced small dataset. The experimental results show that (i) the IoU loss function improves the overall performance of damage detection, as compared to cross entropy loss or focal loss, and (ii) the proposed model has a better ability to detect a minority class than other light segmentation networks.
Keywords: Semantic image segmentation; Deep learning
|
|
|
Joakim Bruslund Haurum, Meysam Madadi, Sergio Escalera, & Thomas B. Moeslund. (2022). Multi-scale hybrid vision transformer and Sinkhorn tokenizer for sewer defect classification. AC - Automation in Construction, 144, 104614.
Abstract: A crucial part of image classification consists of capturing non-local spatial semantics of image content. This paper describes the multi-scale hybrid vision transformer (MSHViT), an extension of the classical convolutional neural network (CNN) backbone, for multi-label sewer defect classification. To better model spatial semantics in the images, features are aggregated at different scales non-locally through the use of a lightweight vision transformer, and a smaller set of tokens is produced through a novel Sinkhorn clustering-based tokenizer using distinct cluster centers. The proposed MSHViT and Sinkhorn tokenizer were evaluated on the Sewer-ML multi-label sewer defect classification dataset, showing consistent performance improvements of up to 2.53 percentage points.
Keywords: Sewer Defect Classification; Vision Transformers; Sinkhorn-Knopp; Convolutional Neural Networks; Closed-Circuit Television; Sewer Inspection
|
|
|
Arnau Ramisa, Adriana Tapus, David Aldavert, Ricardo Toledo, & Ramon Lopez de Mantaras. (2009). Robust Vision-Based Localization using Combinations of Local Feature Regions Detectors. AR - Autonomous Robots, 27(4), 373–385.
Abstract: This paper presents a vision-based approach for mobile robot localization. The model of the environment is topological. The new approach characterizes a place using a signature. This signature consists of a constellation of descriptors computed over different types of local affine covariant regions extracted from an omnidirectional image acquired by rotating a standard camera with a pan-tilt unit. This type of representation permits a reliable and distinctive environment modelling. Our objectives were to validate the proposed method in indoor environments and, also, to find out if the combination of complementary local feature region detectors improves the localization versus using a single region detector. Our experimental results show that if false matches are effectively rejected, the combination of different covariant affine region detectors notably increases the performance of the approach by combining the different strengths of the individual detectors. In order to reduce the localization time, two strategies are evaluated: re-ranking the map nodes using a global similarity measure and using a standard perspective field of view of 45°.
In order to systematically test topological localization methods, another contribution proposed in this work is a novel method to measure the degradation in localization performance as the robot moves away from the point where the original signature was acquired. This makes it possible to assess the robustness of the proposed signature. In order for this to be effective, it must be done in several varied environments that test all the possible situations in which the robot may have to perform localization.
|
|
|
Oriol Ramos Terrades, Albert Berenguel, & Debora Gil. (2022). A Flexible Outlier Detector Based on a Topology Given by Graph Communities. BDR - Big Data Research, 29, 100332.
Abstract: Outlier detection is essential for optimal performance of machine learning methods and statistical predictive models. Detecting outliers is especially decisive in small sample size unbalanced problems, since in such settings outliers become highly influential and significantly bias models. These particular experimental settings are common in medical applications, like diagnosis of rare pathologies, outcome of experimental personalized treatments or pandemic emergencies. In contrast to population-based methods, neighborhood-based local approaches, which compute an outlier score from the neighbors of each sample, are simple, flexible methods that have the potential to perform well in small sample size unbalanced problems. A main concern of local approaches is the impact that the computation of each sample neighborhood has on the method performance. Most approaches use a distance in the feature space to define a single neighborhood that requires careful selection of several parameters, like the number of neighbors.
This work presents a local approach based on a local measure of the heterogeneity of sample labels in the feature space considered as a topological manifold. Topology is computed using the communities of a weighted graph codifying mutual nearest neighbors in the feature space. This way, we provide a set of multiple neighborhoods able to describe the structure of complex spaces without parameter fine tuning. The extensive experiments on real-world and synthetic data sets show that our approach outperforms both local and global strategies in multi- and single-view settings.
Keywords: Classification algorithms; Detection algorithms; Description of feature space local structure; Graph communities; Machine learning algorithms; Outlier detectors
|
|
|
Laura Igual, Joan Carles Soliva, Antonio Hernandez, Sergio Escalera, Xavier Jimenez, Oscar Vilarroya, et al. (2011). A fully-automatic caudate nucleus segmentation of brain MRI: Application in volumetric analysis of pediatric attention-deficit/hyperactivity disorder. BEO - BioMedical Engineering Online, 10(105), 1–23.
Abstract: Background
Accurate automatic segmentation of the caudate nucleus in magnetic resonance images (MRI) of the brain is of great interest in the analysis of developmental disorders. Segmentation methods based on a single atlas or on multiple atlases have been shown to suitably localize caudate structure. However, the atlas prior information may not represent the structure of interest correctly. It may therefore be useful to introduce a more flexible technique for accurate segmentations.
Method
We present CaudateCut: a new fully-automatic method of segmenting the caudate nucleus in MRI. CaudateCut combines an atlas-based segmentation strategy with the Graph Cut energy-minimization framework. We adapt the Graph Cut model to make it suitable for segmenting small, low-contrast structures, such as the caudate nucleus, by defining new energy function data and boundary potentials. In particular, we exploit information concerning the intensity and geometry, and we add supervised energies based on contextual brain structures. Furthermore, we reinforce boundary detection using a new multi-scale edgeness measure.
Results
We apply the novel CaudateCut method to the segmentation of the caudate nucleus in a new set of 39 pediatric attention-deficit/hyperactivity disorder (ADHD) patients and 40 control children, as well as in a public database of 18 subjects. We evaluate the quality of the segmentation using several volumetric and voxel-by-voxel measures. Our results show improved performance in terms of segmentation compared to state-of-the-art approaches, obtaining a mean overlap of 80.75%. Moreover, we present a quantitative volumetric analysis of caudate abnormalities in pediatric ADHD, the results of which show strong correlation with expert manual analysis.
Conclusion
CaudateCut generates segmentation results that are comparable to gold-standard segmentations and which are reliable in the analysis of differentiating neuroanatomical abnormalities between healthy controls and pediatric ADHD.
Keywords: Brain caudate nucleus; segmentation; MRI; atlas-based strategy; Graph Cut framework
|
|
|
Manisha Das, Deep Gupta, Petia Radeva, & Ashwini M. Bakde. (2021). Optimized CT-MR neurological image fusion framework using biologically inspired spiking neural model in hybrid ℓ1 - ℓ0 layer decomposition domain. BSPC - Biomedical Signal Processing and Control, 68, 102535.
Abstract: Medical image fusion plays an important role in the clinical diagnosis of several critical neurological diseases by merging complementary information available in multimodal images. In this paper, a novel CT-MR neurological image fusion framework is proposed using an optimized biologically inspired feedforward neural model in two-scale hybrid ℓ1 − ℓ0 decomposition domain using gray wolf optimization to preserve the structural as well as texture information present in source CT and MR images. Initially, the source images are subjected to two-scale ℓ1 − ℓ0 decomposition with optimized parameters, giving a scale-1 detail layer, a scale-2 detail layer and a scale-2 base layer. The two detail layers at scale-1 and scale-2 are fused using an optimized biologically inspired neural model and a weighted average scheme based on local energy and modified spatial frequency to maximize the preservation of edges and local textures, respectively, while the scale-2 base layer is fused using the choose-max rule to preserve the background information. To optimize the hyper-parameters of the hybrid ℓ1 − ℓ0 decomposition and the biologically inspired neural model, a fitness function is evaluated based on spatial frequency and edge index of the resultant fused image obtained by adding all the fused components. The fusion performance is analyzed by conducting extensive experiments on different CT-MR neurological images. Experimental results indicate that the proposed method provides better-fused images and outperforms the other state-of-the-art fusion methods in both visual and quantitative assessments.
|
|
|
Clementine Decamps, Alexis Arnaud, Florent Petitprez, Mira Ayadi, Aurelia Baures, Lucile Armenoult, et al. (2021). DECONbench: a benchmarking platform dedicated to deconvolution methods for tumor heterogeneity quantification. BMC Bioinformatics, 22, 473.
Abstract: Quantification of tumor heterogeneity is essential to better understand cancer progression and to adapt therapeutic treatments to patient specificities. Bioinformatic tools to assess the different cell populations from single-omic datasets, such as bulk transcriptome or methylome samples, have been recently developed, including reference-based and reference-free methods. Improved methods using multi-omic datasets are yet to be developed, and the community would need systematic tools to perform a comparative evaluation of these algorithms on controlled data.
|
|
|
David Roche, Debora Gil, & Jesus Giraldo. (2013). Mechanistic analysis of the function of agonists and allosteric modulators: Reconciling two-state and operational models. BJP - British Journal of Pharmacology, 169(6), 1189–202.
Abstract: Two-state and operational models of both agonism and allosterism are compared to identify and characterize common pharmacological parameters. To account for the receptor-dependent basal response, constitutive receptor activity is considered in the operational models. By arranging two-state models as the fraction of active receptors and operational models as the fractional response relative to the maximum effect of the system, a one-by-one correspondence between parameters is found. The comparative analysis allows a better understanding of complex allosteric interactions. In particular, the inclusion of constitutive receptor activity in the operational model of allosterism allows the characterization of modulators able to lower the basal response of the system; that is, allosteric modulators with negative intrinsic efficacy. Theoretical simulations and overall goodness of fit of the models to simulated data suggest that it is feasible to apply the models to experimental data and constitute one step forward in receptor theory formalism.
|
|
|
Katerine Diaz, Konstantia Georgouli, Anastasios Koidis, & Jesus Martinez del Rincon. (2017). Incremental model learning for spectroscopy-based food analysis. CILS - Chemometrics and Intelligent Laboratory Systems, 167, 123–131.
Abstract: In this paper we propose the use of incremental learning for creating and improving multivariate analysis models in the field of chemometrics of spectral data. As main advantages, our proposed incremental subspace-based learning allows creating models faster, progressively improving previously created models and sharing them between laboratories and institutions without requiring transferring or disclosing individual spectra samples. In particular, our approach makes it possible to improve the generalization and adaptability of previously generated models with a few new spectral samples so that they are applicable to real-world situations. The potential of our approach is demonstrated using vegetable oil type identification based on spectroscopic data as a case study. Results show how incremental models maintain the accuracy of batch learning methodologies while reducing their computational cost and handicaps.
Keywords: Incremental model learning; IGDCV technique; Subspace based learning; Identification; Vegetable oils; FT-IR spectroscopy
|
|
|
Marta Diez-Ferrer, Debora Gil, Elena Carreño, Susana Padrones, Samantha Aso, Vanesa Vicens, et al. (2016). Positive Airway Pressure-Enhanced CT to Improve Virtual Bronchoscopic Navigation. CHEST - Chest Journal, 150(4), 1003A.
|
|