|
Jiaolong Xu, David Vazquez, Antonio Lopez, Javier Marin, & Daniel Ponsa. (2014). Learning a Part-based Pedestrian Detector in Virtual World. TITS - IEEE Transactions on Intelligent Transportation Systems, 15(5), 2121–2131.
Abstract: Detecting pedestrians with on-board vision systems is of paramount interest for assisting drivers to prevent vehicle-to-pedestrian accidents. The core of a pedestrian detector is its classification module, which aims at deciding if a given image window contains a pedestrian. Given the difficulty of this task, many classifiers have been proposed during the last fifteen years. Among them, the so-called (deformable) part-based classifiers including multi-view modeling are usually top ranked in accuracy. Training such classifiers is not trivial since a proper aspect clustering and spatial part alignment of the pedestrian training samples are crucial for obtaining an accurate classifier. In this paper, first we perform automatic aspect clustering and part alignment by using virtual-world pedestrians, i.e., human annotations are not required. Second, we use a mixture-of-parts approach that allows part sharing among different aspects. Third, these proposals are integrated in a learning framework which also allows incorporating real-world training data to perform domain adaptation between virtual- and real-world cameras. Overall, the obtained results on four popular on-board datasets show that our proposal clearly outperforms the state-of-the-art deformable part-based detector known as latent SVM.
Keywords: Domain Adaptation; Pedestrian Detection; Virtual Worlds
|
|
|
Jose Manuel Alvarez, Antonio Lopez, Theo Gevers, & Felipe Lumbreras. (2014). Combining Priors, Appearance and Context for Road Detection. TITS - IEEE Transactions on Intelligent Transportation Systems, 15(3), 1168–1178.
Abstract: Detecting the free road surface ahead of a moving vehicle is an important research topic in different areas of computer vision, such as autonomous driving or car collision warning.
Current vision-based road detection methods are usually based solely on low-level features. Furthermore, they generally assume structured roads, road homogeneity, and uniform lighting conditions, constraining their applicability in real-world scenarios. In this paper, road priors and contextual information are introduced for road detection. First, we propose an algorithm to estimate road priors online using geographical information, providing relevant initial information about the road location. Then, contextual cues, including horizon lines, vanishing points, lane markings, 3-D scene layout, and road geometry, are used in addition to low-level cues derived from the appearance of roads. Finally, a generative model is used to combine these cues and priors, leading to a road detection method that is, to a large degree, robust to varying imaging conditions, road types, and scenarios.
Keywords: Illuminant invariance; lane markings; road detection; road prior; road scene understanding; vanishing point; 3-D scene layout
|
|
|
T. Mouats, N. Aouf, Angel Sappa, Cristhian A. Aguilera-Carrasco, & Ricardo Toledo. (2015). Multi-Spectral Stereo Odometry. TITS - IEEE Transactions on Intelligent Transportation Systems, 16(3), 1210–1224.
Abstract: In this paper, we investigate the problem of visual odometry for ground vehicles based on the simultaneous utilization of multispectral cameras. It encompasses a stereo rig composed of an optical (visible) and thermal sensors. The novelty resides in the localization of the cameras as a stereo setup rather than two monocular cameras of different spectrums. To the best of our knowledge, this is the first time such task is attempted. Log-Gabor wavelets at different orientations and scales are used to extract interest points from both images. These are then described using a combination of frequency and spatial information within the local neighborhood. Matches between the pairs of multimodal images are computed using the cosine similarity function based on the descriptors. Pyramidal Lucas–Kanade tracker is also introduced to tackle temporal feature matching within challenging sequences of the data sets. The vehicle egomotion is computed from the triangulated 3-D points corresponding to the matched features. A windowed version of bundle adjustment incorporating Gauss–Newton optimization is utilized for motion estimation. An outlier removal scheme is also included within the framework to deal with outliers. Multispectral data sets were generated and used as test bed. They correspond to real outdoor scenarios captured using our multimodal setup. Finally, detailed results validating the proposed strategy are illustrated.
Keywords: Egomotion estimation; feature matching; multispectral odometry (MO); optical flow; stereo odometry; thermal imagery
|
|
|
Zhijie Fang, & Antonio Lopez. (2019). Intention Recognition of Pedestrians and Cyclists by 2D Pose Estimation. TITS - IEEE Transactions on Intelligent Transportation Systems, 21(11), 4773–4783.
Abstract: Anticipating the intentions of vulnerable road users (VRUs) such as pedestrians and cyclists is critical for performing safe and comfortable driving maneuvers. This is the case for human driving and, thus, should be taken into account by systems providing any level of driving assistance, from advanced driver assistant systems (ADAS) to fully autonomous vehicles (AVs). In this paper, we show how the latest advances on monocular vision-based human pose estimation, i.e. those relying on deep Convolutional Neural Networks (CNNs), make it possible to recognize the intentions of such VRUs. In the case of cyclists, we assume that they follow traffic rules to indicate future maneuvers with arm signals. In the case of pedestrians, no indications can be assumed. Instead, we hypothesize that the walking pattern of a pedestrian allows determining whether he/she has the intention of crossing the road in the path of the ego-vehicle, so that the ego-vehicle must maneuver accordingly (e.g. slowing down or stopping). In this paper, we show how the same methodology can be used for recognizing pedestrians' and cyclists' intentions. For pedestrians, we perform experiments on the JAAD dataset. For cyclists, we did not find an analogous dataset; thus, we created our own by acquiring and annotating videos, which we share with the research community. Overall, the proposed pipeline provides new state-of-the-art results on the intention recognition of VRUs.
|
|
|
Saad Minhas, Aura Hernandez-Sabate, Shoaib Ehsan, & Klaus McDonald Maier. (2022). Effects of Non-Driving Related Tasks during Self-Driving mode. TITS - IEEE Transactions on Intelligent Transportation Systems, 23(2), 1391–1399.
Abstract: Perception reaction time and mental workload have proven to be crucial in manual driving. Moreover, in highly automated cars, where most of the research is focusing on Level 4 Autonomous driving, take-over performance is also a key factor when taking road safety into account. This study aims to investigate how the immersion in non-driving related tasks affects the take-over performance of drivers in given scenarios. The paper also highlights the use of virtual simulators to gather efficient data that can be crucial in easing the transition between manual and autonomous driving scenarios. The use of Computer Aided Simulations is of absolute importance in this day and age since the automotive industry is rapidly moving towards Autonomous technology. An experiment comprising 40 subjects was performed to examine the reaction times of drivers and the influence of other variables in the success of take-over performance in highly automated driving under different circumstances within a highway virtual environment. The results reflect the relationship between reaction times under different scenarios that the drivers might face under the circumstances stated above as well as the importance of variables such as velocity in the success of regaining car control after automated driving. The implications of the results acquired are important for understanding the criteria needed for designing Human Machine Interfaces specifically aimed towards automated driving conditions. Understanding the need to keep drivers in the loop during automation, whilst allowing drivers to safely engage in other non-driving related tasks, is an important research area which can be aided by the proposed study.
|
|
|
Akhil Gurram, Ahmet Faruk Tuna, Fengyi Shen, Onay Urfalioglu, & Antonio Lopez. (2021). Monocular Depth Estimation through Virtual-world Supervision and Real-world SfM Self-Supervision. TITS - IEEE Transactions on Intelligent Transportation Systems, 23(8), 12738–12751.
Abstract: Depth information is essential for on-board perception in autonomous driving and driver assistance. Monocular depth estimation (MDE) is very appealing since it allows for appearance and depth being on direct pixelwise correspondence without further calibration. The best MDE models are based on Convolutional Neural Networks (CNNs) trained in a supervised manner, i.e., assuming pixelwise ground truth (GT). Usually, this GT is acquired at training time through a calibrated multi-modal suite of sensors. However, using only a monocular system at training time is cheaper and more scalable. This is possible by relying on structure-from-motion (SfM) principles to generate self-supervision. Nevertheless, problems of camouflaged objects, visibility changes, static-camera intervals, textureless areas, and scale ambiguity diminish the usefulness of such self-supervision. In this paper, we perform monocular depth estimation by virtual-world supervision (MonoDEVS) and real-world SfM self-supervision. We compensate for the SfM self-supervision limitations by leveraging virtual-world images with accurate semantic and depth supervision and addressing the virtual-to-real domain gap. Our MonoDEVSNet outperforms previous MDE CNNs trained on monocular and even stereo sequences.
|
|
|
Yi Xiao, Felipe Codevilla, Akhil Gurram, Onay Urfalioglu, & Antonio Lopez. (2020). Multimodal end-to-end autonomous driving. TITS - IEEE Transactions on Intelligent Transportation Systems, 1–11.
Abstract: A crucial component of an autonomous vehicle (AV) is the artificial intelligence (AI) that is able to drive towards a desired destination. Today, there are different paradigms addressing the development of AI drivers. On the one hand, we find modular pipelines, which divide the driving task into sub-tasks such as perception and maneuver planning and control. On the other hand, we find end-to-end driving approaches that try to learn a direct mapping from input raw sensor data to vehicle control signals. The latter are relatively less studied, but are gaining popularity since they are less demanding in terms of sensor data annotation. This paper focuses on end-to-end autonomous driving. So far, most proposals relying on this paradigm assume RGB images as input sensor data. However, AVs will not be equipped only with cameras, but also with active sensors providing accurate depth information (e.g., LiDARs). Accordingly, this paper analyses whether combining RGB and depth modalities, i.e. using RGBD data, produces better end-to-end AI drivers than relying on a single modality. We consider multimodality based on early, mid and late fusion schemes, both in multisensory and single-sensor (monocular depth estimation) settings. Using the CARLA simulator and conditional imitation learning (CIL), we show how, indeed, early fusion multimodality outperforms single-modality.
|
|
|
Maria Salamo, & Sergio Escalera. (2011). Increasing Retrieval Quality in Conversational Recommenders. TKDE - IEEE Transactions on Knowledge and Data Engineering, 99, 1.
Abstract: A major task of research in conversational recommender systems is personalization. Critiquing is a common and powerful form of feedback, where a user can express her feature preferences by applying a series of directional critiques over the recommendations instead of providing specific preference values. Incremental Critiquing is a conversational recommender system that uses critiquing as feedback to efficiently personalize products. The expectation is that in each cycle the system retrieves the products that best satisfy the user’s soft product preferences from a minimal information input. In this paper, we present a novel technique that increases retrieval quality based on a combination of compatibility and similarity scores. Under the hypothesis that a user learns during the recommendation process, we propose two novel exponential reinforcement learning approaches for compatibility that take into account both the instant at which the user makes a critique and the number of satisfied critiques. Moreover, we consider that the impact of features on the similarity differs according to the preferences manifested by the user. We propose a global weighting approach that uses a common weight for nearest cases in order to focus on groups of relevant products. We show that our methodology significantly improves recommendation efficiency in four data sets of different sizes in terms of session length in comparison with state-of-the-art approaches. Moreover, our recommender shows higher robustness against noisy user data when compared to classical approaches.
Notes: IF JCR CCIA 2.286 2009 24/103; JCR Impact Factor 2010: 1.851
|
|
|
A.F. Sole, S. Ngan, G. Sapiro, X. Hu, & Antonio Lopez. (2001). Anisotropic 2-D and 3-D Averaging of fMRI Signals. IEEE Transactions on Medical Imaging, 20(2), 86–93.
|
|
|
Amir A. Amini, Yasheng Chen, Mohamed Elayyadi, & Petia Radeva. (2001). Tag Surface Reconstruction and Tracking of Myocardial Beads from SPAMM-MRI with Parametric B-Spline Surfaces. TMI - IEEE Transactions on Medical Imaging, 94–103.
Abstract: Magnetic resonance imaging (MRI) is unique in its ability to noninvasively and selectively alter tissue magnetization, and create tag planes intersecting image slices. The resulting grid of signal voids allows for tracking deformations of tissues in otherwise homogeneous-signal myocardial regions. In this paper, we propose a specific spatial modulation of magnetization (SPAMM) imaging protocol together with efficient techniques for measurement of three-dimensional (3-D) motion of material points of the human heart (referred to as myocardial beads) from images collected with the SPAMM method. The techniques make use of tagged images in orthogonal views by explicitly reconstructing 3-D B-spline surface representation of tag planes (tag planes in two orthogonal orientations intersecting the short-axis (SA) image slices and tag planes in an orientation orthogonal to the short-axis tag planes intersecting long-axis (LA) image slices). The developed methods allow for viewing deformations of 3-D tag surfaces, spatial correspondence of long-axis and short-axis image slice and tag positions, as well as nonrigid movement of myocardial beads as a function of time.
Keywords: B-spline surfaces, cardiac motion, myocardial beads, myocardial infarction, tagged MRI.
|
|
|
Fernando Vilariño, Panagiota Spyridonos, Fosca De Iorio, Jordi Vitria, Fernando Azpiroz, & Petia Radeva. (2010). Intestinal Motility Assessment With Video Capsule Endoscopy: Automatic Annotation of Phasic Intestinal Contractions. TMI - IEEE Transactions on Medical Imaging, 29(2), 246–259.
Abstract: Intestinal motility assessment with video capsule endoscopy arises as a novel and challenging clinical fieldwork. This technique is based on the analysis of the patterns of intestinal contractions shown in a video provided by an ingestible capsule with a wireless micro-camera. The manual labeling of all the motility events requires a large amount of time for offline screening in search of findings with low prevalence, which currently makes this procedure impractical. In this paper, we propose a machine learning system to automatically detect the phasic intestinal contractions in video capsule endoscopy, turning a useful but not feasible clinical routine into a feasible clinical procedure. Our proposal is based on a sequential design which involves the analysis of textural, color, and blob features together with SVM classifiers. Our approach tackles the reduction of the imbalance rate of data and allows the inclusion of domain knowledge as new stages in the cascade. We present a detailed analysis, both in a quantitative and a qualitative way, by providing several measures of performance and the assessment study of interobserver variability. Our system achieves 70% sensitivity for individual detection, whilst obtaining equivalent patterns to those of the experts for density of contractions.
|
|
|
Jaume Garcia, Debora Gil, Luis Badiella, Aura Hernandez-Sabate, Francesc Carreras, Sandra Pujades, et al. (2010). A Normalized Framework for the Design of Feature Spaces Assessing the Left Ventricular Function. TMI - IEEE Transactions on Medical Imaging, 29(3), 733–745.
Abstract: A thorough description of the left ventricle functionality requires combining complementary regional scores. A main limitation is the lack of multiparametric normality models oriented to the assessment of regional wall motion abnormalities (RWMA). This paper covers two main topics involved in RWMA assessment. We propose a general framework allowing the fusion and comparison across subjects of different regional scores. Our framework is used to explore which combination of regional scores (including 2-D motion and strains) is better suited for RWMA detection. Our statistical analysis indicates that for a proper (within interobserver variability) identification of RWMA, models should consider motion and extreme strains.
|
|
|
Debora Gil, Aura Hernandez-Sabate, Oriol Rodriguez, J. Mauri, & Petia Radeva. (2006). Statistical Strategy for Anisotropic Adventitia Modelling in IVUS. IEEE Transactions on Medical Imaging, 25(6), 768–778.
Abstract: Vessel plaque assessment by analysis of intravascular ultrasound sequences is a useful tool for cardiac disease diagnosis and intervention. Manual detection of luminal (inner) and media-adventitia (external) vessel borders is the main activity of physicians in the process of lumen narrowing (plaque) quantification. The difficult definition of vessel border descriptors, as well as shades, artifacts, and blurred signal response due to ultrasound physical properties, hinder automated adventitia segmentation. In order to efficiently approach such a complex problem, we propose blending advanced anisotropic filtering operators and statistical classification techniques into a vessel border modelling strategy. Our systematic statistical analysis shows that the reported adventitia detection achieves an accuracy in the range of interobserver variability regardless of plaque nature, vessel geometry, and incomplete vessel borders.
Index Terms: Anisotropic processing, intravascular ultrasound (IVUS), vessel border segmentation, vessel structure classification.
Keywords: Corners; T-junctions; Wavelets
|
|
|
Debora Gil, Oriol Rodriguez-Leor, Petia Radeva, & J. Mauri. (2008). Myocardial Perfusion Characterization From Contrast Angiography Spectral Distribution. IEEE Transactions on Medical Imaging, 27(5), 641–649.
Abstract: Despite recovering a normal coronary flow after acute myocardial infarction, percutaneous coronary intervention does not guarantee a proper perfusion (irrigation) of the infarcted area. This damage in microcirculation integrity may detrimentally affect the patient survival. Visual assessment of the myocardium opacification in contrast angiography serves to define a subjective score of the microcirculation integrity, the myocardial blush analysis (MBA). Although MBA correlates with patient prognosis, its visual assessment is a very difficult task that requires highly expert training in order to achieve a good intraobserver and interobserver agreement. In this paper, we provide objective descriptors of the myocardium staining pattern by analyzing the spectrum of the image local statistics. The descriptors proposed discriminate among the different phenomena observed in the angiographic sequence and allow defining an objective score of the myocardial perfusion.
Keywords: Contrast angiography; myocardial perfusion; spectral analysis.
|
|
|
Aura Hernandez-Sabate, Debora Gil, Eduard Fernandez-Nofrerias, Petia Radeva, & Enric Marti. (2009). Approaching Artery Rigid Dynamics in IVUS. TMI - IEEE Transactions on Medical Imaging, 28(11), 1670–1680.
Abstract: Tissue biomechanical properties (like strain and stress) are playing an increasing role in diagnosis and long-term treatment of intravascular coronary diseases. Their assessment strongly relies on estimation of vessel wall deformation. Since intravascular ultrasound (IVUS) sequences allow visualizing vessel morphology and reflect its dynamics, this technique represents a useful tool for evaluation of tissue mechanical properties. Image misalignment introduced by vessel-catheter motion is a major artifact for a proper tracking of tissue deformation. In this work, we focus on compensating and assessing IVUS rigid in-plane motion due to heart beating. Motion parameters are computed by considering both the vessel geometry and its appearance in the image. Continuum mechanics laws serve to introduce a novel score measuring motion reduction in in vivo sequences. Synthetic experiments validate the proposed score as a measure of motion parameter accuracy, whereas results in in vivo pullbacks show the reliability of the presented methodologies in clinical cases.
Keywords: Fourier analysis; intravascular ultrasound (IVUS) dynamics; longitudinal motion; quality measures; tissue deformation.
|
|