|
Murad Al Haj, Andrew Bagdanov, Jordi Gonzalez, & Xavier Roca. (2009). Robust and Efficient Multipose Face Detection Using Skin Color Segmentation. In 4th Iberian Conference on Pattern Recognition and Image Analysis (Vol. 5524). LNCS. Springer Berlin Heidelberg.
Abstract: In this paper we describe an efficient technique for detecting faces in arbitrary images and video sequences. The approach is based on segmentation of images or video frames into skin-colored blobs using a pixel-based heuristic. Scale and translation invariant features are then computed from these segmented blobs which are used to perform statistical discrimination between face and non-face classes. We train and evaluate our method on a standard, publicly available database of face images and analyze its performance over a range of statistical pattern classifiers. The generalization of our approach is illustrated by testing on an independent sequence of frames containing many faces and non-faces. These experiments indicate that our proposed approach obtains false positive rates comparable to more complex, state-of-the-art techniques, and that it generalizes better to new data. Furthermore, the use of skin blobs and invariant features requires fewer training samples since significantly fewer non-face candidate regions must be considered when compared to AdaBoost-based approaches.
|
|
|
Simeon Petkov, Adriana Romero, Xavier Carrillo, Petia Radeva, & Carlo Gatta. (2012). Robust and accurate diaphragm border detection in cardiac X-Ray angiographies. In Statistical Atlases And Computational Models Of The Heart: Imaging and Modelling Challenges (Vol. 7746, pp. 225–234). LNCS.
Abstract: Workshop STACOM, held within MICCAI.
X-ray angiography is the most common imaging modality employed in the diagnosis of coronary diseases prior to or during a catheter-based intervention. The analysis of the patient X-Ray sequence can provide useful information about the degree of arterial stenosis, the myocardial perfusion and other clinical parameters. If the sequence has been acquired to evaluate the perfusion grade, the opacity due to the diaphragm could potentially hinder any kind of visual inspection and make computer-aided measurements more difficult. In this paper we propose an accurate and robust method to automatically identify the diaphragm border in each frame. Quantitative evaluation on a set of 11 sequences shows that the proposed algorithm outperforms previous methods.
|
|
|
Sangeeth Reddy, Minesh Mathew, Lluis Gomez, Marçal Rusiñol, Dimosthenis Karatzas, & C.V. Jawahar. (2020). RoadText-1K: Text Detection and Recognition Dataset for Driving Videos. In IEEE International Conference on Robotics and Automation.
Abstract: Perceiving text is crucial to understand semantics of outdoor scenes and hence is a critical requirement to build intelligent systems for driver assistance and self-driving. Most of the existing datasets for text detection and recognition comprise still images and are mostly compiled keeping text in mind. This paper introduces a new "RoadText-1K" dataset for text in driving videos. The dataset is 20 times larger than the existing largest dataset for text in videos. Our dataset comprises 1000 video clips of driving without any bias towards text and with annotations for text bounding boxes and transcriptions in every frame. State of the art methods for text detection, recognition and tracking are evaluated on the new dataset and the results signify the challenges in unconstrained driving videos compared to existing datasets. This suggests that RoadText-1K is suited for research and development of reading systems, robust enough to be incorporated into more complex downstream tasks like driver assistance and self-driving. The dataset can be found at http://cvit.iiit.ac.in/research/projects/cvit-projects/roadtext-1k
|
|
|
Jose Manuel Alvarez, Theo Gevers, Y. LeCun, & Antonio Lopez. (2012). Road Scene Segmentation from a Single Image. In 12th European Conference on Computer Vision (Vol. 7578, pp. 376–389). LNCS. Springer Berlin Heidelberg.
Abstract: Road scene segmentation is important in computer vision for different applications such as autonomous driving and pedestrian detection. Recovering the 3D structure of road scenes provides relevant contextual information to improve their understanding.
In this paper, we use a convolutional neural network based algorithm to learn features from noisy labels to recover the 3D scene layout of a road image. The novelty of the algorithm relies on generating training labels by applying an algorithm trained on a general image dataset to classify on-board images. Further, we propose a novel texture descriptor based on a learned color plane fusion to obtain maximal uniformity in road areas. Finally, acquired (off-line) and current (on-line) information are combined to detect road areas in single images.
From quantitative and qualitative experiments, conducted on publicly available datasets, it is concluded that convolutional neural networks are suitable for learning 3D scene layout from noisy labels and provide a relative improvement of 7% compared to the baseline. Furthermore, combining color planes provides a statistical description of road areas that exhibits maximal uniformity and provides a relative improvement of 8% compared to the baseline. Finally, the improvement is even bigger when acquired and current information from a single image are combined.
Keywords: road detection
|
|
|
Jose Manuel Alvarez, Theo Gevers, Ferran Diego, & Antonio Lopez. (2013). Road Geometry Classification by Adaptive Shape Models. TITS - IEEE Transactions on Intelligent Transportation Systems, 14(1), 459–468.
Abstract: Vision-based road detection is important for different applications in transportation, such as autonomous driving, vehicle collision warning, and pedestrian crossing detection. Common approaches to road detection are based on low-level road appearance (e.g., color or texture) and neglect the scene geometry and context. Hence, using only low-level features makes these algorithms highly dependent on structured roads, road homogeneity, and lighting conditions. Therefore, the aim of this paper is to classify road geometries for road detection through the analysis of scene composition and temporal coherence. Road geometry classification is proposed by building corresponding models from training images containing prototypical road geometries. We propose adaptive shape models where spatial pyramids are steered by the inherent spatial structure of road images. To reduce the influence of lighting variations, invariant features are used. Large-scale experiments show that the proposed road geometry classifier yields a high recognition rate of 73.57% ± 13.1, clearly outperforming other state-of-the-art methods. Including road shape information improves road detection results over existing appearance-based methods. Finally, it is shown that invariant features and temporal information provide robustness against disturbing imaging conditions.
Keywords: road detection
|
|
|
Jose Manuel Alvarez, & Antonio Lopez. (2011). Road Detection Based on Illuminant Invariance. TITS - IEEE Transactions on Intelligent Transportation Systems, 12(1), 184–193.
Abstract: By using an onboard camera, it is possible to detect the free road surface ahead of the ego-vehicle. Road detection is of high relevance for autonomous driving, road departure warning, and supporting driver-assistance systems such as vehicle and pedestrian detection. The key for vision-based road detection is the ability to classify image pixels as belonging or not to the road surface. Identifying road pixels is a major challenge due to the intraclass variability caused by lighting conditions. A particularly difficult scenario appears when the road surface has both shadowed and nonshadowed areas. Accordingly, we propose a novel approach to vision-based road detection that is robust to shadows. The novelty of our approach relies on using a shadow-invariant feature space combined with a model-based classifier. The model is built online to improve the adaptability of the algorithm to the current lighting and the presence of other vehicles in the scene. The proposed algorithm works in still images and does not depend on either road shape or temporal restrictions. Quantitative and qualitative experiments on real-world road sequences with heavy traffic and shadows show that the method is robust to shadows and lighting variations. Moreover, the proposed method provides the highest performance when compared with hue-saturation-intensity (HSI)-based algorithms.
Keywords: road detection
|
|
|
Angel Sappa, Rosa Herrero, Fadi Dornaika, David Geronimo, & Antonio Lopez. (2007). Road Approximation in Euclidean and v-Disparity Space: A Comparative Study. In Computer Aided Systems Theory, (Vol. 4739, 1105–1112). LNCS.
Abstract: This paper presents a comparative study between two road approximation techniques—planar surfaces—from stereo vision data. The first approach is carried out in the v-disparity space and is based on a voting scheme, the Hough transform. The second one consists in computing the best fitting plane for the whole 3D road data points, directly in the Euclidean space, by using least squares fitting. The comparative study is initially performed over a set of different synthetic surfaces (e.g., plane, quadratic surface, cubic surface) digitized by a virtual stereo head; then real data obtained with a commercial stereo head are used. The comparative study is intended to be used as a criterion for finding the best technique according to the road geometry. Additionally, it highlights common problems arising from a wrong assumption about the scene’s prior knowledge.
|
|
|
Angel Sappa, Rosa Herrero, Fadi Dornaika, David Geronimo, & Antonio Lopez. (2007). Road Approximation in Euclidean and v-Disparity Space: A Comparative Study. In EUROCAST2007, Workshop on Cybercars and Intelligent Vehicles (368–369).
Abstract: This paper presents a comparative study between two road approximation techniques—planar surfaces—from stereo vision data. The first approach is carried out in the v-disparity space and is based on a voting scheme, the Hough transform. The second one consists in computing the best fitting plane for the whole 3D road data points, directly in the Euclidean space, by using least squares fitting. The comparative study is initially performed over a set of different synthetic surfaces (e.g., plane, quadratic surface, cubic surface) digitized by a virtual stereo head; then real data obtained with a commercial stereo head are used. The comparative study is intended to be used as a criterion for finding the best technique according to the road geometry. Additionally, it highlights common problems arising from a wrong assumption about the scene’s prior knowledge.
|
|
|
Laura Lopez-Fuentes, Claudio Rossi, & Harald Skinnemoen. (2017). River segmentation for flood monitoring. In Data Science for Emergency Management at Big Data 2017.
Abstract: Floods are major natural disasters which cause deaths and material damages every year. Monitoring these events is crucial in order to reduce both the number of affected people and the economic losses. In this work we train and test three different Deep Learning segmentation algorithms to estimate the water area from river images, and compare their performances. We discuss the implementation of a novel data chain aimed at monitoring river water levels by automatically processing data collected from surveillance cameras, and at giving alerts in case of high increases of the water level or flooding. We also create and openly publish the first image dataset for river water segmentation.
|
|
|
David Lloret, Antonio Lopez, & Joan Serrat. (1997). Rigid Registration of CT and MR volumes based on Rothes creases.
|
|
|
Fadi Dornaika, & Angel Sappa. (2006). Rigid and Non-Rigid Face Motion Tracking by Aligning Texture Maps and Stereo-Based 3D Models. In 8th International Conference on Advanced Concepts for Intelligent Vision Systems (ACIVS'06), LNCS 4179: 675–684.
|
|
|
Fadi Dornaika, & Angel Sappa. (2007). Rigid and Non-rigid Face Motion Tracking by Aligning Texture Maps and Stereo 3D Models. PRL - Pattern Recognition Letters, 28(15), 2116–2126.
|
|
|
A. Pujol, Antonio Lopez, Jose Luis Alba, & Juan J. Villanueva. (2001). Ridges, Valleys and Hausdorff Based Similarity Measures for Face Detection and Matching.
|
|
|
Antonio Lopez, & Joan Serrat. (1998). Ridges and Valleys in Image Analysis.
|
|
|
Antonio Lopez, Joan Serrat, J. Saludes, Cristina Cañero, Felipe Lumbreras, & T. Graf. (2005). Ridgeness for Detecting Lane Markings.
|
|