F. de la Torre, Jordi Vitria, Petia Radeva, & J. Melenchon. (2000). EigenFiltering for flexible Eigentracking. In 15th International Conference on Pattern Recognition (Vol. 3, pp. 1118–1121).
|
Javier Varona, Jordi Gonzalez, Xavier Roca, & Juan J. Villanueva. (2000). iTrack: Image-based Probabilistic Tracking of People. In 15th International Conference on Pattern Recognition (Vol. 3, pp. 1122–1125).
|
Ferran Diego, Jose Manuel Alvarez, Joan Serrat, & Antonio Lopez. (2010). Vision-based road detection via on-line video registration. In 13th Annual International Conference on Intelligent Transportation Systems (pp. 1135–1140).
Abstract: Road segmentation is an essential functionality for supporting advanced driver assistance systems (ADAS) such as road following and vehicle and pedestrian detection. Significant efforts have been made to solve this task using vision-based techniques. The major challenge is to deal with lighting variations and the presence of objects on the road surface. In this paper, we propose a new road detection method to infer the areas of the image depicting road surfaces without performing any image segmentation. The idea is to first segment the road region, manually or semi-automatically, in a traffic-free reference video recorded on a first drive, and then to transfer these regions to the frames of a second video sequence acquired later in a second drive through the same road, in an on-line manner. This is possible because we are able to automatically align the two videos in time and space, that is, to synchronize them and warp each frame of the first video to its corresponding frame in the second one. The geometric transform can thus transfer the road region to the present frame on-line. In order to reduce the effect of the different lighting conditions present in outdoor scenarios, our approach incorporates a shadowless feature space which represents an image in an illuminant-invariant feature space. Furthermore, we propose a dynamic background subtraction algorithm which removes the regions containing vehicles in the observed frames which are within the transferred road region.
Keywords: video alignment; road detection
|
Josep Llados, Enric Marti, & Juan J. Villanueva. (2001). Symbol recognition by error-tolerant subgraph matching between region adjacency graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(10), 1137–1143.
Abstract: The recognition of symbols in graphic documents is an intensive research activity in the community of pattern recognition and document analysis. A key issue in the interpretation of maps, engineering drawings, diagrams, etc. is the recognition of domain-dependent symbols according to a symbol database. In this work we first review the most outstanding symbol recognition methods from two different points of view: application domains and pattern recognition methods. In the second part of the paper, open and unaddressed problems involved in symbol recognition are described, analyzing their current state of the art and discussing future research challenges. Thus, issues such as symbol representation, matching, segmentation, learning, scalability of recognition methods and performance evaluation are addressed in this work. Finally, we discuss the perspectives of symbol recognition concerning new paradigms such as user interfaces in handheld computers or document database and WWW indexing by graphical content.
|
Naveen Onkarappa, & Angel Sappa. (2012). An Empirical Study on Optical Flow Accuracy Depending on Vehicle Speed. In IEEE Intelligent Vehicles Symposium (pp. 1138–1143). IEEE Xplore.
Abstract: Driver assistance and safety systems are receiving growing attention as a step towards automatic navigation and safety. Optical flow, as a motion estimation technique, plays a major role in making these systems a reality. To this end, in the current paper, the suitability of polar representation for optical flow estimation in such systems is demonstrated. Furthermore, the influence of individual regularization terms on the accuracy of optical flow on image sequences of different speeds is empirically evaluated. Also, a new synthetic dataset of image sequences with different speeds is generated along with the ground-truth optical flow.
|
Oriol Pujol, & David Masip. (2009). Geometry-Based Ensembles: Toward a Structural Characterization of the Classification Boundary. TPAMI - IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(6), 1140–1146.
Abstract: This article introduces a novel binary discriminative learning technique based on the approximation of the non-linear decision boundary by a piece-wise linear smooth additive model. The decision border is geometrically defined by means of the characterizing boundary points, that is, points that belong to the optimal boundary under a certain notion of robustness. Based on these points, a set of locally robust linear classifiers is defined and assembled by means of a Tikhonov regularized optimization procedure in an additive model to create a final lambda-smooth decision rule. As a result, a very simple and robust classifier with a strong geometrical meaning and non-linear behavior is obtained. The simplicity of the method allows its extension to cope with some of today's machine learning challenges, such as online learning, large scale learning or parallelization, with linear computational complexity. We validate our approach on the UCI database. Finally, we apply our technique in online and large scale scenarios, and in six real life computer vision and pattern recognition problems: gender recognition, intravascular ultrasound tissue classification, speed traffic sign detection, Chagas' disease severity detection, clef classification and action recognition using 3D accelerometer data. The results are promising and this paper opens a line of research that deserves further attention.
|
Mikhail Mozerov, & Joost Van de Weijer. (2015). Accurate stereo matching by two step global optimization. TIP - IEEE Transactions on Image Processing, 24(3), 1153–1163.
Abstract: In stereo matching, cost filtering methods and energy minimization algorithms are considered as two different techniques. Due to their global extent, energy minimization methods obtain good stereo matching results. However, they tend to fail in occluded regions, in which cost filtering approaches obtain better results. In this paper we intend to combine both approaches with the aim to improve overall stereo matching results. We show that a global optimization with a fully connected model can be solved by cost filtering methods. Based on this observation we propose to perform stereo matching as a two-step energy minimization algorithm. We consider two MRF models: a fully connected model defined on the complete set of pixels in an image and a conventional locally connected model. We solve the energy minimization problem for the fully connected model, after which the marginal function of the solution is used as the unary potential in the locally connected MRF model. Experiments on the Middlebury stereo datasets show that the proposed method achieves state-of-the-art results.
|
Carolina Malagelada, Fosca De Iorio, Fernando Azpiroz, Anna Accarino, Santiago Segui, Petia Radeva, et al. (2008). New Insight Into Intestinal Motor Function via Noninvasive Endoluminal Image Analysis. Gastroenterology, 1155–1162.
|
Jose Manuel Alvarez, Antonio Lopez, & Ramon Baldrich. (2008). Illuminant Invariant Model-Based Road Segmentation. In IEEE Intelligent Vehicles Symposium (pp. 1155–1180).
|
Dimosthenis Karatzas, Lluis Gomez, Anguelos Nicolaou, Suman Ghosh, Andrew Bagdanov, Masakazu Iwamura, et al. (2015). ICDAR 2015 Competition on Robust Reading. In 13th International Conference on Document Analysis and Recognition ICDAR2015 (pp. 1156–1160).
|
Jean-Christophe Burie, J. Chazalon, M. Coustaty, S. Eskenazi, Muhammad Muzzamil Luqman, M. Mehri, et al. (2015). ICDAR2015 Competition on Smartphone Document Capture and OCR (SmartDoc). In 13th International Conference on Document Analysis and Recognition ICDAR2015 (pp. 1161–1165).
Abstract: Smartphones are enabling new ways of capture, hence arises the need for seamless and reliable acquisition and digitization of documents, in order to convert them to editable, searchable and a more human-readable format. Current state-of-the-art works lack databases and baseline benchmarks for digitizing mobile captured documents. We have organized a competition for mobile document capture and OCR in order to address this issue. The competition is structured into two independent challenges: smartphone document capture, and smartphone OCR. This report describes the datasets for both challenges along with their ground truth, details the performance evaluation protocols which we used, and presents the final results of the participating methods. In total, we received 13 submissions: 8 for challenge-1, and 5 for challenge-2.
|
Jose Manuel Alvarez, Antonio Lopez, Theo Gevers, & Felipe Lumbreras. (2014). Combining Priors, Appearance and Context for Road Detection. TITS - IEEE Transactions on Intelligent Transportation Systems, 15(3), 1168–1178.
Abstract: Detecting the free road surface ahead of a moving vehicle is an important research topic in different areas of computer vision, such as autonomous driving or car collision warning. Current vision-based road detection methods are usually based solely on low-level features. Furthermore, they generally assume structured roads, road homogeneity, and uniform lighting conditions, constraining their applicability in real-world scenarios. In this paper, road priors and contextual information are introduced for road detection. First, we propose an algorithm to estimate road priors online using geographical information, providing relevant initial information about the road location. Then, contextual cues, including horizon lines, vanishing points, lane markings, 3-D scene layout, and road geometry, are used in addition to low-level cues derived from the appearance of roads. Finally, a generative model is used to combine these cues and priors, leading to a road detection method that is, to a large degree, robust to varying imaging conditions, road types, and scenarios.
Keywords: Illuminant invariance; lane markings; road detection; road prior; road scene understanding; vanishing point; 3-D scene layout
|
Aura Hernandez-Sabate, Jose Elias Yauri, Pau Folch, Daniel Alvarez, & Debora Gil. (2024). EEG Dataset Collection for Mental Workload Predictions in Flight-Deck Environment. SENS - Sensors, 24(4), 1174.
Abstract: High mental workload reduces human performance and the ability to correctly carry out complex tasks. In particular, aircraft pilots enduring high mental workloads are at high risk of failure, even with catastrophic outcomes. Despite progress, there is still a lack of knowledge about the interrelationship between mental workload and brain functionality, and there is still limited data on flight-deck scenarios. Although recent emerging deep-learning (DL) methods using physiological data have presented new ways to find new physiological markers to detect and assess cognitive states, they demand large amounts of properly annotated datasets to achieve good performance. We present a new dataset of electroencephalogram (EEG) recordings specifically collected for the recognition of different levels of mental workload. The data were recorded from three experiments, where participants were induced to different levels of workload through tasks of increasing cognition demand. The first involved playing the N-back test, which combines memory recall with arithmetical skills. The second was playing Heat-the-Chair, a serious game specifically designed to emphasize and monitor subjects under controlled concurrent tasks. The third was flying in an Airbus320 simulator and solving several critical situations. The design of the dataset has been validated on three different levels: (1) correlation of the theoretical difficulty of each scenario to the self-perceived difficulty and performance of subjects; (2) significant difference in EEG temporal patterns across the theoretical difficulties and (3) usefulness for the training and evaluation of AI models.
|
Ajian Liu, Zichang Tan, Jun Wan, Sergio Escalera, Guodong Guo, & Stan Z. Li. (2021). CASIA-SURF CeFA: A Benchmark for Multi-modal Cross-Ethnicity Face Anti-Spoofing. In IEEE Winter Conference on Applications of Computer Vision (pp. 1178–1186).
Abstract: The issue of ethnic bias has been shown in previous works to affect the performance of face recognition, while it remains unstudied in face anti-spoofing. Therefore, in order to study the ethnic bias for face anti-spoofing, we introduce the largest CASIA-SURF Cross-ethnicity Face Anti-spoofing (CeFA) dataset, covering 3 ethnicities, 3 modalities, 1,607 subjects, and 2D plus 3D attack types. Five protocols are introduced to measure the effect under varied evaluation conditions, such as cross-ethnicity, unknown spoofs or both of them. To our knowledge, CASIA-SURF CeFA is the first dataset including explicit ethnic labels among currently released datasets. Then, we propose a novel multi-modal fusion method as a strong baseline to alleviate the ethnic bias, which employs a partially shared fusion strategy to learn complementary information from multiple modalities. Extensive experiments have been conducted on the proposed dataset to verify its significance and generalization capability for other existing datasets, i.e., CASIA-SURF, OULU-NPU and SiW datasets. The dataset is available at https://sites.google.com/qq.com/face-anti-spoofing/welcome/challengecvpr2020?authuser=0.
|
Mohamed Ali Souibgui, & Y. Kessentini. (2022). DE-GAN: A Conditional Generative Adversarial Network for Document Enhancement. TPAMI - IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(3), 1180–1191.
Abstract: Documents often exhibit various forms of degradation, which make them hard to read and substantially deteriorate the performance of an OCR system. In this paper, we propose an effective end-to-end framework named Document Enhancement Generative Adversarial Networks (DE-GAN) that uses conditional GANs (cGANs) to restore severely degraded document images. To the best of our knowledge, this practice has not been studied within the context of generative adversarial deep networks. We demonstrate that, in different tasks (document clean up, binarization, deblurring and watermark removal), DE-GAN can produce a high-quality enhanced version of the degraded document. In addition, our approach provides consistent improvements compared to state-of-the-art methods over the widely used DIBCO 2013, DIBCO 2017 and H-DIBCO 2018 datasets, proving its ability to restore a degraded document image to its ideal condition. The results obtained on a wide variety of degradations reveal the flexibility of the proposed model to be exploited in other document enhancement problems.
|