|
Aura Hernandez-Sabate, Monica Mitiko, Sergio Shiguemi, & Debora Gil. (2010). A validation protocol for assessing cardiac phase retrieval in IntraVascular UltraSound. In Computing in Cardiology (Vol. 37, pp. 899–902). IEEE.
Abstract: A good, reliable approach to cardiac triggering is of utmost importance in obtaining accurate quantitative results of atherosclerotic plaque burden from the analysis of IntraVascular UltraSound. Although, in recent years, there has been an increase in research on methods for retrospective gating, there is no general consensus on a validation protocol. Many methods are based on quality assessment of the appearance of longitudinal cuts, and those reporting quantitative numbers do not follow a standard protocol. Such heterogeneity in validation protocols makes faithful comparison across methods a difficult task. We propose a validation protocol based on the variability of the retrieved cardiac phase and explore the capability of several quality measures for quantifying such variability. An ideal detector, suitable for application in clinical practice, should produce stable phases. That is, it should always sample the same cardiac cycle fraction. In this context, one should measure the variability (variance) of a candidate sampling with respect to a ground truth (reference) sampling, since the variance indicates how widely the samples are spread around the target. In order to quantify the deviation between the sampling and the ground truth, we have considered two quality scores reported in the literature: signed distance to the closest reference sample and distance to the right of each reference sample. We have also considered the residuals of the regression line of reference against candidate sampling. The performance of the measures has been explored on a set of synthetic samplings covering different cardiac cycle fractions and variabilities. From our simulations, we conclude that the metrics related to distances are sensitive to the shift considered, while the residuals are robust against fraction and variabilities as long as one can establish a pair-wise correspondence between candidate and reference. We will further investigate the impact of false positive and negative detections in experimental data.
|
|
|
Adrien Gaidon, Antonio Lopez, & Florent Perronnin. (2018). The Reasonable Effectiveness of Synthetic Visual Data. IJCV - International Journal of Computer Vision, 126(9), 899–901.
|
|
|
Jose Carlos Rubio, Joan Serrat, Antonio Lopez, & Daniel Ponsa. (2010). Multiple-target tracking for the intelligent headlights control. In 13th Annual International Conference on Intelligent Transportation Systems (pp. 903–910).
Abstract: Intelligent vehicle lighting systems aim at automatically regulating the headlights' beam to illuminate as much of the road ahead as possible while avoiding dazzling other drivers. A key component of such a system is computer vision software that is able to distinguish blobs due to vehicles' headlights and rear lights from those due to road lamps and reflective elements such as poles and traffic signs. In a previous work, we devised a set of specialized supervised classifiers to make such decisions based on blob features related to intensity and shape. Despite the overall good performance, there remain challenges that have yet to be solved: notably, faint and tiny blobs corresponding to quite distant vehicles. In fact, for such distant blobs, classification decisions can be taken after observing them during a few frames. Hence, incorporating tracking could improve the overall lighting system performance by enforcing the temporal consistency of the classifier decision. Accordingly, this paper focuses on the problem of constructing blob tracks, which is actually one of multiple-target tracking (MTT), but under two special conditions: we have to deal with frequent occlusions, as well as blob splits and merges. We approach it in a novel way by formulating the problem as a maximum a posteriori inference on a Markov random field. The qualitative (in video form) and quantitative evaluation of our new MTT method shows good tracking results. In addition, we will also see that the classification performance of the problematic blobs improves due to the proposed MTT algorithm.
Keywords: Intelligent Headlights
|
|
|
Jürgen Brauer, Wenjuan Gong, Jordi Gonzalez, & Michael Arens. (2011). On the Effect of Temporal Information on Monocular 3D Human Pose Estimation. In 2nd IEEE International Workshop on Analysis and Retrieval of Tracked Events and Motion in Imagery Streams (pp. 906–913).
Abstract: We address the task of estimating 3D human poses from monocular camera sequences. Many works make use of multiple consecutive frames for the estimation of a 3D pose in a frame. Although such an approach should ease the pose estimation task substantially, since multiple consecutive frames allow 2D projection ambiguities to be resolved in principle, it has not yet been investigated systematically how much we can improve the 3D pose estimates when using multiple consecutive frames as opposed to single-frame information. In this paper we analyze the difference in quality of 3D pose estimates based on different numbers of consecutive frames from which 2D pose estimates are available. We validate the use of temporal information on two major different approaches for human pose estimation: modeling and learning approaches. The results of our experiments show that both learning and modeling approaches benefit from using multiple frames as opposed to single-frame input, but that the benefit is small when the 2D pose estimates show a high quality in terms of precision.
|
|
|
C. Sbert, & A.F. Sole. (2000). Stereo reconstruction of 3D curves. In 15th International Conference on Pattern Recognition (Vol. 1, pp. 912–915).
|
|
|
Javad Zolfaghari Bengar, Abel Gonzalez-Garcia, Gabriel Villalonga, Bogdan Raducanu, Hamed H. Aghdam, Mikhail Mozerov, et al. (2019). Temporal Coherence for Active Learning in Videos. In IEEE International Conference on Computer Vision Workshops (pp. 914–923).
Abstract: Autonomous driving systems require huge amounts of data to train. Manual annotation of this data is time-consuming and prohibitively expensive since it involves human resources. Therefore, active learning emerged as an alternative to ease this effort and to make data annotation more manageable. In this paper, we introduce a novel active learning approach for object detection in videos by exploiting temporal coherence. Our active learning criterion is based on the estimated number of errors in terms of false positives and false negatives. The detections obtained by the object detector are used to define the nodes of a graph and tracked forward and backward to temporally link the nodes. Minimizing an energy function defined on this graphical model provides estimates of both false positives and false negatives. Additionally, we introduce a synthetic video dataset, called SYNTHIA-AL, specially designed to evaluate active learning for video object detection in road scenes. Finally, we show that our approach outperforms active learning baselines tested on two datasets.
|
|
|
Sounak Dey, Anjan Dutta, Suman Ghosh, Ernest Valveny, Josep Llados, & Umapada Pal. (2018). Learning Cross-Modal Deep Embeddings for Multi-Object Image Retrieval using Text and Sketch. In 24th International Conference on Pattern Recognition (pp. 916–921).
Abstract: In this work we introduce a cross-modal image retrieval system that allows both text and sketch as input modalities for the query. A cross-modal deep network architecture is formulated to jointly model the sketch and text input modalities as well as the image output modality, learning a common embedding between text and images and between sketches and images. In addition, an attention model is used to selectively focus the attention on the different objects of the image, allowing for retrieval with multiple objects in the query. Experiments show that the proposed method performs the best in both single and multiple object image retrieval in standard datasets.
|
|
|
Eduard Vazquez, Ramon Baldrich, Joost Van de Weijer, & Maria Vanrell. (2011). Describing Reflectances for Colour Segmentation Robust to Shadows, Highlights and Textures. TPAMI - IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(5), 917–930.
Abstract: The segmentation of a single material reflectance is a challenging problem due to the considerable variation in image measurements caused by the geometry of the object, shadows, and specularities. The combination of these effects has been modeled by the dichromatic reflection model. However, the application of the model to real-world images is limited due to unknown acquisition parameters and compression artifacts. In this paper, we present a robust model for the shape of a single material reflectance in histogram space. The method is based on a multilocal creaseness analysis of the histogram which results in a set of ridges representing the material reflectances. The segmentation method derived from these ridges is robust to shadows, shading, specularities, and texture in real-world images. We further complete the method by incorporating prior knowledge from image statistics, and incorporate spatial coherence by using multiscale color contrast information. Results obtained show that our method clearly outperforms state-of-the-art segmentation methods on a widely used segmentation benchmark, having as a main characteristic its excellent performance in the presence of shadows and highlights at low computational cost.
|
|
|
Arjan Gijsenij, Theo Gevers, & Joost Van de Weijer. (2012). Improving Color Constancy by Photometric Edge Weighting. TPAMI - IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(5), 918–929.
Abstract: Edge-based color constancy methods make use of image derivatives to estimate the illuminant. However, different edge types exist in real-world images, such as material, shadow and highlight edges. These different edge types may have a distinctive influence on the performance of the illuminant estimation. Therefore, in this paper, an extensive analysis is provided of different edge types on the performance of edge-based color constancy methods. First, an edge-based taxonomy is presented classifying edge types based on their photometric properties (e.g. material, shadow-geometry and highlights). Then, a performance evaluation of edge-based color constancy is provided using these different edge types. From this performance evaluation it is derived that specular and shadow edge types are more valuable than material edges for the estimation of the illuminant. To this end, the (iterative) weighted Grey-Edge algorithm is proposed in which these edge types are more emphasized for the estimation of the illuminant. Images that are recorded under controlled circumstances demonstrate that the proposed iterative weighted Grey-Edge algorithm based on highlights reduces the median angular error by approximately 25%. In an uncontrolled environment, improvements in angular error up to 11% are obtained with respect to regular edge-based color constancy.
|
|
|
Miquel Ferrer, Dimosthenis Karatzas, Ernest Valveny, I. Bardaji, & Horst Bunke. (2011). A Generic Framework for Median Graph Computation based on a Recursive Embedding Approach. CVIU - Computer Vision and Image Understanding, 115(7), 919–928.
Abstract: The median graph has been shown to be a good choice to obtain a representative of a set of graphs. However, its computation is a complex problem. Recently, graph embedding into vector spaces has been proposed to obtain approximations of the median graph. The problem with such an approach is how to go from a point in the vector space back to a graph in the graph space. The main contribution of this paper is the generalization of this previous method, proposing a generic recursive procedure that makes it possible to recover the graph corresponding to a point in the vector space, introducing only the amount of approximation inherent to the use of graph matching algorithms. In order to evaluate the proposed method, we compare it with the set median and with the other state-of-the-art embedding-based methods for the median graph computation. The experiments are carried out using four different databases (one semi-artificial and three containing real-world data). Results show that with the proposed approach we can obtain better medians, in terms of the sum of distances to the training graphs, than with the previous existing methods.
Keywords: Median Graph, Graph Embedding, Graph Matching, Structural Pattern Recognition
|
|
|
Shifeng Zhang, Xiaobo Wang, Ajian Liu, Chenxu Zhao, Jun Wan, Sergio Escalera, et al. (2019). A Dataset and Benchmark for Large-scale Multi-modal Face Anti-spoofing. In 32nd IEEE Conference on Computer Vision and Pattern Recognition (pp. 919–928).
Abstract: Face anti-spoofing is essential to prevent face recognition systems from a security breach. Much progress has been made in recent years thanks to the availability of face anti-spoofing benchmark datasets. However, existing face anti-spoofing benchmarks have a limited number of subjects (≤170) and modalities (≤2), which hinders the further development of the academic community. To facilitate face anti-spoofing research, we introduce a large-scale multi-modal dataset, namely CASIA-SURF, which is the largest publicly available dataset for face anti-spoofing in terms of both subjects and visual modalities. Specifically, it consists of 1,000 subjects with 21,000 videos and each sample has 3 modalities (i.e., RGB, Depth and IR). We also provide a measurement set, evaluation protocol and training/validation/testing subsets, developing a new benchmark for face anti-spoofing. Moreover, we present a new multi-modal fusion method as baseline, which performs feature re-weighting to select the more informative channel features while suppressing the less useful ones for each modality. Extensive experiments have been conducted on the proposed dataset to verify its significance and generalization capability. The dataset is available at https://sites.google.com/qq.com/chalearnfacespoofingattackdete/.
|
|
|
Hugo Jair Escalante, Victor Ponce, Sergio Escalera, Xavier Baro, Alicia Morales-Reyes, & Jose Martinez-Carranza. (2017). Evolving weighting schemes for the Bag of Visual Words. Neural Computing and Applications, 28(5), 925–939.
Abstract: The Bag of Visual Words (BoVW) is an established representation in computer vision. Taking inspiration from text mining, this representation has proved to be very effective in many domains. However, in most cases, standard term-weighting schemes are adopted (e.g., term-frequency or TF-IDF). The question remains open of whether alternative weighting schemes could boost the performance of methods based on BoVW. More importantly, it is unknown whether it is possible to automatically learn and determine effective weighting schemes from scratch. This paper sheds some light on both of these unknowns. On the one hand, we report an evaluation of the most common weighting schemes used in text mining, but rarely used in computer vision tasks. On the other hand, we propose an evolutionary algorithm capable of automatically learning weighting schemes for computer vision problems. We report empirical results of an extensive study in several computer vision problems. Results show the usefulness of the proposed method.
Keywords: Bag of Visual Words; Bag of features; Genetic programming; Term-weighting schemes; Computer vision
|
|
|
Fadi Dornaika, & Bogdan Raducanu. (2009). Three-Dimensional Face Pose Detection and Tracking Using Monocular Videos: Tool and Application. TSMCB - IEEE Transactions on Systems, Man and Cybernetics part B, 39(4), 935–944.
Abstract: Recently, we have proposed a real-time tracker that simultaneously tracks the 3-D head pose and facial actions in monocular video sequences that can be provided by low quality cameras. This paper has two main contributions. First, we propose an automatic 3-D face pose initialization scheme for the real-time tracker by adopting a 2-D face detector and an eigenface system. Second, we use the proposed methods (the initialization and tracking) for enhancing the human-machine interaction functionality of an AIBO robot. More precisely, we show how the orientation of the robot's camera (or any active vision system) can be controlled through the estimation of the user's head pose. Applications based on head-pose imitation such as telepresence, virtual reality, and video games can directly exploit the proposed techniques. Experiments on real videos confirm the robustness and usefulness of the proposed methods.
|
|
|
Carles Sanchez, Jorge Bernal, F. Javier Sanchez, Antoni Rosell, Marta Diez-Ferrer, & Debora Gil. (2015). Towards On-line Quantification of Tracheal Stenosis from Videobronchoscopy. IJCAR - International Journal of Computer Assisted Radiology and Surgery, 10(6), 935–945.
|
|
|
Carles Sanchez, Jorge Bernal, F. Javier Sanchez, Marta Diez-Ferrer, Antoni Rosell, & Debora Gil. (2015). Towards On-line Quantification of Tracheal Stenosis from Videobronchoscopy. In 6th International Conference on Information Processing in Computer-Assisted Interventions IPCAI2015 (Vol. 10, pp. 935–945).
Abstract: PURPOSE:
Lack of objective measurement of the degree of tracheal obstruction has a negative impact on the chosen treatment and is prone to lead to unnecessary repeated explorations and other scans. Accurate computation of tracheal stenosis in videobronchoscopy would constitute a breakthrough for this noninvasive technique and a reduction in operation cost for the public health service.
METHODS:
Stenosis calculation is based on the comparison of the region delimited by the lumen in an obstructed frame and the region delimited by the first visible ring in a healthy frame. We propose a parametric strategy for the extraction of lumen and tracheal ring regions based on models of their geometry and appearance that guide a deformable model. To ensure a systematic applicability, we present a statistical framework to choose optimal parametric values and a strategy to choose the frames that minimize the impact of scope optical distortion.
RESULTS:
Our method has been tested in 40 cases covering different stenosed tracheas. Experiments report a non-clinically relevant [Formula: see text] of discrepancy in the calculated stenotic area and a computational time allowing online implementation in the operating room.
CONCLUSIONS:
Our methodology allows reliable measurements of airway narrowing in the operating room. To fully assess its clinical impact, a prospective clinical trial should be done.
|
|