Quentin Angermann, Jorge Bernal, Cristina Sanchez Montes, Gloria Fernandez Esparrach, Xavier Gray, Olivier Romain, et al. (2017). Towards Real-Time Polyp Detection in Colonoscopy Videos: Adapting Still Frame-Based Methodologies for Video Sequences Analysis. In 4th International Workshop on Computer Assisted and Robotic Endoscopy (pp. 29–41).
Abstract: Colorectal cancer is the second cause of cancer death in United States: precursor lesions (polyps) detection is key for patient survival. Though colonoscopy is the gold standard screening tool, some polyps are still missed. Several computational systems have been proposed but none of them are used in the clinical room mainly due to computational constraints. Besides, most of them are built over still frame databases, decreasing their performance on video analysis due to the lack of output stability and not coping with associated variability on image quality and polyp appearance. We propose a strategy to adapt these methods to video analysis by adding a spatio-temporal stability module and studying a combination of features to capture polyp appearance variability. We validate our strategy, incorporated on a real-time detection method, on a public video database. Resulting method detects all
polyps under real time constraints, increasing its performance due to our
adaptation strategy.
Keywords: Polyp detection; colonoscopy; real time; spatio temporal coherence
|
Carles Sanchez, Debora Gil, Jorge Bernal, F. Javier Sanchez, Marta Diez-Ferrer, & Antoni Rosell. (2016). Navigation Path Retrieval from Videobronchoscopy using Bronchial Branches. In 19th International Conference on Medical Image Computing and Computer Assisted Intervention Workshops (Vol. 9401, pp. 62–70). LNCS.
Abstract: Bronchoscopy biopsy can be used to diagnose lung cancer without risking complications of other interventions like transthoracic needle aspiration. During bronchoscopy, the clinician has to navigate through the bronchial tree to the target lesion. A main drawback is the difficulty to check whether the exploration is following the correct path. The usual guidance using fluoroscopy implies repeated radiation of the clinician, while alternative systems (like electromagnetic navigation) require specific equipment that increases intervention costs. We propose to compute the navigated path using anatomical landmarks extracted from the sole analysis of videobronchoscopy images. Such landmarks allow matching the current exploration to the path previously planned on a CT to indicate clinician whether the planning is being correctly followed or not. We present a feasibility study of our landmark based CT-video matching using bronchoscopic videos simulated on a virtual bronchoscopy interactive interface.
Keywords: Bronchoscopy navigation; Lumen center; Brochial branches; Navigation path; Videobronchoscopy
|
Cristhian A. Aguilera-Carrasco, Angel Sappa, & Ricardo Toledo. (2015). LGHD: a Feature Descriptor for Matching Across Non-Linear Intensity Variations. In 22th IEEE International Conference on Image Processing (pp. 178–181).
|
David Guillamet, B. Shiele, & Jordi Vitria. (2002). Analyzing Non-negative Matrix Factorization for Image Classification..
|
David Guillamet, & Jordi Vitria. (2002). Determining a Suitable Metric when using Non-negative Matrix Factorization..
|
Ernest Valveny, & B. Lamiroy. (2002). Automatic Generation of Browsable Technical Documents..
|
Marc Masana, Joost Van de Weijer, & Andrew Bagdanov. (2016). On-the-fly Network pruning for object detection. In International conference on learning representations.
Abstract: Object detection with deep neural networks is often performed by passing a few
thousand candidate bounding boxes through a deep neural network for each image.
These bounding boxes are highly correlated since they originate from the same
image. In this paper we investigate how to exploit feature occurrence at the image scale to prune the neural network which is subsequently applied to all bounding boxes. We show that removing units which have near-zero activation in the image allows us to significantly reduce the number of parameters in the network. Results on the PASCAL 2007 Object Detection Challenge demonstrate that up to 40% of units in some fully-connected layers can be entirely eliminated with little change in the detection result.
|
Petia Radeva, & Jordi Vitria. (2004). Discriminant Projections Embedding for Nearest Neighbor Classification.
|
Oriol Pujol, Petia Radeva, Jordi Vitria, & J. Mauri. (2004). Adaboost to Classify Plaque Appearance in IVUS Images.
|
Fahad Shahbaz Khan, Muhammad Anwer Rao, Joost Van de Weijer, Andrew Bagdanov, Maria Vanrell, & Antonio Lopez. (2012). Color Attributes for Object Detection. In 25th IEEE Conference on Computer Vision and Pattern Recognition (pp. 3306–3313). IEEE Xplore.
Abstract: State-of-the-art object detectors typically use shape information as a low level feature representation to capture the local structure of an object. This paper shows that early fusion of shape and color, as is popular in image classification,
leads to a significant drop in performance for object detection. Moreover, such approaches also yields suboptimal results for object categories with varying importance of color and shape.
In this paper we propose the use of color attributes as an explicit color representation for object detection. Color attributes are compact, computationally efficient, and when combined with traditional shape features provide state-ofthe-
art results for object detection. Our method is tested on the PASCAL VOC 2007 and 2009 datasets and results clearly show that our method improves over state-of-the-art techniques despite its simplicity. We also introduce a new dataset consisting of cartoon character images in which color plays a pivotal role. On this dataset, our approach yields a significant gain of 14% in mean AP over conventional state-of-the-art methods.
Keywords: pedestrian detection
|
Albert Gordo, & Florent Perronnin. (2011). Asymmetric Distances for Binary Embeddings. In IEEE Conference on Computer Vision and Pattern Recognition (pp. 729–736).
Abstract: In large-scale query-by-example retrieval, embedding image signatures in a binary space offers two benefits: data compression and search efficiency. While most embedding algorithms binarize both query and database signatures, it has been noted that this is not strictly a requirement. Indeed, asymmetric schemes which binarize the database signatures but not the query still enjoy the same two benefits but may provide superior accuracy. In this work, we propose two general asymmetric distances which are applicable to a wide variety of embedding techniques including Locality Sensitive Hashing (LSH), Locality Sensitive Binary Codes (LSBC), Spectral Hashing (SH) and Semi-Supervised Hashing (SSH). We experiment on four public benchmarks containing up to 1M images and show that the proposed asymmetric distances consistently lead to large improvements over the symmetric Hamming distance for all binary embedding techniques. We also propose a novel simple binary embedding technique – PCA Embedding (PCAE) – which is shown to yield competitive results with respect to more complex algorithms such as SH and SSH.
|
Marc Serra, Olivier Penacchio, Robert Benavente, & Maria Vanrell. (2012). Names and Shades of Color for Intrinsic Image Estimation. In 25th IEEE Conference on Computer Vision and Pattern Recognition (pp. 278–285). IEEE Xplore.
Abstract: In the last years, intrinsic image decomposition has gained attention. Most of the state-of-the-art methods are based on the assumption that reflectance changes come along with strong image edges. Recently, user intervention in the recovery problem has proved to be a remarkable source of improvement. In this paper, we propose a novel approach that aims to overcome the shortcomings of pure edge-based methods by introducing strong surface descriptors, such as the color-name descriptor which introduces high-level considerations resembling top-down intervention. We also use a second surface descriptor, termed color-shade, which allows us to include physical considerations derived from the image formation model capturing gradual color surface variations. Both color cues are combined by means of a Markov Random Field. The method is quantitatively tested on the MIT ground truth dataset using different error metrics, achieving state-of-the-art performance.
|
Murad Al Haj, Jordi Gonzalez, & Larry S. Davis. (2012). On Partial Least Squares in Head Pose Estimation: How to simultaneously deal with misalignment. In 25th IEEE Conference on Computer Vision and Pattern Recognition (pp. 2602–2609). IEEE Xplore.
Abstract: Head pose estimation is a critical problem in many computer vision applications. These include human computer interaction, video surveillance, face and expression recognition. In most prior work on heads pose estimation, the positions of the faces on which the pose is to be estimated are specified manually. Therefore, the results are reported without studying the effect of misalignment. We propose a method based on partial least squares (PLS) regression to estimate pose and solve the alignment problem simultaneously. The contributions of this paper are two-fold: 1) we show that the kernel version of PLS (kPLS) achieves better than state-of-the-art results on the estimation problem and 2) we develop a technique to reduce misalignment based on the learned PLS factors.
|
Jose Carlos Rubio, Joan Serrat, & Antonio Lopez. (2012). Unsupervised co-segmentation through region matching. In 25th IEEE Conference on Computer Vision and Pattern Recognition (pp. 749–756). IEEE Xplore.
Abstract: Co-segmentation is defined as jointly partitioning multiple images depicting the same or similar object, into foreground and background. Our method consists of a multiple-scale multiple-image generative model, which jointly estimates the foreground and background appearance distributions from several images, in a non-supervised manner. In contrast to other co-segmentation methods, our approach does not require the images to have similar foregrounds and different backgrounds to function properly. Region matching is applied to exploit inter-image information by establishing correspondences between the common objects that appear in the scene. Moreover, computing many-to-many associations of regions allow further applications, like recognition of object parts across images. We report results on iCoseg, a challenging dataset that presents extreme variability in camera viewpoint, illumination and object deformations and poses. We also show that our method is robust against large intra-class variability in the MSRC database.
|
Albert Gordo, Jose Antonio Rodriguez, Florent Perronnin, & Ernest Valveny. (2012). Leveraging category-level labels for instance-level image retrieval. In 25th IEEE Conference on Computer Vision and Pattern Recognition (pp. 3045–3052). IEEE Xplore.
Abstract: In this article, we focus on the problem of large-scale instance-level image retrieval. For efficiency reasons, it is common to represent an image by a fixed-length descriptor which is subsequently encoded into a small number of bits. We note that most encoding techniques include an unsupervised dimensionality reduction step. Our goal in this work is to learn a better subspace in a supervised manner. We especially raise the following question: “can category-level labels be used to learn such a subspace?” To answer this question, we experiment with four learning techniques: the first one is based on a metric learning framework, the second one on attribute representations, the third one on Canonical Correlation Analysis (CCA) and the fourth one on Joint Subspace and Classifier Learning (JSCL). While the first three approaches have been applied in the past to the image retrieval problem, we believe we are the first to show the usefulness of JSCL in this context. In our experiments, we use ImageNet as a source of category-level labels and report retrieval results on two standard dataseis: INRIA Holidays and the University of Kentucky benchmark. Our experimental study shows that metric learning and attributes do not lead to any significant improvement in retrieval accuracy, as opposed to CCA and JSCL. As an example, we report on Holidays an increase in accuracy from 39.3% to 48.6% with 32-dimensional representations. Overall JSCL is shown to yield the best results.
|