Santiago Segui, Oriol Pujol, & Jordi Vitria. (2015). Learning to count with deep object features. In Deep Vision: Deep Learning in Computer Vision, CVPR 2015 Workshop (pp. 90–96).
Abstract: Learning to count is a learning strategy that has been recently proposed in the literature for dealing with problems where estimating the number of object instances in a scene is the final objective. In this framework, the task of learning to detect and localize individual object instances is seen as a harder task that can be evaded by casting the problem as that of computing a regression value from hand-crafted image features. In this paper we explore the features that are learned when training a counting convolutional neural
network in order to understand their underlying representation.
To this end we define a counting problem for MNIST data and show that the internal representation of the network is able to classify digits in spite of the fact that no direct supervision was provided for them during training.
We also present preliminary results about a deep network that is able to count the number of pedestrians in a scene.
|
Santiago Segui, Michal Drozdzal, Petia Radeva, & Jordi Vitria. (2010). Severe Motility Diagnosis using WCE. In Medical Image Computing in Catalunya: Graduate Student Workshop (45–46).
|
Santiago Segui, Michal Drozdzal, Petia Radeva, & Jordi Vitria. (2012). An Integrated Approach to Contextual Face Detection. In 1st International Conference on Pattern Recognition Applications and Methods (pp. 143–150). Springer.
Abstract: Face detection is, in general, based on content-based detectors. Nevertheless, the face is a non-rigid object with well defined relations with respect to the human body parts. In this paper, we propose to take benefit of the context information in order to improve content-based face detections. We propose a novel framework for integrating multiple content- and context-based detectors in a discriminative way. Moreover, we develop an integrated scoring procedure that measures the ’faceness’ of each hypothesis and is used to discriminate the detection results. Our approach detects a higher rate of faces while minimizing the number of false detections, giving an average increase of more than 10% in average precision when comparing it to state-of-the art face detectors
|
Santiago Segui, Michal Drozdzal, Guillem Pascual, Petia Radeva, Carolina Malagelada, Fernando Azpiroz, et al. (2016). Generic Feature Learning for Wireless Capsule Endoscopy Analysis. CBM - Computers in Biology and Medicine, 79, 163–172.
Abstract: The interpretation and analysis of wireless capsule endoscopy (WCE) recordings is a complex task which requires sophisticated computer aided decision (CAD) systems to help physicians with video screening and, finally, with the diagnosis. Most CAD systems used in capsule endoscopy share a common system design, but use very different image and video representations. As a result, each time a new clinical application of WCE appears, a new CAD system has to be designed from the scratch. This makes the design of new CAD systems very time consuming. Therefore, in this paper we introduce a system for small intestine motility characterization, based on Deep Convolutional Neural Networks, which circumvents the laborious step of designing specific features for individual motility events. Experimental results show the superiority of the learned features over alternative classifiers constructed using state-of-the-art handcrafted features. In particular, it reaches a mean classification accuracy of 96% for six intestinal motility events, outperforming the other classifiers by a large margin (a 14% relative performance increase).
Keywords: Wireless capsule endoscopy; Deep learning; Feature learning; Motility analysis
|
Santiago Segui, Michal Drozdzal, Fernando Vilariño, Carolina Malagelada, Fernando Azpiroz, Petia Radeva, et al. (2012). Categorization and Segmentation of Intestinal Content Frames for Wireless Capsule Endoscopy. TITB - IEEE Transactions on Information Technology in Biomedicine, 16(6), 1341–1352.
Abstract: Wireless capsule endoscopy (WCE) is a device that allows the direct visualization of gastrointestinal tract with minimal discomfort for the patient, but at the price of a large amount of time for screening. In order to reduce this time, several works have proposed to automatically remove all the frames showing intestinal content. These methods label frames as {intestinal content – clear} without discriminating between types of content (with different physiological meaning) or the portion of image covered. In addition, since the presence of intestinal content has been identified as an indicator of intestinal motility, its accurate quantification can show a potential clinical relevance. In this paper, we present a method for the robust detection and segmentation of intestinal content in WCE images, together with its further discrimination between turbid liquid and bubbles. Our proposal is based on a twofold system. First, frames presenting intestinal content are detected by a support vector machine classifier using color and textural information. Second, intestinal content frames are segmented into {turbid, bubbles, and clear} regions. We show a detailed validation using a large dataset. Our system outperforms previous methods and, for the first time, discriminates between turbid from bubbles media.
|
Santiago Segui, Michal Drozdzal, Ekaterina Zaytseva, Fernando Azpiroz, Petia Radeva, & Jordi Vitria. (2014). Detection of wrinkle frames in endoluminal videos using betweenness centrality measures for images. TITB - IEEE Transactions on Information Technology in Biomedicine, 18(6), 1831–1838.
Abstract: Intestinal contractions are one of the most important events to diagnose motility pathologies of the small intestine. When visualized by wireless capsule endoscopy (WCE), the sequence of frames that represents a contraction is characterized by a clear wrinkle structure in the central frames that corresponds to the folding of the intestinal wall. In this paper we present a new method to robustly detect wrinkle frames in full WCE videos by using a new mid-level image descriptor that is based on a centrality measure proposed for graphs. We present an extended validation, carried out in a very large database, that shows that the proposed method achieves state of the art performance for this task.
Keywords: Wireless Capsule Endoscopy; Small Bowel Motility Dysfunction; Contraction Detection; Structured Prediction; Betweenness Centrality
|
Santiago Segui, Michal Drozdzal, Ekaterina Zaytseva, Carolina Malagelada, Fernando Azpiroz, Petia Radeva, et al. (2013). A new image centrality descriptor for wrinkle frame detection in WCE videos. In 13th IAPR Conference on Machine Vision Applications.
Abstract: Small bowel motility dysfunctions are a widespread functional disorder characterized by abdominal pain and altered bowel habits in the absence of specific and unique organic pathology. Current methods of diagnosis are complex and can only be conducted at some highly specialized referral centers. Wireless Video Capsule Endoscopy (WCE) could be an interesting diagnostic alternative that presents excellent clinical advantages, since it is non-invasive and can be conducted by non specialists. The purpose of this work is to present a new method for the detection of wrinkle frames in WCE, a critical characteristic to detect one of the main motility events: contractions. The method goes beyond the use of one of the classical image feature, the Histogram
|
Santiago Segui, Laura Igual, Petia Radeva, Carolina Malagelada, Fernando Azpiroz, & Jordi Vitria. (2007). A Semi-Supervised Learning Method for Motility Disease Diagnostic. In Progress in Pattern Recognition, Image Analysis and Applications, 12th Iberoamerican Congress on Pattern (CIARP 2007), LCNS 4756:773–782, ISBN 978–3–540–76724–4.
|
Santiago Segui, Laura Igual, & Jordi Vitria. (2010). Weighted Bagging for Graph based One-Class Classifiers. In 9th International Workshop on Multiple Classifier Systems (Vol. 5997, pp. 1–10). LNCS. Springer Berlin Heidelberg.
Abstract: Most conventional learning algorithms require both positive and negative training data for achieving accurate classification results. However, the problem of learning classifiers from only positive data arises in many applications where negative data are too costly, difficult to obtain, or not available at all. Minimum Spanning Tree Class Descriptor (MSTCD) was presented as a method that achieves better accuracies than other one-class classifiers in high dimensional data. However, the presence of outliers in the target class severely harms the performance of this classifier. In this paper we propose two bagging strategies for MSTCD that reduce the influence of outliers in training data. We show the improved performance on both real and artificially contaminated data.
|
Santiago Segui, Laura Igual, & Jordi Vitria. (2013). Bagged One Class Classifiers in the Presence of Outliers. IJPRAI - International Journal of Pattern Recognition and Artificial Intelligence, 27(5), 1350014–1350035.
Abstract: The problem of training classifiers only with target data arises in many applications where non-target data are too costly, difficult to obtain, or not available at all. Several one-class classification methods have been presented to solve this problem, but most of the methods are highly sensitive to the presence of outliers in the target class. Ensemble methods have therefore been proposed as a powerful way to improve the classification performance of binary/multi-class learning algorithms by introducing diversity into classifiers.
However, their application to one-class classification has been rather limited. In
this paper, we present a new ensemble method based on a non-parametric weighted bagging strategy for one-class classification, to improve accuracy in the presence of outliers. While the standard bagging strategy assumes a uniform data distribution, the method we propose here estimates a probability density based on a forest structure of the data. This assumption allows the estimation of data distribution from the computation of simple univariate and bivariate kernel densities. Experiments using original and noisy versions of 20 different datasets show that bagging ensemble methods applied to different one-class classifiers outperform base one-class classification methods. Moreover, we show that, in noisy versions of the datasets, the non-parametric weighted bagging strategy we propose outperforms the classical bagging strategy in a statistically significant way.
Keywords: One-class Classifier; Ensemble Methods; Bagging and Outliers
|
Santiago Segui, Laura Igual, Fernando Vilariño, Petia Radeva, Carolina Malagelada, Fernando Azpiroz, et al. (2008). Diagnostic System for Intestinal Motility Disfunctions Using Video Capsule Endoscopy. In and J.K. Tsotsos M. V. A. Gasteratos (Ed.), Computer Vision Systems. 6th International (Vol. 5008, 251–260). LNCS. Berlin Heidelberg: Springer-Verlag.
Abstract: Wireless Video Capsule Endoscopy is a clinical technique consisting of the analysis of images from the intestine which are pro- vided by an ingestible device with a camera attached to it. In this paper we propose an automatic system to diagnose severe intestinal motility disfunctions using the video endoscopy data. The system is based on the application of computer vision techniques within a machine learn- ing framework in order to obtain the characterization of diverse motil- ity events from video sequences. We present experimental results that demonstrate the effectiveness of the proposed system and compare them with the ground-truth provided by the gastroenterologists.
|
Santiago Segui. (2007). A Sparse Bayesian Approach for Joint Feature Selection and Classifier Learning.
|
Santiago Segui. (2011). Contributions to the Diagnosis of Intestinal Motility by Automatic Image Analysis (Jordi Vitria, Ed.). Ph.D. thesis, Ediciones Graficas Rey, .
Abstract: In the early twenty first century Given Imaging Ltd. presented wireless capsule endoscopy (WCE) as a new technological breakthrough that allowed the visualization of
the intestine by using a small, swallowed camera. This small size device was received
with a high enthusiasm within the medical community, and until now, it is still one
of the medical devices with the highest use growth rate. WCE can be used as a novel
diagnostic tool that presents several clinical advantages, since it is non-invasive and
at the same time it provides, for the first time, a full picture of the small bowel morphology, contents and dynamics. Since its appearance, the WCE has been used to
detect several intestinal dysfunctions such as: polyps, ulcers and bleeding. However,
the visual analysis of WCE videos presents an important drawback: the long time
required by the physicians for proper video visualization. In this sense and regarding
to this limitation, the development of computer aided systems is required for the extensive use of WCE in the medical community.
The work presented in this thesis is a set of contributions for the automatic image
analysis and computer-aided diagnosis of intestinal motility disorders using WCE.
Until now, the diagnosis of small bowel motility dysfunctions was basically performed
by invasive techniques such as the manometry test, which can only be conducted at
some referral centers around the world owing to the complexity of the procedure and
the medial expertise required in the interpretation of the results.
Our contributions are divided in three main blocks:
1. Image analysis by computer vision techniques to detect events in the endoluminal WCE scene. Several methods have been proposed to detect visual events
such as: intestinal contractions, intestinal content, tunnel and wrinkles;
2. Machine learning techniques for the analysis and the manipulation of the data
from WCE. These methods have been proposed in order to overcome the problems that the analysis of WCE presents such as: video acquisition cost, unlabeled data and large number of data;
3. Two different systems for the computer-aided diagnosis of intestinal motility
disorders using WCE. The first system presents a fully automatic method that
aids at discriminating healthy subjects from patients with severe intestinal motor disorders like pseudo-obstruction or food intolerance. The second system presents another automatic method that models healthy subjects and discriminate them from mild intestinal motility patients.
|
Santi Puch, Irina Sanchez, Aura Hernandez-Sabate, Gemma Piella, & Vesna Prckovska. (2018). Global Planar Convolutions for Improved Context Aggregation in Brain Tumor Segmentation. In International MICCAI Brainlesion Workshop (Vol. 11384, pp. 393–405). LNCS.
Abstract: In this work, we introduce the Global Planar Convolution module as a building-block for fully-convolutional networks that aggregates global information and, therefore, enhances the context perception capabilities of segmentation networks in the context of brain tumor segmentation. We implement two baseline architectures (3D UNet and a residual version of 3D UNet, ResUNet) and present a novel architecture based on these two architectures, ContextNet, that includes the proposed Global Planar Convolution module. We show that the addition of such module eliminates the need of building networks with several representation levels, which tend to be over-parametrized and to showcase slow rates of convergence. Furthermore, we provide a visual demonstration of the behavior of GPC modules via visualization of intermediate representations. We finally participate in the 2018 edition of the BraTS challenge with our best performing models, that are based on ContextNet, and report the evaluation scores on the validation and the test sets of the challenge.
Keywords: Brain tumors; 3D fully-convolutional CNN; Magnetic resonance imaging; Global planar convolution
|
Sanket Biswas, Pau Riba, Josep Llados, & Umapada Pal. (2021). DocSynth: A Layout Guided Approach for Controllable Document Image Synthesis. In 16th International Conference on Document Analysis and Recognition (Vol. 12823, 555–568). LNCS.
Abstract: Despite significant progress on current state-of-the-art image generation models, synthesis of document images containing multiple and complex object layouts is a challenging task. This paper presents a novel approach, called DocSynth, to automatically synthesize document images based on a given layout. In this work, given a spatial layout (bounding boxes with object categories) as a reference by the user, our proposed DocSynth model learns to generate a set of realistic document images consistent with the defined layout. Also, this framework has been adapted to this work as a superior baseline model for creating synthetic document image datasets for augmenting real data during training for document layout analysis tasks. Different sets of learning objectives have been also used to improve the model performance. Quantitatively, we also compare the generated results of our model with real data using standard evaluation metrics. The results highlight that our model can successfully generate realistic and diverse document images with multiple objects. We also present a comprehensive qualitative analysis summary of the different scopes of synthetic image generation tasks. Lastly, to our knowledge this is the first work of its kind.
|