Jose Antonio Rodriguez, Gemma Sanchez, & Josep Llados. (2007). A Pen-based Interface for Real-time Document Edition. In 9th International Conference on Document Analysis and Recognition (Vol. 2, pp. 939–944).
|
Patricia Marquez, Debora Gil, & Aura Hernandez-Sabate. (2012). Error Analysis for Lucas-Kanade Based Schemes. In 9th International Conference on Image Analysis and Recognition (Vol. 7324, pp. 184–191). LNCS. Springer-Verlag Berlin Heidelberg.
Abstract: Optical flow is a valuable tool for motion analysis in medical imaging sequences. A reliable application requires determining the accuracy of the computed optical flow. This is a main challenge given the absence of ground truth in medical sequences. This paper presents an error analysis of Lucas-Kanade schemes in terms of intrinsic design errors and numerical stability of the algorithm. Our analysis provides a confidence measure that is naturally correlated to the accuracy of the flow field. Our experiments show the higher predictive value of our confidence measure compared to existing measures.
Keywords: Optical flow, Confidence measure, Lucas-Kanade, Cardiac Magnetic Resonance
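As a rough illustration of the numerical-stability side of a Lucas-Kanade scheme, the sketch below solves the 2x2 normal equations for a single window and reports the condition number of the structure tensor. This is a standard conditioning indicator, not the confidence measure proposed in the paper; the function name and its list-based inputs are assumptions made for the example.

```python
import math


def lucas_kanade_window(ix, iy, it):
    """Solve the Lucas-Kanade normal equations for one window.

    ix, iy, it are flat lists holding the spatial and temporal image
    derivatives at each pixel of the window (hypothetical inputs).
    Returns (u, v, cond), where cond is the condition number of the
    2x2 structure tensor, or (None, None, inf) when the tensor is
    numerically singular (e.g. the aperture problem).
    """
    # Entries of the symmetric structure tensor [[a, b], [b, c]].
    a = sum(x * x for x in ix)
    b = sum(x * y for x, y in zip(ix, iy))
    c = sum(y * y for y in iy)
    # Right-hand side of the normal equations.
    p = -sum(x * t for x, t in zip(ix, it))
    q = -sum(y * t for y, t in zip(iy, it))

    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mean = (a + c) / 2
    dev = math.sqrt(((a - c) / 2) ** 2 + b * b)
    lam_max, lam_min = mean + dev, mean - dev
    if lam_min <= 1e-12 * max(lam_max, 1.0):
        return None, None, math.inf

    # Cramer's rule for the 2x2 system.
    det = a * c - b * b
    u = (c * p - b * q) / det
    v = (a * q - b * p) / det
    return u, v, lam_max / lam_min
```

A condition number close to 1 indicates a well-posed window, while a very large or infinite value flags windows where the estimated flow should not be trusted.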
|
Ricard Borras, Agata Lapedriza, & Laura Igual. (2012). Depth Information in Human Gait Analysis: An Experimental Study on Gender Recognition. In 9th International Conference on Image Analysis and Recognition (Vol. 7325, pp. 98–105). Springer Berlin Heidelberg.
Abstract: This work presents DGait, a new gait database acquired with a depth camera. This database contains videos from 53 subjects walking in different directions. The intent of this database is to provide a public set to explore whether depth can be used as an additional information source for gait classification purposes. Each video is labelled according to subject, gender and age. Furthermore, for each subject and view point, we provide initial and final frames of an entire walk cycle. In addition, we perform gait-based gender classification experiments with the DGait database to illustrate the usefulness of depth information for this purpose. In our experiments, we extract 2D and 3D gait features based on shape descriptors, and compare the performance of these features for gender identification, using a Kernel SVM. The obtained results show that depth can be an information source of great relevance for gait classification problems.
|
Fernando Barrera, Felipe Lumbreras, & Angel Sappa. (2012). Evaluation of Similarity Functions in Multimodal Stereo. In 9th International Conference on Image Analysis and Recognition (Vol. 7324, pp. 320–329). LNCS. Springer Berlin Heidelberg.
Abstract: This paper presents an evaluation framework for multimodal stereo matching, which allows the performance of four similarity functions to be compared. Additionally, it presents details of a multimodal stereo head that supplies thermal infrared and color images, as well as aspects of its calibration and rectification. The pipeline includes a novel method for disparity selection, which is suitable for evaluating the similarity functions. Finally, a benchmark for comparing different initializations of the proposed framework is presented. Similarity functions are based on mutual information, gradient orientation and scale space representations. Their evaluation is performed using two metrics: i) disparity error, and ii) number of correct matches on planar regions. In addition to the proposed evaluation, the current paper also shows that 3D sparse representations can be recovered from such a multimodal stereo head.
Keywords: Aveiro, Portugal
|
Miguel Oliveira, Angel Sappa, & V. Santos. (2012). Color Correction using 3D Gaussian Mixture Models. In 9th International Conference on Image Analysis and Recognition (Vol. 7324, pp. 97–106). LNCS. Springer Berlin Heidelberg.
Abstract: The current paper proposes a novel color correction approach based on a probabilistic segmentation framework by using 3D Gaussian Mixture Models. Regions are used to compute local color correction functions, which are then combined to obtain the final corrected image. The proposed approach is evaluated using both a recently published metric and two large data sets composed of seventy images. The evaluation is performed by comparing our algorithm with eight well known color correction algorithms. Results show that the proposed approach is the highest scoring color correction method. Also, the proposed single step 3D color space probabilistic segmentation reduces processing time over similar approaches.
|
Laura Igual, Joan Carles Soliva, Roger Gimeno, Sergio Escalera, Oscar Vilarroya, & Petia Radeva. (2012). Automatic Internal Segmentation of Caudate Nucleus for Diagnosis of Attention Deficit Hyperactivity Disorder. In 9th International Conference on Image Analysis and Recognition (Vol. 7325, pp. 222–229). LNCS.
Abstract: Poster
Studies on volumetric brain Magnetic Resonance Imaging (MRI) showed neuroanatomical abnormalities in pediatric Attention-Deficit/Hyperactivity Disorder (ADHD). In particular, the diminished right caudate volume is one of the most replicated findings among ADHD samples in morphometric MRI studies. In this paper, we propose a fully-automatic method for internal caudate nucleus segmentation based on machine learning. Moreover, the ratio between right caudate body volume and the bilateral caudate body volume is applied in an ADHD diagnostic test. We separately validate the automatic internal segmentation of caudate in head and body structures and the diagnostic test using real data from ADHD and control subjects. As a result, we show accurate internal caudate segmentation and similar performance between the proposed automatic diagnostic test and the manual annotation.
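The volume ratio mentioned in the abstract can be written out as a small helper. This is only a sketch: it assumes that the bilateral caudate body volume is the sum of the left and right body volumes, and the function name is hypothetical. No decision threshold is included because the abstract does not state one.

```python
def right_caudate_body_ratio(right_body_volume, left_body_volume):
    """Ratio of right caudate body volume to bilateral body volume.

    Assumption: bilateral volume = right + left body volume. Both
    inputs must use the same unit (e.g. cubic millimetres).
    """
    bilateral = right_body_volume + left_body_volume
    if bilateral <= 0:
        raise ValueError("bilateral caudate body volume must be positive")
    return right_body_volume / bilateral
```

With perfectly symmetric volumes the ratio is 0.5, and a diminished right caudate body gives a value below 0.5.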
|
Panagiota Spyridonos, Fernando Vilariño, Jordi Vitria, Fernando Azpiroz, & Petia Radeva. (2006). Anisotropic Feature Extraction from Endoluminal Images for Detection of Intestinal Contractions. In and J. Sporring M. N. R. Larsen (Ed.), 9th International Conference on Medical Image Computing and Computer–Assisted Intervention (Vol. 4191, pp. 161–168). LNCS. Berlin Heidelberg: Springer Verlag.
Abstract: Wireless endoscopy is a very recent and at the same time unique technique that makes it possible to visualize and study the occurrence of contractions and to analyze intestinal motility. Feature extraction is essential for obtaining efficient patterns to detect contractions in wireless video endoscopy of the small intestine. We propose a novel method based on anisotropic image filtering and efficient statistical classification of contraction features. In particular, we apply the image gradient tensor for mining informative skeletons from the original image and a sequence of descriptors for capturing the characteristic pattern of contractions. Features extracted from the endoluminal images were evaluated in terms of their discriminatory ability in correctly classifying images as either belonging to contractions or not. Classification was performed by means of a support vector machine classifier with a radial basis function kernel. Our classification rates gave a sensitivity of the order of 90.84% and a specificity of the order of 94.43%. These preliminary results highlight the high efficiency of the selected descriptors and support the feasibility of the proposed method in assisting the automatic detection and analysis of contractions.
|
Ellen J.L. Brunenberg, Oriol Pujol, Bart M. Ter Haar Romeny, & Petia Radeva. (2006). Automatic IVUS Segmentation of Atherosclerotic Plaque with Stop & Go Snake.
|
Lluis Pere de las Heras, Joan Mas, Gemma Sanchez, & Ernest Valveny. (2011). Descriptor-based Svm Wall Detector. In 9th International Workshop on Graphic Recognition.
Abstract: Architectural floorplans exhibit a large variability in notation. Therefore, segmenting and identifying the elements of any kind of plan becomes a challenging task for approaches based on grouping structural primitives obtained by vectorization. Recently, a patch-based segmentation method working at pixel level and relying on the construction of a visual vocabulary has been proposed showing its adaptability to different notations by automatically learning the visual appearance of the elements in each different notation. In this paper we describe an evolution of this new approach in two directions: firstly we evaluate different features to obtain the description of every patch. Secondly, we train an SVM classifier to obtain the category of every patch instead of constructing a visual vocabulary. These modifications of the method have been tested for wall detection on two datasets of architectural floorplans with different notations and compared with the results obtained with the original approach.
|
Mohamed Ilyes Lakhal, Albert Clapes, Sergio Escalera, Oswald Lanz, & Andrea Cavallaro. (2018). Residual Stacked RNNs for Action Recognition. In 9th International Workshop on Human Behavior Understanding (pp. 534–548).
Abstract: Action recognition pipelines that use Recurrent Neural Networks (RNN) are currently 5–10% less accurate than Convolutional Neural Networks (CNN). While most works that use RNNs employ a 2D CNN on each frame to extract descriptors for action recognition, we extract spatiotemporal features from a 3D CNN and then learn the temporal relationship of these descriptors through a stacked residual recurrent neural network (Res-RNN). We introduce for the first time residual learning to counter the degradation problem in multi-layer RNNs, which have been successful for temporal aggregation in two-stream action recognition pipelines. Finally, we use a late fusion strategy to combine RGB and optical flow data of the two-stream Res-RNN. Experimental results show that the proposed pipeline achieves competitive results on UCF-101 and state-of-the-art results for RNN-like architectures on the challenging HMDB-51 dataset.
Keywords: Action recognition; Deep residual learning; Two-stream RNN
|
Santiago Segui, Laura Igual, & Jordi Vitria. (2010). Weighted Bagging for Graph based One-Class Classifiers. In 9th International Workshop on Multiple Classifier Systems (Vol. 5997, pp. 1–10). LNCS. Springer Berlin Heidelberg.
Abstract: Most conventional learning algorithms require both positive and negative training data for achieving accurate classification results. However, the problem of learning classifiers from only positive data arises in many applications where negative data are too costly, difficult to obtain, or not available at all. Minimum Spanning Tree Class Descriptor (MSTCD) was presented as a method that achieves better accuracies than other one-class classifiers in high dimensional data. However, the presence of outliers in the target class severely harms the performance of this classifier. In this paper we propose two bagging strategies for MSTCD that reduce the influence of outliers in training data. We show the improved performance on both real and artificially contaminated data.
|
V. Valev, & Petia Radeva. (1995). Constructing Quantitative Non-Reducible Descriptors.
|
Sergio Vera, Miguel Angel Gonzalez Ballester, & Debora Gil. (2012). Optimal Medial Surface Generation for Anatomical Volume Representations. In MichaelW. David and Vannier H. and H. Yoshida (Ed.), Abdominal Imaging. Computational and Clinical Applications (Vol. 7601, pp. 265–273). Lecture Notes in Computer Science. Springer Berlin Heidelberg.
Abstract: Medial representations are a widely used technique in abdominal organ shape representation and parametrization. Those methods require good medial manifolds as a starting point. Any medial surface used to parametrize a volume should be simple enough to allow an easy manipulation and complete enough to allow an accurate reconstruction of the volume. Obtaining good quality medial surfaces is still a problem with current iterative thinning methods. This forces the usage of generic, pre-calculated medial templates that are adapted to the final shape at the cost of a drop in volume reconstruction. This paper describes an operator for generation of medial structures that generates clean and complete manifolds well suited for their further use in medial representations of abdominal organ volumes. While being simpler than thinning surfaces, experiments show its high performance in volume reconstruction and preservation of medial surface main branching topology.
Keywords: Medial surface representation; volume reconstruction
|
David Guillamet, & Jordi Vitria. (2001). Non-negative Matrix Factorization to Extract Part-Based Representations.
|
Y. Patel, Lluis Gomez, Marçal Rusiñol, Dimosthenis Karatzas, & C.V. Jawahar. (2019). Self-Supervised Visual Representations for Cross-Modal Retrieval. In ACM International Conference on Multimedia Retrieval (pp. 182–186).
Abstract: Cross-modal retrieval methods have been significantly improved in recent years with the use of deep neural networks and large-scale annotated datasets such as ImageNet and Places. However, collecting and annotating such datasets requires a tremendous amount of human effort and, besides, their annotations are limited to discrete sets of popular visual classes that may not be representative of the richer semantics found on large-scale cross-modal retrieval datasets. In this paper, we present a self-supervised cross-modal retrieval framework that leverages as training data the correlations between images and text on the entire set of Wikipedia articles. Our method consists of training a CNN to predict: (1) the semantic context of the article in which an image is most likely to appear as an illustration, and (2) the semantic context of its caption. Our experiments demonstrate that the proposed method is not only capable of learning discriminative visual representations for solving vision tasks like classification, but that the learned representations are better for cross-modal retrieval when compared to supervised pre-training of the network on the ImageNet dataset.
|