|
Jaume Amores. (2015). MILDE: multiple instance learning by discriminative embedding. KAIS - Knowledge and Information Systems, 42(2), 381–407.
Abstract: While the objective of the standard supervised learning problem is to classify feature vectors, in the multiple instance learning problem, the objective is to classify bags, where each bag contains multiple feature vectors. This represents a generalization of the standard problem, and this generalization becomes necessary in many real applications such as drug activity prediction, content-based image retrieval, and others. While the existing paradigms are based on learning the discriminant information either at the instance level or at the bag level, we propose to incorporate both levels of information. This is done by defining a discriminative embedding of the original space based on the responses of cluster-adapted instance classifiers. Results clearly show the advantage of the proposed method over the state of the art, where we tested the performance through a variety of well-known databases that come from real problems, and we also included an analysis of the performance using synthetically generated data.
Keywords: Multi-instance learning; Codebook; Bag of words
|
|
|
Albert Gordo, Florent Perronnin, Yunchao Gong, & Svetlana Lazebnik. (2014). Asymmetric Distances for Binary Embeddings. TPAMI - IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(1), 33–47.
Abstract: In large-scale query-by-example retrieval, embedding image signatures in a binary space offers two benefits: data compression and search efficiency. While most embedding algorithms binarize both query and database signatures, it has been noted that this is not strictly a requirement. Indeed, asymmetric schemes which binarize the database signatures but not the query still enjoy the same two benefits but may provide superior accuracy. In this work, we propose two general asymmetric distances which are applicable to a wide variety of embedding techniques including Locality Sensitive Hashing (LSH), Locality Sensitive Binary Codes (LSBC), Spectral Hashing (SH), PCA Embedding (PCAE), PCA Embedding with random rotations (PCAE-RR), and PCA Embedding with iterative quantization (PCAE-ITQ). We experiment on four public benchmarks containing up to 1M images and show that the proposed asymmetric distances consistently lead to large improvements over the symmetric Hamming distance for all binary embedding techniques.
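The symmetric versus asymmetric comparison described in this abstract can be illustrated with a small sketch. It is not the paper's method: the abstract does not define its two asymmetric distances, so the weighted-disagreement score below is a stand-in chosen for illustration, and all function names are hypothetical. The database vector is reduced to bits while the query keeps its real-valued projections, so a disagreement on a bit where the query lies close to the hyperplane costs less than one where it lies far away.

```python
import random


def random_projections(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (sign-random-projection LSH)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def project(x, planes):
    """Real-valued projections of x onto each hyperplane normal."""
    return [sum(w * v for w, v in zip(plane, x)) for plane in planes]


def binarize(projected):
    """One bit per projection: 1 if non-negative, else 0."""
    return [1 if p >= 0 else 0 for p in projected]


def hamming(code_a, code_b):
    """Symmetric distance: both sides are binary codes."""
    return sum(a != b for a, b in zip(code_a, code_b))


def asymmetric_distance(query_projected, db_code):
    """Illustrative asymmetric score (an assumption, not the paper's formula).

    The query stays real-valued; each disagreeing bit is weighted by how far
    the query projection lies from the hyperplane.
    """
    return sum(
        abs(q)
        for q, b in zip(query_projected, db_code)
        if (1 if q >= 0 else 0) != b
    )
```

For a query with projections `[0.1, -2.0, 1.5]`, the database codes `[0, 0, 1]` and `[1, 1, 1]` are both at Hamming distance 1 from the query code, but the asymmetric score ranks the first one closer (0.1 versus 2.0) because the disagreeing bit there is one the query was nearly undecided about.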
|
|
|
Fadi Dornaika, & Angel Sappa. (2007). Rigid and Non-rigid Face Motion Tracking by Aligning Texture Maps and Stereo 3D Models. PRL - Pattern Recognition Letters, 28(15), 2116–2126.
|
|
|
Carme Julia, Angel Sappa, Felipe Lumbreras, Joan Serrat, & Antonio Lopez. (2008). Rank Estimation in 3D Multibody Motion Segmentation. Electronic Letters, 44(4), 279–280.
Abstract: A novel technique for rank estimation in 3D multibody motion segmentation is proposed. It is based on the study of the frequency spectra of moving rigid objects and does not use or assume any prior knowledge of the objects contained in the scene (i.e. number of objects and motion). The significance of rank estimation for multibody motion segmentation results is shown by using two motion segmentation algorithms on both synthetic and real data.
|
|
|
Joan Serrat, Ferran Diego, Felipe Lumbreras, Jose Manuel Alvarez, Antonio Lopez, & C. Elvira. (2008). Dynamic Comparison of Headlights. Journal of Automobile Engineering, 222(5), 643–656.
Keywords: video alignment
|
|
|
Francisco Javier Orozco, Xavier Roca, & Jordi Gonzalez. (2008). Real-Time Gaze Tracking with Appearance-Based Models. MVAP - Machine Vision Applications, 20(6), 353–364.
Abstract: Psychological evidence has emphasized the importance of eye gaze analysis in human-computer interaction and emotion interpretation. To this end, current image analysis algorithms take into consideration eyelid and iris motion detection using colour information and edge detectors. However, eye movement is fast and hence difficult to track precisely and robustly. Instead, our method describes eyelid and iris movements as continuous variables using appearance-based tracking. This approach combines the strengths of adaptive appearance models, optimization methods and backtracking techniques. Thus, in the proposed method textures are learned on-line from near-frontal images, and illumination changes, occlusions and fast movements are managed. The method achieves real-time performance by combining two appearance-based trackers with a backtracking algorithm, one tracker for eyelid estimation and another for iris estimation. These contributions represent a significant advance towards a reliable gaze motion description for HCI and expression analysis, where the strengths of complementary methodologies are combined to avoid using high-quality images, colour information, texture training, camera settings and other time-consuming processes.
Keywords: Eyelid and iris tracking, Appearance models, Blinking, Iris saccade, Real-time gaze tracking
|
|
|
Angel Sappa, Fadi Dornaika, Daniel Ponsa, David Geronimo, & Antonio Lopez. (2008). An Efficient Approach to Onboard Stereo Vision System Pose Estimation. TITS - IEEE Transactions on Intelligent Transportation Systems, 9(3), 476–490.
Abstract: This paper presents an efficient technique for estimating the pose of an onboard stereo vision system relative to the environment’s dominant surface area, which is supposed to be the road surface. Unlike previous approaches, it can be used either for urban or highway scenarios since it is not based on a specific visual traffic feature extraction but on 3-D raw data points. The whole process is performed in the Euclidean space and consists of two stages. Initially, a compact 2-D representation of the original 3-D data points is computed. Then, a RANdom SAmple Consensus (RANSAC) based least-squares approach is used to fit a plane to the road. Fast RANSAC fitting is obtained by selecting points according to a probability function that takes into account the density of points at a given depth. Finally, stereo camera height and pitch angle are computed relative to the fitted road plane. The proposed technique is intended to be used in driver-assistance systems for applications such as vehicle or pedestrian detection. Experimental results on urban environments, which are the most challenging scenarios (i.e., flat/uphill/downhill driving, speed bumps, and car’s accelerations), are presented. These results are validated with manually annotated ground truth. Additionally, comparisons with previous works are presented to show the improvements in the central processing unit processing time, as well as in the accuracy of the obtained results.
Keywords: Camera extrinsic parameter estimation, ground plane estimation, onboard stereo vision system
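The plane-fitting stage described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses plain RANSAC with uniform sampling (the paper samples by depth-dependent point density and refines with least squares, both omitted here), it skips the compact 2-D representation, and it assumes a camera frame whose origin is the camera centre, with y pointing up and z pointing forward. The function names and the sign convention for pitch are hypothetical.

```python
import math
import random


def plane_from_points(p1, p2, p3):
    """Plane n.p + d = 0 through three points, unit normal with n_y >= 0.

    Returns None when the points are (nearly) collinear.
    """
    u = [p2[i] - p1[i] for i in range(3)]
    v = [p3[i] - p1[i] for i in range(3)]
    n = [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]
    norm = math.sqrt(sum(c * c for c in n))
    if norm < 1e-12:
        return None
    n = [c / norm for c in n]
    d = -sum(n[i] * p1[i] for i in range(3))
    if n[1] < 0:  # orient the normal upwards (assumed y-up camera frame)
        n = [-c for c in n]
        d = -d
    return n, d


def ransac_plane(points, n_iters=200, threshold=0.05, seed=0):
    """Plain RANSAC: keep the 3-point plane with the most inliers."""
    rng = random.Random(seed)
    best_model, best_inliers = None, -1
    for _ in range(n_iters):
        model = plane_from_points(*rng.sample(points, 3))
        if model is None:
            continue
        n, d = model
        inliers = sum(
            abs(n[0] * x + n[1] * y + n[2] * z + d) < threshold
            for x, y, z in points
        )
        if inliers > best_inliers:
            best_model, best_inliers = model, inliers
    return best_model


def height_and_pitch(model):
    """Camera height (distance from origin to plane) and pitch in radians.

    The pitch sign convention, atan2(n_z, n_y), is an assumption.
    """
    n, d = model
    return abs(d), math.atan2(n[2], n[1])
```

With a flat road 1.5 units below a level camera, the fitted plane is y = -1.5, so the sketch returns a height of 1.5 and a pitch of 0.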
|
|
|
Jaume Garcia, Debora Gil, Sandra Pujades, & Francesc Carreras. (2008). Valoracion de la Funcion del Ventriculo Izquierdo mediante Modelos Regionales Hiperparametricos. Revista Española de Cardiologia, 61(3), 79.
Abstract: Most cardiovascular diseases affect the contractile properties of the helical ventricular band. This is reflected in a deviation from the normal behaviour of ventricular function. Local parameters such as strains, that is, the deformation undergone by the tissue, are indicators capable of detecting functional anomalies in specific territories. These parameters are often considered separately. In this work we present a computational framework (the Normalized Parametric Domain, NPD) that integrates them into functional hyperparameters and allows their normality ranges to be studied. These ranges make it possible to assess objectively the regional function of any new patient. To this end, we consider tagged magnetic resonance sequences at basal, mid and apical levels. The hyperparameters are obtained from the intramural motion of the left ventricle (LV), estimated with the Harmonic Phase Flow method. The NPD is defined from a parametrization of the LV in its radial and circumferential coordinates based on anatomical criteria. Mapping the hyperparameters to the NPD makes comparison between different patients possible. The normality ranges are defined by statistical analysis of values from healthy volunteers in 45 regions of the NPD over 9 systolic phases. A set of 19 healthy volunteers (14 men; age: 30.7±7.5) was used to create the normality patterns, which were validated using 2 healthy controls and 3 patients with reduced global contractility. For the controls, the regional results fell within the normal range, whereas for the patients abnormal values were obtained in the described areas, thus localizing and quantifying the empirical diagnosis.
|
|
|
T. Widemann, & Xavier Otazu. (2009). Titania's radius and an upper limit on its atmosphere from the September 8, 2001 stellar occultation. International Journal of Solar System Studies, 199(2), 458–476.
Abstract: On September 8, 2001 around 2 h UT, the largest Uranian moon, Titania, occulted Hipparcos star 106829 (alias SAO 164538, a V=7.2, K0 III star). This was the first-ever observed occultation by this satellite, a rare event as Titania subtends only 0.11 arcsec on the sky. The star's unusual brightness allowed many observers, both amateurs and professionals, to monitor this unique event, providing fifty-seven occultation chords over three continents, all reported here. Selecting the best 27 occultation chords, and assuming a circular limb, we derive Titania's radius: […] (1-σ error bar). This implies a density of […] using the value […] derived by Taylor [Taylor, D.B., 1998. Astron. Astrophys. 330, 362–374]. We do not detect any significant difference between equatorial and polar radii, in the limit […], in agreement with Voyager limb image retrieval during the 1986 flyby. Titania's offset with respect to the DE405 + URA027 (based on GUST86 theory) ephemeris is derived: Δα_T cos(δ_T) = −108±13 mas and Δδ_T = −62±7 mas (ICRF J2000.0 system). Most of this offset is attributable to Uranus' barycentric offset with respect to DE405, which we estimate to be: […] and Δδ_U = −85±25 mas at the moment of occultation. This offset is confirmed by another Titania stellar occultation observed on August 1st, 2003, which provides an offset of Δα_T cos(δ_T) = −127±20 mas and Δδ_T = −97±13 mas for the satellite. The combined ingress and egress data do not show any significant hint of atmospheric refraction, allowing us to set surface pressure limits at the level of 10–20 nbar. More specifically, we find an upper limit of 13 nbar (1-σ level) at 70 K and 17 nbar at 80 K, for a putative isothermal CO2 atmosphere. We also provide an upper limit of 8 nbar for a possible CH4 atmosphere, and 22 nbar for pure N2, again at the 1-σ level.
We finally constrain the stellar size using the time-resolved star disappearance and reappearance at ingress and egress. We find an angular diameter of 0.54±0.03 mas (corresponding to […] projected at Titania). With a distance of 170±25 parsecs, this corresponds to a radius of 9.8±0.2 solar radii for HIP 106829, typical of a K0 III giant.
Keywords: Occultations; Uranus, satellites; Satellites, shapes; Satellites, dynamics; Ices; Satellites, atmospheres
|
|
|
Jose Antonio Rodriguez, & Florent Perronnin. (2009). Handwritten word-spotting using hidden Markov models and universal vocabularies. PR - Pattern Recognition, 42(9), 2103–2116.
Abstract: Handwritten word-spotting is traditionally viewed as an image matching task between one or multiple query word-images and a set of candidate word-images in a database. This is a typical instance of the query-by-example paradigm. In this article, we introduce a statistical framework for the word-spotting problem which employs hidden Markov models (HMMs) to model keywords and a Gaussian mixture model (GMM) for score normalization. We explore the use of two types of HMMs for the word modeling part: continuous HMMs (C-HMMs) and semi-continuous HMMs (SC-HMMs), i.e. HMMs with a shared set of Gaussians. We show on a challenging multi-writer corpus that the proposed statistical framework is always superior to a traditional matching system which uses dynamic time warping (DTW) for word-image distance computation. A very important finding is that the SC-HMM is superior when labeled training data is scarce—as low as one sample per keyword—thanks to the prior information which can be incorporated in the shared set of Gaussians.
Keywords: Word-spotting; Hidden Markov model; Score normalization; Universal vocabulary; Handwriting recognition
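The baseline that this abstract compares against, dynamic time warping (DTW) between two word images, can be sketched as below. This is the textbook DTW recurrence, not the specific matching system evaluated in the paper. As a simplifying assumption, each word image is represented here as a sequence of scalar features (for example one value per image column); real systems use feature vectors, which can be handled by passing a different `dist` function. The function name is hypothetical.

```python
def dtw_distance(seq_a, seq_b, dist=lambda a, b: abs(a - b)):
    """Classic DTW cost between two sequences of per-column features.

    cost[i][j] holds the cheapest alignment of the first i items of seq_a
    with the first j items of seq_b.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # advance in seq_a only
                cost[i][j - 1],      # advance in seq_b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]
```

Because DTW allows one element to align with several, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is 0: the stretched sequence matches the original exactly, which is the property that makes DTW tolerant of handwriting written at different widths.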
|
|
|
Jaume Garcia, Debora Gil, Sandra Pujades, & Francesc Carreras. (2008). A Variational Framework for Assessment of the Left Ventricle Motion. International Journal Mathematical Modelling of Natural Phenomena, 3(6), 76–100.
Abstract: Impairment of left ventricular contractility due to cardiovascular diseases is reflected in left ventricle (LV) motion patterns. An abnormal change of torsion or long axis shortening LV values can help with the diagnosis and follow-up of LV dysfunction. Tagged Magnetic Resonance (TMR) is a widespread medical imaging modality that allows estimation of the myocardial tissue local deformation. In this work, we introduce a novel variational framework for extracting the left ventricle dynamics from TMR sequences. A bi-dimensional representation space of TMR images given by Gabor filter banks is defined. Tracking of the phases of the Gabor response is combined using a variational framework which regularizes the deformation field just at areas where the Gabor amplitude drops, while restoring the underlying motion otherwise. The clinical applicability of the proposed method is illustrated by extracting normality models of the ventricular torsion from 19 healthy subjects.
Keywords: Left Ventricle Dynamics, Ventricular Torsion, Tagged Magnetic Resonance, Motion Tracking, Variational Framework, Gabor Transform
|
|
|
C. Butakoff, Simone Balocco, F.M. Sukno, C. Hoogendoorn, C. Tobon-Gomez, G. Avegliano, et al. (2016). Left-ventricular Epi- and Endocardium Extraction from 3D Ultrasound Images Using an Automatically Constructed 3D ASM. CMBBE - Computer Methods in Biomechanics and Biomedical Engineering: Imaging and Visualization, 4(5), 265–280.
Abstract: In this paper, we propose an automatic method for constructing an active shape model (ASM) to segment the complete cardiac left ventricle in 3D ultrasound (3DUS) images, which avoids costly manual landmarking. The automatic construction of the ASM has already been addressed in the literature; however, the direct application of these methods to 3DUS is hampered by a high level of noise and artefacts. Therefore, we propose to construct the ASM by fusing multidetector computed tomography data, to learn the shape, with artificially generated 3DUS, in order to learn the neighbourhood of the boundaries. Our artificial images were generated by two approaches: a faster one that does not take into account the geometry of the transducer, and a more comprehensive one, implemented in the Field II toolbox. The segmentation accuracy of our ASM was evaluated on 20 patients with left-ventricular asynchrony, demonstrating the plausibility of the approach.
Keywords: ASM; cardiac segmentation; statistical model; shape model; 3D ultrasound
|
|
|
Jiaolong Xu, Sebastian Ramos, David Vazquez, & Antonio Lopez. (2014). Domain Adaptation of Deformable Part-Based Models. TPAMI - IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(12), 2367–2380.
Abstract: The accuracy of object classifiers can significantly drop when the training data (source domain) and the application scenario (target domain) have inherent differences. Therefore, adapting the classifiers to the scenario in which they must operate is of paramount importance. We present novel domain adaptation (DA) methods for object detection. As proof of concept, we focus on adapting the state-of-the-art deformable part-based model (DPM) for pedestrian detection. We introduce an adaptive structural SVM (A-SSVM) that adapts a pre-learned classifier between different domains. By taking into account the inherent structure in feature space (e.g., the parts in a DPM), we propose a structure-aware A-SSVM (SA-SSVM). Neither A-SSVM nor SA-SSVM needs to revisit the source-domain training data to perform the adaptation. Rather, a low number of target-domain training examples (e.g., pedestrians) are used. To address the scenario where there are no target-domain annotated samples, we propose a self-adaptive DPM based on a self-paced learning (SPL) strategy and a Gaussian Process Regression (GPR). Two types of adaptation tasks are assessed: from both synthetic pedestrians and general persons (PASCAL VOC) to pedestrians imaged from an on-board camera. Results show that our proposals avoid accuracy drops as high as 15 points when comparing adapted and non-adapted detectors.
Keywords: Domain Adaptation; Pedestrian Detection
|
|
|
Santiago Segui, Michal Drozdzal, Ekaterina Zaytseva, Fernando Azpiroz, Petia Radeva, & Jordi Vitria. (2014). Detection of wrinkle frames in endoluminal videos using betweenness centrality measures for images. TITB - IEEE Transactions on Information Technology in Biomedicine, 18(6), 1831–1838.
Abstract: Intestinal contractions are one of the most important events to diagnose motility pathologies of the small intestine. When visualized by wireless capsule endoscopy (WCE), the sequence of frames that represents a contraction is characterized by a clear wrinkle structure in the central frames that corresponds to the folding of the intestinal wall. In this paper we present a new method to robustly detect wrinkle frames in full WCE videos by using a new mid-level image descriptor that is based on a centrality measure proposed for graphs. We present an extended validation, carried out in a very large database, that shows that the proposed method achieves state of the art performance for this task.
Keywords: Wireless Capsule Endoscopy; Small Bowel Motility Dysfunction; Contraction Detection; Structured Prediction; Betweenness Centrality
|
|
|
Miquel Ferrer, Ernest Valveny, & F. Serratosa. (2009). Median graph: A new exact algorithm using a distance based on the maximum common subgraph. PRL - Pattern Recognition Letters, 30(5), 579–588.
Abstract: Median graphs have been presented as a useful tool for capturing the essential information of a set of graphs. Nevertheless, computation of optimal solutions is a very hard problem. In this work we present a new and more efficient optimal algorithm for the median graph computation. With the use of a particular cost function that permits the definition of the graph edit distance in terms of the maximum common subgraph, and a prediction function in the backtracking algorithm, we reduce the size of the search space, avoiding the evaluation of a large number of states while still obtaining the exact median. We present a set of experiments comparing our new algorithm against the previously existing exact algorithm using synthetic data. In addition, we present the first application of the exact median graph computation to real data and we compare the results against an approximate algorithm based on genetic search. These experimental results show that our algorithm outperforms the previously existing exact algorithm and, in addition, show the potential applicability of the exact solutions to real problems.
|
|