Fadi Dornaika, Alireza Bosaghzadeh, & Bogdan Raducanu. (2012). LSDA Solution Schemes for Modelless 3D Head Pose Estimation. In IEEE Workshop on the Applications of Computer Vision (pp. 393–398).
|
Bogdan Raducanu, & Fadi Dornaika. (2012). Appearance-based Face Recognition Using A Supervised Manifold Learning Framework. In IEEE Workshop on the Applications of Computer Vision (pp. 465–470). IEEE Xplore.
Abstract: Many natural image sets, depicting objects whose appearance changes due to motion, pose or light variations, can be considered samples of a low-dimensional nonlinear manifold embedded in the high-dimensional observation space (the space of all possible images). The main contribution of our work is a Supervised Laplacian Eigenmaps (S-LE) algorithm, which exploits the class label information for mapping the original data into the embedded space. Our proposed approach benefits from two important properties: i) it is discriminative, and ii) it adaptively selects the neighbors of a sample without using any predefined neighborhood size. Experiments were conducted on four face databases and the results demonstrate that the proposed algorithm significantly outperforms many linear and non-linear embedding techniques. Although we have focused on the face recognition problem, the proposed approach could also be extended to other categories of objects characterized by large variance in their appearance.
|
Antonio Hernandez, Carlos Primo, & Sergio Escalera. (2011). Automatic user interaction correction via Multi-label Graph cuts. In ICCV 2011 1st IEEE International Workshop on Human Interaction in Computer Vision HICV (pp. 1276–1281).
Abstract: Most applications in image segmentation require user interaction in order to achieve accurate results. However, users want to achieve the desired segmentation accuracy while reducing the effort of manual labelling. In this work, we extend the standard multi-label α-expansion Graph Cut algorithm so that it analyzes the interaction of the user in order to modify the object model and improve the final segmentation of objects. The approach is inspired by the fact that fast user interactions may introduce some pixel errors that confuse object and background. Our results with different degrees of user interaction and input errors show high performance of the proposed approach on a multi-label human limb segmentation problem compared with the classical α-expansion algorithm.
|
Miguel Reyes, Gabriel Dominguez, & Sergio Escalera. (2011). Feature Weighting in Dynamic Time Warping for Gesture Recognition in Depth Data. In 1st IEEE Workshop on Consumer Depth Cameras for Computer Vision (pp. 1182–1188).
Abstract: We present a gesture recognition approach for depth video data based on a novel Feature Weighting approach within the Dynamic Time Warping framework. Depth features from human joints are compared through video sequences using Dynamic Time Warping, and weights are assigned to features based on inter-intra class gesture variability. Feature Weighting in Dynamic Time Warping is then applied for recognizing the beginning and end of gestures in data sequences. The results obtained when recognizing several gestures in depth data show high performance compared with the classical Dynamic Time Warping approach.
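The idea of weighting features inside Dynamic Time Warping can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `weighted_dtw` is hypothetical, the frame distance is assumed to be a weighted Euclidean distance, and the weights are taken as given rather than derived from inter-intra class variability as in the paper.

```python
import math


def weighted_dtw(seq_a, seq_b, weights):
    """Return the DTW alignment cost between two sequences of feature vectors.

    Each frame is a tuple of features; weights[k] scales the squared
    difference of feature k (assumed weighted Euclidean frame distance).
    """
    def frame_dist(x, y):
        return math.sqrt(sum(w * (xi - yi) ** 2 for w, xi, yi in zip(weights, x, y)))

    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Setting a weight to zero removes that feature from the comparison, so an unreliable feature no longer contributes to the alignment cost.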
|
Santiago Segui, Michal Drozdzal, Petia Radeva, & Jordi Vitria. (2012). An Integrated Approach to Contextual Face Detection. In 1st International Conference on Pattern Recognition Applications and Methods (pp. 143–150). Springer.
Abstract: Face detection is, in general, based on content-based detectors. Nevertheless, the face is a non-rigid object with well-defined relations with respect to the human body parts. In this paper, we propose to take advantage of the context information in order to improve content-based face detections. We propose a novel framework for integrating multiple content- and context-based detectors in a discriminative way. Moreover, we develop an integrated scoring procedure that measures the ’faceness’ of each hypothesis and is used to discriminate the detection results. Our approach detects a higher rate of faces while minimizing the number of false detections, giving an average increase of more than 10% in average precision when compared to state-of-the-art face detectors.
|
Patricia Marquez, Debora Gil, & Aura Hernandez-Sabate. (2012). Error Analysis for Lucas-Kanade Based Schemes. In 9th International Conference on Image Analysis and Recognition (Vol. 7324, pp. 184–191). LNCS. Springer-Verlag Berlin Heidelberg.
Abstract: Optical flow is a valuable tool for motion analysis in medical imaging sequences. A reliable application requires determining the accuracy of the computed optical flow. This is a main challenge given the absence of ground truth in medical sequences. This paper presents an error analysis of Lucas-Kanade schemes in terms of intrinsic design errors and numerical stability of the algorithm. Our analysis provides a confidence measure that is naturally correlated to the accuracy of the flow field. Our experiments show the higher predictive value of our confidence measure compared to existing measures.
Keywords: Optical flow, Confidence measure, Lucas-Kanade, Cardiac Magnetic Resonance
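For context, the standard Lucas-Kanade step solves a 2×2 least-squares system per window, and its numerical stability depends on the eigenvalues of that system's matrix. The sketch below shows the textbook solve together with a generic conditioning check. It is not the confidence measure proposed in the paper, and the function name `lucas_kanade_window` is hypothetical.

```python
import math


def lucas_kanade_window(ix, iy, it):
    """Solve the Lucas-Kanade normal equations for one window.

    ix, iy, it: spatial and temporal image derivatives at the window pixels.
    Returns (u, v, lam_min, lam_max); u and v are None if the system is singular.
    """
    # Entries of the structure tensor [[a, b], [b, c]] and right-hand side (p, q).
    a = sum(x * x for x in ix)
    b = sum(x * y for x, y in zip(ix, iy))
    c = sum(y * y for y in iy)
    p = -sum(x * t for x, t in zip(ix, it))
    q = -sum(y * t for y, t in zip(iy, it))

    # Closed-form eigenvalues of the symmetric 2x2 structure tensor.
    half_trace = (a + c) / 2
    root = math.sqrt(((a - c) / 2) ** 2 + b * b)
    lam_max, lam_min = half_trace + root, half_trace - root

    det = a * c - b * b
    if abs(det) < 1e-12:
        # Aperture problem: gradients in the window span only one direction.
        return None, None, lam_min, lam_max
    u = (c * p - b * q) / det
    v = (a * q - b * p) / det
    return u, v, lam_min, lam_max
```

A small `lam_min`, or a large ratio `lam_max / lam_min`, signals an ill-conditioned window in which the computed flow is sensitive to noise.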
|
Albert Andaluz, Francesc Carreras, Cristina Santa Marta, & Debora Gil. (2012). Myocardial torsion estimation with Tagged-MRI in the OsiriX platform. In Wiro Niessen (Erasmus MC) and Marc Modat (UCL) (Ed.), ISBI Workshop on Open Source Medical Image Analysis software. IEEE.
Abstract: Myocardial torsion (MT) plays a crucial role in the assessment of the functionality of the left ventricle. For this purpose, the IAM group at the CVC has developed the Harmonic Phase Flow (HPF) plugin for the OsiriX DICOM platform. We have validated its functionality on sequences acquired using different protocols and including healthy and pathological cases. Results show similar torsion trends for SPAMM acquisitions, with pathological cases introducing expected deviations from the ground truth. Finally, we provide the plugin free of charge at http://iam.cvc.uab.es
|
Patricia Marquez, Debora Gil, & Aura Hernandez-Sabate. (2013). Evaluation of the Capabilities of Confidence Measures for Assessing Optical Flow Quality. In ICCV Workshop on Computer Vision in Vehicle Technology: From Earth to Mars (pp. 624–631).
Abstract: Assessing Optical Flow (OF) quality is essential for its further use in reliable decision support systems. The absence of ground truth in such situations leads to the computation of OF Confidence Measures (CM) obtained from either input or output data. A fair comparison across the capabilities of the different CM for bounding OF error is required in order to choose the best OF-CM pair for discarding points where OF computation is not reliable. This paper presents a statistical probabilistic framework for assessing the quality of a given CM. Our quality measure is given in terms of the percentage of pixels whose OF error bound cannot be determined by CM values. We also provide statistical tools for the computation of CM values that ensure a given accuracy of the flow field.
|
Francesco Ciompi, Rui Hua, Simone Balocco, Marina Alberti, Oriol Pujol, Carles Caus, et al. (2013). Learning to Detect Stent Struts in Intravascular Ultrasound. In 6th Iberian Conference on Pattern Recognition and Image Analysis (Vol. 7887, pp. 575–583). Springer Berlin Heidelberg.
Abstract: In this paper we tackle the automatic detection of strut elements (metallic braces of a stent device) in Intravascular Ultrasound (IVUS) sequences. The proposed method is based on context-aware classification of IVUS images, where we use Multi-Class Multi-Scale Stacked Sequential Learning (M2SSL). Additionally, we introduce a novel technique to reduce the amount of required contextual features. A comparison with binary and multi-class learning is also performed, using a dataset of IVUS images with struts manually annotated by an expert. The best performing configuration reaches an F-measure of F = 63.97%.
|
Marçal Rusiñol, V. Poulain d'Andecy, Dimosthenis Karatzas, & Josep Llados. (2013). Classification of Administrative Document Images by Logo Identification. In 10th IAPR International Workshop on Graphics Recognition.
Abstract: This paper is focused on the categorization of administrative document images (such as invoices) based on the recognition of the supplier's graphical logo. Two different methods are proposed: the first one uses a bag-of-visual-words model, whereas the second one tries to locate logo images described by the blurred shape model descriptor within documents by a sliding-window technique. Preliminary results are reported with a dataset of real administrative documents.
|
Marçal Rusiñol, Dimosthenis Karatzas, & Josep Llados. (2013). Spotting Graphical Symbols in Camera-Acquired Documents in Real Time. In 10th IAPR International Workshop on Graphics Recognition.
Abstract: In this paper we present a system devoted to spotting graphical symbols in camera-acquired document images. The system is based on the extraction and further matching of ORB compact local features computed over interest key-points. Then, the FLANN indexing framework, based on approximate nearest neighbor search, allows local descriptors to be matched efficiently between the captured scene and the graphical models. Finally, the RANSAC algorithm is used in order to compute the homography between the spotted symbol and its appearance in the document image. The proposed approach is efficient and is able to work in real time.
|
Marçal Rusiñol, T. Benkhelfallah, & V. Poulain d'Andecy. (2013). Field Extraction from Administrative Documents by Incremental Structural Templates. In 12th International Conference on Document Analysis and Recognition (pp. 1100–1104).
Abstract: In this paper we present an incremental framework aimed at extracting field information from administrative document images in the context of a Digital Mail-room scenario. Given a single training sample in which the user has marked which fields have to be extracted from a particular document class, a document model representing structural relationships among words is built. This model is incrementally refined as the system processes more and more documents from the same class. A reformulation of the tf-idf statistic scheme allows us to adjust the importance weights of the structural relationships among words. We report in the experimental section our results obtained with a large dataset of real invoices.
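As background for the weighting scheme mentioned above, the sketch below computes the standard tf-idf statistic over tokenized documents. It is only the textbook formulation, not the reformulation proposed in the paper (which weights structural relationships among words); the function name `tf_idf` and the toy tokens are assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document using standard tf-idf.

    documents: list of token lists. tf is the relative term frequency within
    a document; idf is log(N / number of documents containing the term).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each term once per document

    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights
```

A term that occurs in every document gets weight zero, while a term specific to few documents gets a high weight, which is the property that makes the statistic useful for ranking importance.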
|
Albert Gordo, Marçal Rusiñol, Dimosthenis Karatzas, & Andrew Bagdanov. (2013). Document Classification and Page Stream Segmentation for Digital Mailroom Applications. In 12th International Conference on Document Analysis and Recognition (pp. 621–625).
Abstract: In this paper we present a method for the segmentation of continuous page streams into multipage documents and the simultaneous classification of the resulting documents. We first present an approach to combine the multiple pages of a document into a single feature vector that represents the whole document. Despite its simplicity and low computational cost, the proposed representation yields results comparable to more complex methods in multipage document classification tasks. We then exploit this representation in the context of page stream segmentation. The most plausible segmentation of a page stream into a sequence of multipage documents is obtained by optimizing a statistical model that represents the probability of each segmented multipage document belonging to a particular class. Experimental results are reported on a large sample of real administrative multipage documents.
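Finding the most plausible segmentation of a page stream can be sketched as a dynamic program over cut points. This is an assumed formulation, not necessarily the optimizer used in the paper: `best_segmentation` and `segment_score` are hypothetical names, and `segment_score(start, end)` stands in for the log-probability that pages `start..end-1` form one document of its best class.

```python
def best_segmentation(n_pages, segment_score):
    """Split pages 0..n_pages-1 into contiguous segments with maximal total score.

    segment_score(start, end) scores the candidate document made of pages
    start..end-1 (hypothetical callback, e.g. a class log-probability).
    Returns (total_score, [(start, end), ...]).
    """
    best = [float("-inf")] * (n_pages + 1)  # best[k]: best score for the first k pages
    back = [0] * (n_pages + 1)              # back[k]: start of the last segment
    best[0] = 0.0
    for end in range(1, n_pages + 1):
        for start in range(end):
            score = best[start] + segment_score(start, end)
            if score > best[end]:
                best[end], back[end] = score, start

    # Walk the back-pointers to recover the segment boundaries.
    segments = []
    end = n_pages
    while end > 0:
        segments.append((back[end], end))
        end = back[end]
    segments.reverse()
    return best[n_pages], segments
```

The double loop evaluates every candidate segment once, so the cost is quadratic in the number of pages; a maximum document length would bound the inner loop.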
|
L. Rothacker, Marçal Rusiñol, & G.A. Fink. (2013). Bag-of-Features HMMs for segmentation-free word spotting in handwritten documents. In 12th International Conference on Document Analysis and Recognition (pp. 1305–1309).
Abstract: Recent HMM-based approaches to handwritten word spotting require large amounts of learning samples and mostly rely on a prior segmentation of the document. We propose to use Bag-of-Features HMMs, estimated from a single sample, in a patch-based segmentation-free framework. Bag-of-Features HMMs use statistics of local image feature representatives. Therefore they can be considered a variant of discrete HMMs that can model the observation of a number of features at a point in time. The discrete nature enables us to estimate a query model with only a single example of the query provided by the user. This makes our method very flexible with respect to the availability of training data. Furthermore, we are able to outperform state-of-the-art results on the George Washington dataset.
|
Jiaolong Xu, Sebastian Ramos, Xu Hu, David Vazquez, & Antonio Lopez. (2013). Multi-task Bilinear Classifiers for Visual Domain Adaptation. In Advances in Neural Information Processing Systems Workshop.
Abstract: We propose a method that aims to lessen the significant accuracy degradation that a discriminative classifier can suffer when it is trained in a specific domain (source domain) and applied in a different one (target domain). The principal reason for this degradation is the discrepancies in the distribution of the features that feed the classifier in different domains. Therefore, we propose a domain adaptation method that maps the features from the different domains into a common subspace and learns a discriminative domain-invariant classifier within it. Our algorithm combines bilinear classifiers and multi-task learning for domain adaptation. The bilinear classifier encodes the feature transformation and classification parameters by a matrix decomposition. In this way, specific feature transformations for multiple domains and a shared classifier are jointly learned in a multi-task learning framework. Focusing on domain adaptation for visual object detection, we apply this method to the state-of-the-art deformable part-based model for cross domain pedestrian detection. Experimental results show that our method significantly avoids the domain drift and improves the accuracy when compared to several baselines.
Keywords: Domain Adaptation; Pedestrian Detection; ADAS
|