|
Hugo Jair Escalante, Jose Martinez, Sergio Escalera, Victor Ponce, & Xavier Baro. (2015). Improving Bag of Visual Words Representations with Genetic Programming. In IEEE International Joint Conference on Neural Networks IJCNN2015.
Abstract: The bag of visual words is a well established representation in diverse computer vision problems. Taking inspiration from the fields of text mining and retrieval, this representation has proved to be very effective in a large number of domains.
In most cases, a standard term-frequency weighting scheme is considered for representing images and videos in computer vision. This is somewhat surprising, as there are many alternative ways of generating bag of words representations within the text processing community. This paper explores the use of alternative weighting schemes for landmark tasks in computer vision: image
categorization and gesture recognition. We study the suitability of using well-known supervised and unsupervised weighting schemes for such tasks. More importantly, we devise a genetic program that learns new ways of representing images and videos under the bag of visual words representation. The proposed method learns to combine term-weighting primitives trying to maximize the classification performance. Experimental results are reported in standard image and video data sets showing the effectiveness of the proposed evolutionary algorithm.
|
|
|
Isabelle Guyon, Kristin Bennett, Gavin Cawley, Hugo Jair Escalante, Sergio Escalera, Tin Kam Ho, et al. (2015). Design of the 2015 ChaLearn AutoML Challenge. In IEEE International Joint Conference on Neural Networks IJCNN2015.
Abstract: ChaLearn is organizing for IJCNN 2015 an Automatic Machine Learning challenge (AutoML) to solve classification and regression problems from given feature representations, without any human intervention. This is a challenge with code
submission: the code submitted can be executed automatically on the challenge servers to train and test learning machines on new datasets. However, there is no obligation to submit code. Half of the prizes can be won by just submitting prediction results.
There are six rounds (Prep, Novice, Intermediate, Advanced, Expert, and Master) in which datasets of progressive difficulty are introduced (5 per round). There is no requirement to participate in previous rounds to enter a new round. The rounds alternate AutoML phases in which submitted code is “blind tested” on
datasets the participants have never seen before, and Tweakathon phases giving time (' 1 month) to the participants to improve their methods by tweaking their code on those datasets. This challenge will push the state-of-the-art in fully automatic machine learning on a wide range of problems taken from real world
applications. The platform will remain available beyond the termination of the challenge: http://codalab.org/AutoML
|
|
|
Carles Sanchez, Debora Gil, R. Tazi, Jorge Bernal, Y. Ruiz, L. Planas, et al. (2015). Quasi-real time digital assessment of Central Airway Obstruction. In 3rd European congress for bronchology and interventional pulmonology ECBIP2015.
|
|
|
Carles Sanchez, Jorge Bernal, F. Javier Sanchez, Marta Diez-Ferrer, Antoni Rosell, & Debora Gil. (2015). Towards On-line Quantification of Tracheal Stenosis from Videobronchoscopy. In 6th International Conference on Information Processing in Computer-Assisted Interventions IPCAI2015 (Vol. 10, pp. 935–945).
Abstract: PURPOSE:
Lack of objective measurement of tracheal obstruction degree has a negative impact on the chosen treatment prone to lead to unnecessary repeated explorations and other scanners. Accurate computation of tracheal stenosis in videobronchoscopy would constitute a breakthrough for this noninvasive technique and a reduction in operation cost for the public health service.
METHODS:
Stenosis calculation is based on the comparison of the region delimited by the lumen in an obstructed frame and the region delimited by the first visible ring in a healthy frame. We propose a parametric strategy for the extraction of lumen and tracheal ring regions based on models of their geometry and appearance that guide a deformable model. To ensure a systematic applicability, we present a statistical framework to choose optimal parametric values and a strategy to choose the frames that minimize the impact of scope optical distortion.
RESULTS:
Our method has been tested in 40 cases covering different stenosed tracheas. Experiments report a non- clinically relevant [Formula: see text] of discrepancy in the calculated stenotic area and a computational time allowing online implementation in the operating room.
CONCLUSIONS:
Our methodology allows reliable measurements of airway narrowing in the operating room. To fully assess its clinical impact, a prospective clinical trial should be done.
|
|
|
Antoni Gurgui, Debora Gil, & Enric Marti. (2015). Laplacian Unitary Domain for Texture Morphing. In Proceedings of the 10th International Conference on Computer Vision Theory and Applications VISIGRAPP2015 (Vol. 1, pp. 693–699). SciTePress.
Abstract: Deformation of expressive textures is the gateway to realistic computer synthesis of expressions. By their good mathematical properties and flexible formulation on irregular meshes, most texture mappings rely on solutions to the Laplacian in the cartesian space. In the context of facial expression morphing, this approximation can be seen from the opposite point of view by neglecting the metric. In this paper, we use the properties of the Laplacian in manifolds to present a novel approach to warping expressive facial images in order to generate a morphing between them.
Keywords: Facial; metamorphosis;LaplacianMorphing
|
|
|
Sergio Vera, Miguel Angel Gonzalez Ballester, & Debora Gil. (2015). A Novel Cochlear Reference Frame Based On The Laplace Equation. In 29th international Congress and Exhibition on Computer Assisted Radiology and Surgery (Vol. 10, pp. 1–312).
|
|
|
Hanne Kause, Patricia Marquez, Andrea Fuster, Aura Hernandez-Sabate, Luc Florack, Debora Gil, et al. (2015). Quality Assessment of Optical Flow in Tagging MRI. In 5th Dutch Bio-Medical Engineering Conference BME2015.
|
|
|
Olivier Lefebvre, Pau Riba, Charles Fournier, Alicia Fornes, Josep Llados, Rejean Plamondon, et al. (2015). Monitoring neuromotricity on-line: a cloud computing approach. In 17th Conference of the International Graphonomics Society IGS2015.
Abstract: The goal of our experiment is to develop a useful and accessible tool that can be used to evaluate a patient's health by analyzing handwritten strokes. We use a cloud computing approach to analyze stroke data sampled on a commercial tablet working on the Android platform and a distant server to perform complex calculations using the Delta and Sigma lognormal algorithms. A Google Drive account is used to store the data and to ease the development of the project. The communication between the tablet, the cloud and the server is encrypted to ensure biomedical information confidentiality. Highly parameterized biomedical tests are implemented on the tablet as well as a free drawing test to evaluate the validity of the data acquired by the first test compared to the second one. A blurred shape model descriptor pattern recognition algorithm is used to classify the data obtained by the free drawing test. The functions presented in this paper are still currently under development and other improvements are needed before launching the application in the public domain.
|
|
|
Pau Riba, Josep Llados, Alicia Fornes, & Anjan Dutta. (2015). Large-scale Graph Indexing using Binary Embeddings of Node Contexts. In C.-L.Liu, B.Luo, W.G.Kropatsch, & J.Cheng (Eds.), 10th IAPR-TC15 Workshop on Graph-based Representations in Pattern Recognition (Vol. 9069, pp. 208–217). LNCS. Springer International Publishing.
Abstract: Graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their representational power in front of classical appearance-based representations in terms of feature vectors. Retrieving a query graph from a large dataset of graphs has the drawback of the high computational complexity required to compare the query and the target graphs. The most important property for a large-scale retrieval is the search time complexity to be sub-linear in the number of database examples. In this paper we propose a fast indexation formalism for graph retrieval. A binary embedding is defined as hashing keys for graph nodes. Given a database of labeled graphs, graph nodes are complemented with vectors of attributes representing their local context. Hence, each attribute counts the length of a walk of order k originated in a vertex with label l. Each attribute vector is converted to a binary code applying a binary-valued hash function. Therefore, graph retrieval is formulated in terms of finding target graphs in the database whose nodes have a small Hamming distance from the query nodes, easily computed with bitwise logical operators. As an application example, we validate the performance of the proposed methods in a handwritten word spotting scenario in images of historical documents.
Keywords: Graph matching; Graph indexing; Application in document analysis; Word spotting; Binary embedding
|
|
|
Jorge Bernal, F. Javier Sanchez, Cristina Rodriguez de Miguel, & Gloria Fernandez Esparrach. (2015). Bulding up the future of colonoscopy: A synergy between clinicians and computer scientists. In Colonoscopy and Colorectal Cancer.
Abstract: Recent advances in endoscopic technology have generated an increasing interest in strengthening the collaboration between clinicians and computers scientist to develop intelligent systems that can provide additional information to clinicians in the different stages of an intervention. The objective of this chapter is to identify clinical drawbacks of colonoscopy in order to define potential areas of collaboration. Once areas are defined, we present the challenges that colonoscopy images present in order computational methods to provide with meaningful output, including those related to image formation and acquisition, as they are proven to have an impact in the performance of an intelligent system. Finally, we also propose how to define validation frameworks in order to assess the performance of a given method, making an special emphasis on how databases should be created and annotated and which metrics should be used to evaluate systems correctly.
Keywords: Intelligent systems; Image properties; Validation; Clinical drawbacks; Endoluminal scene description
|
|
|
Youssef El Rhabi, Simon Loic, & Brun Luc. (2015). Estimation de la pose d’une caméra à partir d’un flux vidéo en s’approchant du temps réel. In 15ème édition d'ORASIS, journées francophones des jeunes chercheurs en vision par ordinateur ORASIS2015.
Abstract: Finding a way to estimate quickly and robustly the pose of an image is essential in augmented reality. Here we will discuss the approach we chose in order to get closer to real time by using SIFT points [4]. We propose a method based on filtering both SIFT points and images on which to focus on. Hence we will focus on relevant data.
Keywords: Augmented Reality; SFM; SLAM; real time pose computation; 2D/3D registration
|
|
|
Bogdan Raducanu, Alireza Bosaghzadeh, & Fadi Dornaika. (2015). Multi-observation Face Recognition in Videos based on Label Propagation. In 6th Workshop on Analysis and Modeling of Faces and Gestures AMFG2015 (pp. 10–17).
Abstract: In order to deal with the huge amount of content generated by social media, especially for indexing and retrieval purposes, the focus shifted from single object recognition to multi-observation object recognition. Of particular interest is the problem of face recognition (used as primary cue for persons’ identity assessment), since it is highly required by popular social media search engines like Facebook and Youtube. Recently, several approaches for graph-based label propagation were proposed. However, the associated graphs were constructed in an ad-hoc manner (e.g., using the KNN graph) that cannot cope properly with the rapid and frequent changes in data appearance, a phenomenon intrinsically related with video sequences. In this paper, we
propose a novel approach for efficient and adaptive graph construction, based on a two-phase scheme: (i) the first phase is used to adaptively find the neighbors of a sample and also to find the adequate weights for the minimization function of the second phase; (ii) in the second phase, the
selected neighbors along with their corresponding weights are used to locally and collaboratively estimate the sparse affinity matrix weights. Experimental results performed on Honda Video Database (HVDB) and a subset of video
sequences extracted from the popular TV-series ’Friends’ show a distinct advantage of the proposed method over the existing standard graph construction methods.
|
|
|
M. Cruz, Cristhian A. Aguilera-Carrasco, Boris X. Vintimilla, Ricardo Toledo, & Angel Sappa. (2015). Cross-spectral image registration and fusion: an evaluation study. In 2nd International Conference on Machine Vision and Machine Learning.
Abstract: This paper presents a preliminary study on the registration and fusion of cross-spectral imaging. The objective is to evaluate the validity of widely used computer vision approaches when they are applied at different
spectral bands. In particular, we are interested in merging images from the infrared (both long wave infrared: LWIR and near infrared: NIR) and visible spectrum (VS). Experimental results with different data sets are presented.
Keywords: multispectral imaging; image registration; data fusion; infrared and visible spectra
|
|
|
Cristhian A. Aguilera-Carrasco, Angel Sappa, & Ricardo Toledo. (2015). LGHD: a Feature Descriptor for Matching Across Non-Linear Intensity Variations. In 22th IEEE International Conference on Image Processing (pp. 178–181).
|
|
|
Jiaolong Xu. (2015). Domain Adaptation of Deformable Part-based Models (Antonio Lopez, Ed.). Ph.D. thesis, , .
Abstract: On-board pedestrian detection is crucial for Advanced Driver Assistance Systems
(ADAS). An accurate classication is fundamental for vision-based pedestrian detection.
The underlying assumption for learning classiers is that the training set and the deployment environment (testing) follow the same probability distribution regarding the features used by the classiers. However, in practice, there are dierent reasons that can break this constancy assumption. Accordingly, reusing existing classiers by adapting them from the previous training environment (source domain) to the new testing one (target domain) is an approach with increasing acceptance in the computer vision community. In this thesis we focus on the domain adaptation of deformable part-based models (DPMs) for pedestrian detection. As a prof of concept, we use a computer graphic based synthetic dataset, i.e. a virtual world, as the source domain, and adapt the virtual-world trained DPM detector to various real-world dataset.
We start by exploiting the maximum detection accuracy of the virtual-world
trained DPM. Even though, when operating in various real-world datasets, the virtualworld trained detector still suer from accuracy degradation due to the domain gap of virtual and real worlds. We then focus on domain adaptation of DPM. At the rst step, we consider single source and single target domain adaptation and propose two batch learning methods, namely A-SSVM and SA-SSVM. Later, we further consider leveraging multiple target (sub-)domains for progressive domain adaptation and propose a hierarchical adaptive structured SVM (HA-SSVM) for optimization. Finally, we extend HA-SSVM for the challenging online domain adaptation problem, aiming at making the detector to automatically adapt to the target domain online, without any human intervention. All of the proposed methods in this thesis do not require
revisiting source domain data. The evaluations are done on the Caltech pedestrian detection benchmark. Results show that SA-SSVM slightly outperforms A-SSVM and avoids accuracy drops as high as 15 points when comparing with a non-adapted detector. The hierarchical model learned by HA-SSVM further boosts the domain adaptation performance. Finally, the online domain adaptation method has demonstrated that it can achieve comparable accuracy to the batch learned models while not requiring manually label target domain examples. Domain adaptation for pedestrian detection is of paramount importance and a relatively unexplored area. We humbly hope the work in this thesis could provide foundations for future work in this area.
|
|