|
Xavier Otazu, Olivier Penacchio, & Xim Cerda-Company. (2015). An excitatory-inhibitory firing rate model accounts for brightness induction, colour induction and visual discomfort. In Barcelona Computational, Cognitive and Systems Neuroscience.
|
|
|
Olivier Penacchio, Xavier Otazu, A. Wilkins, & J. Harris. (2015). Uncomfortable images prevent lateral interactions in the cortex from providing a sparse code. In European Conference on Visual Perception ECVP2015.
|
|
|
Xavier Otazu, Olivier Penacchio, & Xim Cerda-Company. (2015). Brightness and colour induction through contextual influences in V1. In Scottish Vision Group 2015 SGV2015 (Vol. 12, pp. 1208–2012).
|
|
|
Jiaolong Xu. (2015). Domain Adaptation of Deformable Part-based Models (Antonio Lopez, Ed.). Ph.D. thesis.
Abstract: On-board pedestrian detection is crucial for Advanced Driver Assistance Systems (ADAS). An accurate classification is fundamental for vision-based pedestrian detection. The underlying assumption for learning classifiers is that the training set and the deployment environment (testing) follow the same probability distribution regarding the features used by the classifiers. However, in practice, there are different reasons that can break this constancy assumption. Accordingly, reusing existing classifiers by adapting them from the previous training environment (source domain) to the new testing one (target domain) is an approach with increasing acceptance in the computer vision community. In this thesis we focus on the domain adaptation of deformable part-based models (DPMs) for pedestrian detection. As a proof of concept, we use a computer-graphics-based synthetic dataset, i.e. a virtual world, as the source domain, and adapt the virtual-world trained DPM detector to various real-world datasets.
We start by exploiting the maximum detection accuracy of the virtual-world trained DPM. Even so, when operating on various real-world datasets, the virtual-world trained detector still suffers from accuracy degradation due to the domain gap between the virtual and real worlds. We then focus on domain adaptation of DPM. As a first step, we consider single source and single target domain adaptation and propose two batch learning methods, namely A-SSVM and SA-SSVM. Later, we further consider leveraging multiple target (sub-)domains for progressive domain adaptation and propose a hierarchical adaptive structured SVM (HA-SSVM) for optimization. Finally, we extend HA-SSVM to the challenging online domain adaptation problem, aiming to make the detector adapt automatically to the target domain online, without any human intervention. None of the methods proposed in this thesis requires revisiting source domain data. The evaluations are done on the Caltech pedestrian detection benchmark. Results show that SA-SSVM slightly outperforms A-SSVM and avoids accuracy drops as high as 15 points when compared with a non-adapted detector. The hierarchical model learned by HA-SSVM further boosts the domain adaptation performance. Finally, the online domain adaptation method achieves accuracy comparable to the batch-learned models while not requiring manually labelled target domain examples. Domain adaptation for pedestrian detection is of paramount importance and a relatively unexplored area. We humbly hope the work in this thesis could provide foundations for future work in this area.
|
|
|
Cristhian A. Aguilera-Carrasco, Angel Sappa, & Ricardo Toledo. (2015). LGHD: a Feature Descriptor for Matching Across Non-Linear Intensity Variations. In 22nd IEEE International Conference on Image Processing (pp. 178–181).
|
|
|
M. Cruz, Cristhian A. Aguilera-Carrasco, Boris X. Vintimilla, Ricardo Toledo, & Angel Sappa. (2015). Cross-spectral image registration and fusion: an evaluation study. In 2nd International Conference on Machine Vision and Machine Learning.
Abstract: This paper presents a preliminary study on the registration and fusion of cross-spectral imaging. The objective is to evaluate the validity of widely used computer vision approaches when they are applied at different spectral bands. In particular, we are interested in merging images from the infrared (both long wave infrared: LWIR and near infrared: NIR) and visible spectrum (VS). Experimental results with different data sets are presented.
Keywords: multispectral imaging; image registration; data fusion; infrared and visible spectra
|
|
|
Marco Pedersoli, Andrea Vedaldi, Jordi Gonzalez, & Xavier Roca. (2015). A coarse-to-fine approach for fast deformable object detection. PR - Pattern Recognition, 48(5), 1844–1853.
Abstract: We present a method that can dramatically accelerate object detection with part based models. The method is based on the observation that the cost of detection is likely to be dominated by the cost of matching each part to the image, and not by the cost of computing the optimal configuration of the parts as commonly assumed. Therefore accelerating detection requires minimizing the number of part-to-image comparisons. To this end we propose a multiple-resolutions hierarchical part based model and a corresponding coarse-to-fine inference procedure that recursively eliminates from the search space unpromising part placements. The method yields a ten-fold speedup over the standard dynamic programming approach and is complementary to the cascade-of-parts approach of [9]. Compared to the latter, our method does not have parameters to be determined empirically, which simplifies its use during the training of the model. Most importantly, the two techniques can be combined to obtain a very significant speedup, of two orders of magnitude in some cases. We evaluate our method extensively on the PASCAL VOC and INRIA datasets, demonstrating a very high increase in the detection speed with little degradation of the accuracy.
|
|
|
Bogdan Raducanu, Alireza Bosaghzadeh, & Fadi Dornaika. (2015). Multi-observation Face Recognition in Videos based on Label Propagation. In 6th Workshop on Analysis and Modeling of Faces and Gestures AMFG2015 (pp. 10–17).
Abstract: In order to deal with the huge amount of content generated by social media, especially for indexing and retrieval purposes, the focus shifted from single object recognition to multi-observation object recognition. Of particular interest is the problem of face recognition (used as primary cue for persons’ identity assessment), since it is highly required by popular social media search engines like Facebook and Youtube. Recently, several approaches for graph-based label propagation were proposed. However, the associated graphs were constructed in an ad-hoc manner (e.g., using the KNN graph) that cannot cope properly with the rapid and frequent changes in data appearance, a phenomenon intrinsically related with video sequences. In this paper, we propose a novel approach for efficient and adaptive graph construction, based on a two-phase scheme: (i) the first phase is used to adaptively find the neighbors of a sample and also to find the adequate weights for the minimization function of the second phase; (ii) in the second phase, the selected neighbors along with their corresponding weights are used to locally and collaboratively estimate the sparse affinity matrix weights. Experimental results performed on Honda Video Database (HVDB) and a subset of video sequences extracted from the popular TV-series ’Friends’ show a distinct advantage of the proposed method over the existing standard graph construction methods.
|
|
|
Youssef El Rhabi, Simon Loic, & Brun Luc. (2015). Estimation de la pose d’une caméra à partir d’un flux vidéo en s’approchant du temps réel. In 15ème édition d'ORASIS, journées francophones des jeunes chercheurs en vision par ordinateur ORASIS2015.
Abstract: Estimating the pose of an image quickly and robustly is essential in augmented reality. Here we discuss the approach we chose in order to get closer to real time by using SIFT points [4]. We propose a method based on filtering both the SIFT points and the images on which to focus, so that only relevant data are processed.
Keywords: Augmented Reality; SFM; SLAM; real time pose computation; 2D/3D registration
|
|
|
Alejandro Gonzalez Alzate, Gabriel Villalonga, Jiaolong Xu, David Vazquez, Jaume Amores, & Antonio Lopez. (2015). Multiview Random Forest of Local Experts Combining RGB and LIDAR data for Pedestrian Detection. In IEEE Intelligent Vehicles Symposium IV2015 (pp. 356–361).
Abstract: Despite recent significant advances, pedestrian detection continues to be an extremely challenging problem in real scenarios. In order to develop a detector that successfully operates under these conditions, it becomes critical to leverage multiple cues, multiple imaging modalities and a strong multi-view classifier that accounts for different pedestrian views and poses. In this paper we provide an extensive evaluation that gives insight into how each of these aspects (multi-cue, multimodality and strong multi-view classifier) affects performance both individually and when integrated together. In the multimodality component we explore the fusion of RGB and depth maps obtained by high-definition LIDAR, a type of modality that is only recently starting to receive attention. As our analysis reveals, although all the aforementioned aspects significantly help in improving the performance, the fusion of visible spectrum and depth information boosts the accuracy by a much larger margin. The resulting detector not only ranks among the top performers in the challenging KITTI benchmark, but it is built upon very simple blocks that are easy to implement and computationally efficient. These simple blocks can be easily replaced with more sophisticated ones recently proposed, such as the use of convolutional neural networks for feature representation, to further improve the accuracy.
Keywords: Pedestrian Detection
|
|
|
Jorge Bernal, F. Javier Sanchez, Cristina Rodriguez de Miguel, & Gloria Fernandez Esparrach. (2015). Building up the future of colonoscopy: A synergy between clinicians and computer scientists. In Colonoscopy and Colorectal Cancer.
Abstract: Recent advances in endoscopic technology have generated an increasing interest in strengthening the collaboration between clinicians and computer scientists to develop intelligent systems that can provide additional information to clinicians in the different stages of an intervention. The objective of this chapter is to identify clinical drawbacks of colonoscopy in order to define potential areas of collaboration. Once areas are defined, we present the challenges that colonoscopy images pose for computational methods to provide meaningful output, including those related to image formation and acquisition, as they are proven to have an impact on the performance of an intelligent system. Finally, we also propose how to define validation frameworks in order to assess the performance of a given method, with special emphasis on how databases should be created and annotated and which metrics should be used to evaluate systems correctly.
Keywords: Intelligent systems; Image properties; Validation; Clinical drawbacks; Endoluminal scene description
|
|
|
Pau Riba, Josep Llados, Alicia Fornes, & Anjan Dutta. (2015). Large-scale Graph Indexing using Binary Embeddings of Node Contexts. In C.-L.Liu, B.Luo, W.G.Kropatsch, & J.Cheng (Eds.), 10th IAPR-TC15 Workshop on Graph-based Representations in Pattern Recognition (Vol. 9069, pp. 208–217). LNCS. Springer International Publishing.
Abstract: Graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their representational power compared with classical appearance-based representations in terms of feature vectors. Retrieving a query graph from a large dataset of graphs has the drawback of the high computational complexity required to compare the query and the target graphs. The most important property for large-scale retrieval is that the search time complexity be sub-linear in the number of database examples. In this paper we propose a fast indexation formalism for graph retrieval. A binary embedding is defined as hashing keys for graph nodes. Given a database of labeled graphs, graph nodes are complemented with vectors of attributes representing their local context. Hence, each attribute counts the length of a walk of order k originated in a vertex with label l. Each attribute vector is converted to a binary code applying a binary-valued hash function. Therefore, graph retrieval is formulated in terms of finding target graphs in the database whose nodes have a small Hamming distance from the query nodes, easily computed with bitwise logical operators. As an application example, we validate the performance of the proposed methods in a handwritten word spotting scenario in images of historical documents.
Keywords: Graph matching; Graph indexing; Application in document analysis; Word spotting; Binary embedding
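Note: the Hamming-distance comparison mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the threshold-based `binarize` hash, the function names and the toy attribute vectors are all assumptions made for illustration, and the linear scan in `nearest_nodes` only demonstrates the bitwise distance, not the sub-linear indexing the paper aims for.

```python
# Illustrative sketch only: names and the thresholding hash are hypothetical,
# not taken from the paper.

def binarize(vec, thresholds):
    """Turn an attribute vector into an integer bit code.

    Assumed hash: bit i is 1 when vec[i] exceeds thresholds[i].
    The first attribute ends up in the most significant bit.
    """
    code = 0
    for value, threshold in zip(vec, thresholds):
        code = (code << 1) | (1 if value > threshold else 0)
    return code


def hamming(a, b):
    """Hamming distance between two bit codes: XOR, then count the set bits."""
    return bin(a ^ b).count("1")


def nearest_nodes(query_code, db_codes, max_dist):
    """Indices of database node codes within max_dist of the query code.

    This is a plain linear scan, used here only to show the comparison step.
    """
    return [i for i, code in enumerate(db_codes)
            if hamming(query_code, code) <= max_dist]


# Toy usage with made-up attribute vectors.
thresholds = [1, 1, 1]
query = binarize([3, 0, 5], thresholds)
database = [binarize(v, thresholds) for v in ([4, 0, 2], [0, 6, 0], [9, 0, 0])]
matches = nearest_nodes(query, database, max_dist=1)
```

Because the codes are plain integers, each comparison costs one XOR and one bit count, which is what makes node matching cheap relative to full graph comparison.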
|
|
|
Olivier Lefebvre, Pau Riba, Charles Fournier, Alicia Fornes, Josep Llados, Rejean Plamondon, et al. (2015). Monitoring neuromotricity on-line: a cloud computing approach. In 17th Conference of the International Graphonomics Society IGS2015.
Abstract: The goal of our experiment is to develop a useful and accessible tool that can be used to evaluate a patient's health by analyzing handwritten strokes. We use a cloud computing approach to analyze stroke data sampled on a commercial tablet working on the Android platform and a distant server to perform complex calculations using the Delta and Sigma lognormal algorithms. A Google Drive account is used to store the data and to ease the development of the project. The communication between the tablet, the cloud and the server is encrypted to ensure biomedical information confidentiality. Highly parameterized biomedical tests are implemented on the tablet as well as a free drawing test to evaluate the validity of the data acquired by the first test compared to the second one. A blurred shape model descriptor pattern recognition algorithm is used to classify the data obtained by the free drawing test. The functions presented in this paper are still currently under development and other improvements are needed before launching the application in the public domain.
|
|
|
Hanne Kause, Patricia Marquez, Andrea Fuster, Aura Hernandez-Sabate, Luc Florack, Debora Gil, et al. (2015). Quality Assessment of Optical Flow in Tagging MRI. In 5th Dutch Bio-Medical Engineering Conference BME2015.
|
|
|
Sergio Vera, Miguel Angel Gonzalez Ballester, & Debora Gil. (2015). A Novel Cochlear Reference Frame Based On The Laplace Equation. In 29th International Congress and Exhibition on Computer Assisted Radiology and Surgery (Vol. 10, pp. 1–312).
|
|