|
Angel Sappa, Fadi Dornaika, David Geronimo, & Antonio Lopez. (2008). Registration-based Moving Object Detection from a Moving Camera. In IROS2008 2nd Workshop on Perception, Planning and Navigation for Intelligent Vehicles (65–69).
Abstract: This paper presents a robust approach for detecting moving objects from on-board stereo vision systems. It relies on a feature point quaternion-based registration, which avoids common problems that appear when computationally expensive iterative-based algorithms are used on dynamic environments. The proposed approach consists of three stages. Initially, feature points are extracted and tracked through consecutive frames. Then, a RANSAC based approach is used for registering
two 3D point sets with known correspondences by means of the quaternion method. Finally, the computed 3D rigid displacement is used to map two consecutive frames into the same coordinate system. Moving objects correspond to those areas with large registration errors. Experimental results, in different scenarios, show the viability of the proposed approach.
|
|
|
Pau Riba, Jon Almazan, Alicia Fornes, David Fernandez, Ernest Valveny, & Josep Llados. (2014). e-Crowds: a mobile platform for browsing and searching in historical demographyrelated manuscripts. In 14th International Conference on Frontiers in Handwriting Recognition (pp. 228–233).
Abstract: This paper presents a prototype system running on portable devices for browsing and word searching through historical handwritten document collections. The platform adapts the paradigm of eBook reading, where the narrative is not necessarily sequential, but centered on the user actions. The novelty is to replace digitally born books by digitized historical manuscripts of marriage licenses, so document analysis tasks are required in the browser. With an active reading paradigm, the user can cast queries of people names, so he/she can implicitly follow genealogical links. In addition, the system allows combined searches: the user can refine a search by adding more words to search. As a second contribution, the retrieval functionality involves as a core technology a word spotting module with an unified approach, which allows combined query searches, and also two input modalities: query-by-example, and query-by-string.
|
|
|
M. Cruz, Cristhian A. Aguilera-Carrasco, Boris X. Vintimilla, Ricardo Toledo, & Angel Sappa. (2015). Cross-spectral image registration and fusion: an evaluation study. In 2nd International Conference on Machine Vision and Machine Learning.
Abstract: This paper presents a preliminary study on the registration and fusion of cross-spectral imaging. The objective is to evaluate the validity of widely used computer vision approaches when they are applied at different
spectral bands. In particular, we are interested in merging images from the infrared (both long wave infrared: LWIR and near infrared: NIR) and visible spectrum (VS). Experimental results with different data sets are presented.
Keywords: multispectral imaging; image registration; data fusion; infrared and visible spectra
|
|
|
Mariano Vazquez, Ruth Aris, Guillaume Hozeaux, R.Aubry, P.Villar, Jaume Garcia, et al. (2011). A massively parallel computational electrophysiology model of the heart. IJNMBE - International Journal for Numerical Methods in Biomedical Engineering, 27, 1911–1929.
Abstract: This paper presents a patient-sensitive simulation strategy capable of using the most efficient way the high-performance computational resources. The proposed strategy directly involves three different players: Computational Mechanics Scientists (CMS), Image Processing Scientists and Cardiologists, each one mastering its own expertise area within the project. This paper describes the general integrative scheme but focusing on the CMS side presents a massively parallel implementation of computational electrophysiology applied to cardiac tissue simulation. The paper covers different angles of the computational problem: equations, numerical issues, the algorithm and parallel implementation. The proposed methodology is illustrated with numerical simulations testing all the different possibilities, ranging from small domains up to very large ones. A key issue is the almost ideal scalability not only for large and complex problems but also for medium-size meshes. The explicit formulation is particularly well suited for solving this highly transient problems, with very short time-scale.
Keywords: computational electrophysiology; parallelization; finite element methods
|
|
|
J. Chazalon, Marçal Rusiñol, Jean-Marc Ogier, & Josep Llados. (2015). A Semi-Automatic Groundtruthing Tool for Mobile-Captured Document Segmentation. In 13th International Conference on Document Analysis and Recognition ICDAR2015 (pp. 621–625).
Abstract: This paper presents a novel way to generate groundtruth data for the evaluation of mobile document capture systems, focusing on the first stage of the image processing pipeline involved: document object detection and segmentation in lowquality preview frames. We introduce and describe a simple, robust and fast technique based on color markers which enables a semi-automated annotation of page corners. We also detail a technique for marker removal. Methods and tools presented in the paper were successfully used to annotate, in few hours, 24889
frames in 150 video files for the smartDOC competition at ICDAR 2015
|
|
|
Patricia Suarez, Angel Sappa, Boris X. Vintimilla, & Riad I. Hammoud. (2021). Cycle Generative Adversarial Network: Towards A Low-Cost Vegetation Index Estimation. In 28th IEEE International Conference on Image Processing (pp. 19–22).
Abstract: This paper presents a novel unsupervised approach to estimate the Normalized Difference Vegetation Index (NDVI). The NDVI is obtained as the ratio between information from the visible and near infrared spectral bands; in the current work, the NDVI is estimated just from an image of the visible spectrum through a Cyclic Generative Adversarial Network (CyclicGAN). This unsupervised architecture learns to estimate the NDVI index by means of an image translation between the red channel of a given RGB image and the NDVI unpaired index’s image. The translation is obtained by means of a ResNET architecture and a multiple loss function. Experimental results obtained with this unsupervised scheme show the validity of the implemented model. Additionally, comparisons with the state of the art approaches are provided showing improvements with the proposed approach.
|
|
|
Cristina Palmero, Jordi Esquirol, Vanessa Bayo, Miquel Angel Cos, Pouya Ahmadmonfared, Joan Salabert, et al. (2017). Automatic Sleep System Recommendation by Multi-modal RBG-Depth-Pressure Anthropometric Analysis. IJCV - International Journal of Computer Vision, 122(2), 212–227.
Abstract: This paper presents a novel system for automatic sleep system recommendation using RGB, depth and pressure information. It consists of a validated clinical knowledge-based model that, along with a set of prescription variables extracted automatically, obtains a personalized bed design recommendation. The automatic process starts by performing multi-part human body RGB-D segmentation combining GrabCut, 3D Shape Context descriptor and Thin Plate Splines, to then extract a set of anthropometric landmark points by applying orthogonal plates to the segmented human body. The extracted variables are introduced to the computerized clinical model to calculate body circumferences, weight, morphotype and Body Mass Index categorization. Furthermore, pressure image analysis is performed to extract pressure values and at-risk points, which are also introduced to the model to eventually obtain the final prescription of mattress, topper, and pillow. We validate the complete system in a set of 200 subjects, showing accurate category classification and high correlation results with respect to manual measures.
Keywords: Sleep system recommendation; RGB-Depth data Pressure imaging; Anthropometric landmark extraction; Multi-part human body segmentation
|
|
|
Jorge Charco, Angel Sappa, Boris X. Vintimilla, & Henry Velesaca. (2020). Transfer Learning from Synthetic Data in the Camera Pose Estimation Problem. In 15th International Conference on Computer Vision Theory and Applications.
Abstract: This paper presents a novel Siamese network architecture, as a variant of Resnet-50, to estimate the relative camera pose on multi-view environments. In order to improve the performance of the proposed model a transfer learning strategy, based on synthetic images obtained from a virtual-world, is considered. The transfer learning consists of first training the network using pairs of images from the virtual-world scenario
considering different conditions (i.e., weather, illumination, objects, buildings, etc.); then, the learned weight
of the network are transferred to the real case, where images from real-world scenarios are considered. Experimental results and comparisons with the state of the art show both, improvements on the relative pose estimation accuracy using the proposed model, as well as further improvements when the transfer learning strategy (synthetic-world data transfer learning real-world data) is considered to tackle the limitation on the
training due to the reduced number of pairs of real-images on most of the public data sets.
|
|
|
Marcelo D. Pistarelli, Angel Sappa, & Ricardo Toledo. (2013). Multispectral Stereo Image Correspondence. In 15th International Conference on Computer Analysis of Images and Patterns (Vol. 8048, pp. 217–224). LNCS. Springer Berlin Heidelberg.
Abstract: This paper presents a novel multispectral stereo image correspondence approach. It is evaluated using a stereo rig constructed with a visible spectrum camera and a long wave infrared spectrum camera. The novelty of the proposed approach lies on the usage of Hough space as a correspondence search domain. In this way it avoids searching for correspondence in the original multispectral image domains, where information is low correlated, and a common domain is used. The proposed approach is intended to be used in outdoor urban scenarios, where images contain large amount of edges. These edges are used as distinctive characteristics for the matching in the Hough space. Experimental results are provided showing the validity of the proposed approach.
|
|
|
Dennis G.Romero, Anselmo Frizera, Angel Sappa, Boris X. Vintimilla, & Teodiano F.Bastos. (2015). A predictive model for human activity recognition by observing actions and context. In Advanced Concepts for Intelligent Vision Systems, Proceedings of 16th International Conference, ACIVS 2015 (Vol. 9386, pp. 323–333). LNCS. Springer International Publishing.
Abstract: This paper presents a novel model to estimate human activities — a human activity is defined by a set of human actions. The proposed approach is based on the usage of Recurrent Neural Networks (RNN) and Bayesian inference through the continuous monitoring of human actions and its surrounding environment. In the current work human activities are inferred considering not only visual analysis but also additional resources; external sources of information, such as context information, are incorporated to contribute to the activity estimation. The novelty of the proposed approach lies in the way the information is encoded, so that it can be later associated according to a predefined semantic structure. Hence, a pattern representing a given activity can be defined by a set of actions, plus contextual information or other kind of information that could be relevant to describe the activity. Experimental results with real data are provided showing the validity of the proposed approach.
|
|
|
Mohammad Rouhani, & Angel Sappa. (2010). Relaxing the 3L Algorithm for an Accurate Implicit Polynomial Fitting. In 23rd IEEE Conference on Computer Vision and Pattern Recognition (pp. 3066–3072).
Abstract: This paper presents a novel method to increase the accuracy of linear fitting of implicit polynomials. The proposed method is based on the 3L algorithm philosophy. The novelty lies on the relaxation of the additional constraints, already imposed by the 3L algorithm. Hence, the accuracy of the final solution is increased due to the proper adjustment of the expected values in the aforementioned additional constraints. Although iterative, the proposed approach solves the fitting problem within a linear framework, which is independent of the threshold tuning. Experimental results, both in 2D and 3D, showing improvements in the accuracy of the fitting are presented. Comparisons with both state of the art algorithms and a geometric based one (non-linear fitting), which is used as a ground truth, are provided.
|
|
|
Adria Molina, Pau Riba, Lluis Gomez, Oriol Ramos Terrades, & Josep Llados. (2021). Date Estimation in the Wild of Scanned Historical Photos: An Image Retrieval Approach. In 16th International Conference on Document Analysis and Recognition (Vol. 12822, pp. 306–320). LNCS.
Abstract: This paper presents a novel method for date estimation of historical photographs from archival sources. The main contribution is to formulate the date estimation as a retrieval task, where given a query, the retrieved images are ranked in terms of the estimated date similarity. The closer are their embedded representations the closer are their dates. Contrary to the traditional models that design a neural network that learns a classifier or a regressor, we propose a learning objective based on the nDCG ranking metric. We have experimentally evaluated the performance of the method in two different tasks: date estimation and date-sensitive image retrieval, using the DEW public database, overcoming the baseline methods.
|
|
|
Meysam Madadi, Sergio Escalera, Jordi Gonzalez, Xavier Roca, & Felipe Lumbreras. (2015). Multi-part body segmentation based on depth maps for soft biometry analysis. PRL - Pattern Recognition Letters, 56, 14–21.
Abstract: This paper presents a novel method extracting biometric measures using depth sensors. Given a multi-part labeled training data, a new subject is aligned to the best model of the dataset, and soft biometrics such as lengths or circumference sizes of limbs and body are computed. The process is performed by training relevant pose clusters, defining a representative model, and fitting a 3D shape context descriptor within an iterative matching procedure. We show robust measures by applying orthogonal plates to body hull. We test our approach in a novel full-body RGB-Depth data set, showing accurate estimation of soft biometrics and better segmentation accuracy in comparison with random forest approach without requiring large training data.
Keywords: 3D shape context; 3D point cloud alignment; Depth maps; Human body segmentation; Soft biometry analysis
|
|
|
Mohammad Rouhani, & Angel Sappa. (2010). A Fast accurate Implicit Polynomial Fitting Approach. In 17th IEEE International Conference on Image Processing (1429–1432).
Abstract: This paper presents a novel hybrid approach that combines state of the art fitting algorithms: algebraic-based and geometric-based. It consists of two steps; first, the 3L algorithm is used as an initialization and then, the obtained result, is improved through a geometric approach. The adopted geometric approach is based on a distance estimation that avoids costly search for the real orthogonal distance. Experimental results are presented as well as quantitative comparisons.
|
|
|
Mohammad Rouhani, & Angel Sappa. (2011). Correspondence Free Registration through a Point-to-Model Distance Minimization. In 13th IEEE International Conference on Computer Vision (pp. 2150–2157).
Abstract: This paper presents a novel formulation, which derives in a smooth minimization problem, to tackle the rigid registration between a given point set and a model set. Unlike most of the existing works, which are based on minimizing a point-wise correspondence term, we propose to describe the model set by means of an implicit representation. It allows a new definition of the registration error, which works beyond the point level representation. Moreover, it could be used in a gradient-based optimization framework. The proposed approach consists of two stages. Firstly, a novel formulation is proposed that relates the registration parameters with the distance between the model and data set. Secondly, the registration parameters are obtained by means of the Levengberg-Marquardt algorithm. Experimental results and comparisons with state of the art show the validity of the proposed framework.
|
|