|
Dani Rowe, Jordi Gonzalez, Marco Pedersoli, & Juan J. Villanueva. (2010). On Tracking Inside Groups. MVA - Machine Vision and Applications, 21(2), 113–127.
Abstract: This work develops a new architecture for multiple-target tracking in unconstrained dynamic scenes, which consists of a detection level which feeds a two-stage tracking system. A remarkable characteristic of the system is its ability to track several targets while they group and split, without using 3D information. Thus, special attention is given to the feature-selection and appearance-computation modules, and to those modules involved in tracking through groups. The system aims to work as a stand-alone application in complex and dynamic scenarios. No a-priori knowledge about either the scene or the targets, based on a previous training period, is used. Hence, the scenario is completely unknown beforehand. Successful tracking has been demonstrated in well-known databases of both indoor and outdoor scenarios. Accurate and robust localisations have been yielded during long-term target merging and occlusions.
|
|
|
Miquel Ferrer, Dimosthenis Karatzas, Ernest Valveny, & Horst Bunke. (2009). A Recursive Embedding Approach to Median Graph Computation. In 7th IAPR – TC–15 Workshop on Graph–Based Representations in Pattern Recognition (Vol. 5534, 113–123). LNCS. Springer Berlin Heidelberg.
Abstract: The median graph has been shown to be a good choice to infer a representative of a set of graphs. It has been successfully applied to graph-based classification and clustering. Nevertheless, its computation is extremely complex. Several approaches have been presented up to now based on different strategies. In this paper we present a new approximate recursive algorithm for median graph computation based on graph embedding into vector spaces. Preliminary experiments on three databases show that this new approach is able to obtain better medians than the previous existing approaches.
|
|
|
Jose Manuel Alvarez, & Antonio Lopez. (2012). Photometric Invariance by Machine Learning. In Jan-Mark Geusebroek Joost van de Weijer A. G. Theo Gevers (Ed.), Color in Computer Vision: Fundamentals and Applications (Vol. 7, pp. 113–134). iConcept Press Ltd.
|
|
|
Miguel Oliveira, Victor Santos, Angel Sappa, P. Dias, & A. Moreira. (2016). Incremental texture mapping for autonomous driving. RAS - Robotics and Autonomous Systems, 84, 113–128.
Abstract: Autonomous vehicles have a large number of on-board sensors, not only for providing coverage all around the vehicle, but also to ensure multi-modality in the observation of the scene. Because of this, it is not trivial to come up with a single, unique representation that feeds from the data given by all these sensors. We propose an algorithm which is capable of mapping texture collected from vision based sensors onto a geometric description of the scenario constructed from data provided by 3D sensors. The algorithm uses a constrained Delaunay triangulation to produce a mesh which is updated using a specially devised sequence of operations. These enforce a partial configuration of the mesh that avoids bad quality textures and ensures that there are no gaps in the texture. Results show that this algorithm is capable of producing fine quality textures.
Keywords: Scene reconstruction; Autonomous driving; Texture mapping
|
|
|
Joel Barajas, Jaume Garcia, Karla Lizbeth Caballero, Francesc Carreras, Sandra Pujades, & Petia Radeva. (2006). Correction of Misalignment Artifacts Among 2-D Cardiac MR Images in 3-D Space. In 1st International Wokshop on Computer Vision for Intravascular and Intracardiac Imaging (CVII’06) (Vol. 3217, pp. 114–121). Copenhagen (Denmark).
Abstract: Cardiac Magnetic Resonance images offer the opportunity to study the heart in detail. One of the main issues in its modelling is to create an accurate 3-D reconstruction of the left ventricle from 2-D views. A first step to achieve this goal is the correct registration among the different image planes due to patient movements. In this article, we present an accurate method to correct displacement artifacts using the Normalized Mutual Information. Here, the image views are treated as planes in order to diminish the approximation error caused by the association of a certain thickness, and moved simultaneously to avoid any kind of bias in the alignment process. This method has been validated using real and syntectic plane displacements, yielding promising results.
|
|
|
Angel Sappa, & Boris X. Vintimilla. (2008). Edge Point Linking by Means of Global and Local Schemes. In E. Damiani (Ed.), in Signal Processing for Image Enhancement and Multimedia Processing (Vol. 11, 115–125). Springer.
|
|
|
Carme Julia, Felipe Lumbreras, & Angel Sappa. (2011). A Factorization-based Approach to Photometric Stereo. IJIST - International Journal of Imaging Systems and Technology, 21(1), 115–119.
Abstract: This article presents an adaptation of a factorization technique to tackle the photometric stereo problem. That is to recover the surface normals and reflectance of an object from a set of images obtained under different lighting conditions. The main contribution of the proposed approach is to consider pixels in shadow and saturated regions as missing data, in order to reduce their influence to the result. Concretely, an adapted Alternation technique is used to deal with missing data. Experimental results considering both synthetic and real images show the viability of the proposed factorization-based strategy. © 2011 Wiley Periodicals, Inc. Int J Imaging Syst Technol, 21, 115–119, 2011.
|
|
|
Pau Riba, Josep Llados, & Alicia Fornes. (2020). Hierarchical graphs for coarse-to-fine error tolerant matching. PRL - Pattern Recognition Letters, 134, 116–124.
Abstract: During the last years, graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their ability to capture both structural and appearance-based information. Thus, they provide a greater representational power than classical statistical frameworks. However, graph-based representations leads to high computational complexities usually dealt by graph embeddings or approximated matching techniques. Despite their representational power, they are very sensitive to noise and small variations of the input image. With the aim to cope with the time complexity and the variability present in the generated graphs, in this paper we propose to construct a novel hierarchical graph representation. Graph clustering techniques adapted from social media analysis have been used in order to contract a graph at different abstraction levels while keeping information about the topology. Abstract nodes attributes summarise information about the contracted graph partition. For the proposed representations, a coarse-to-fine matching technique is defined. Hence, small graphs are used as a filtering before more accurate matching methods are applied. This approach has been validated in real scenarios such as classification of colour images or retrieval of handwritten words (i.e. word spotting).
Keywords: Hierarchical graph representation; Coarse-to-fine graph matching; Graph-based retrieval
|
|
|
Sergio Escalera, Oriol Pujol, & Petia Radeva. (2008). Loss-Weighted Decoding for Error-Correcting Output Coding. In 3rd International Conference on Computer Vision Theory and Applications, (Vol. 2, 117–122).
|
|
|
Michal Drozdzal, Laura Igual, Petia Radeva, Jordi Vitria, C. Malagelada, & Fernando Azpiroz. (2010). Aligning Endoluminal Scene Sequences in Wireless Capsule Endoscopy. In IEEE Computer Society Workshop on Mathematical Methods in Biomedical Image Analysis (117–124).
Abstract: Intestinal motility analysis is an important examination in detection of various intestinal malfunctions. One of the big challenges of automatic motility analysis is how to compare sequence of images and extract dynamic paterns taking into account the high deformability of the intestine wall as well as the capsule motion. From clinical point of view the ability to align endoluminal scene sequences will help to find regions of similar intestinal activity and in this way will provide a valuable information on intestinal motility problems. This work, for first time, addresses the problem of aligning endoluminal sequences taking into account motion and structure of the intestine. To describe motility in the sequence, we propose different descriptors based on the Sift Flow algorithm, namely: (1) Histograms of Sift Flow Directions to describe the flow course, (2) Sift Descriptors to represent image intestine structure and (3) Sift Flow Magnitude to quantify intestine deformation. We show that the merge of all three descriptors provides robust information on sequence description in terms of motility. Moreover, we develop a novel methodology to rank the intestinal sequences based on the expert feedback about relevance of the results. The experimental results show that the selected descriptors are useful in the alignment and similarity description and the proposed method allows the analysis of the WCE.
|
|
|
Nuria Cirera, Alicia Fornes, Volkmar Frinken, & Josep Llados. (2013). Hybrid grammar language model for handwritten historical documents recognition. In 6th Iberian Conference on Pattern Recognition and Image Analysis (Vol. 7887, pp. 117–124). LNCS. Springer Berlin Heidelberg.
Abstract: In this paper we present a hybrid language model for the recognition of handwritten historical documents with a structured syntactical layout. Using a hidden Markov model-based recognition framework, a word-based grammar with a closed dictionary is enhanced by a character sequence recognition method. This allows to recognize out-of-dictionary words in controlled parts of the recognition, while keeping a closed vocabulary restriction for other parts. While the current status is work in progress, we can report an improvement in terms of character error rate.
|
|
|
Mark Philip Philipsen, Jacob Velling Dueholm, Anders Jorgensen, Sergio Escalera, & Thomas B. Moeslund. (2018). Organ Segmentation in Poultry Viscera Using RGB-D. SENS - Sensors, 18(1), 117.
Abstract: We present a pattern recognition framework for semantic segmentation of visual structures, that is, multi-class labelling at pixel level, and apply it to the task of segmenting organs in the eviscerated viscera from slaughtered poultry in RGB-D images. This is a step towards replacing the current strenuous manual inspection at poultry processing plants. Features are extracted from feature maps such as activation maps from a convolutional neural network (CNN). A random forest classifier assigns class probabilities, which are further refined by utilizing context in a conditional random field. The presented method is compatible with both 2D and 3D features, which allows us to explore the value of adding 3D and CNN-derived features. The dataset consists of 604 RGB-D images showing 151 unique sets of eviscerated viscera from four different perspectives. A mean Jaccard index of 78.11% is achieved across the four classes of organs by using features derived from 2D, 3D and a CNN, compared to 74.28% using only basic 2D image features.
Keywords: semantic segmentation; RGB-D; random forest; conditional random field; 2D; 3D; CNN
|
|
|
Joan Mas, Gemma Sanchez, & Josep Llados. (2010). SSP: Sketching slide Presentations, a Syntactic Approach. In Graphics Recognition. Achievements, Challenges, and Evolution. 8th International Workshop, GREC 2009. Selected Papers (Vol. 6020, pp. 118–129). LNCS. Springer Berlin Heidelberg.
Abstract: The design of a slide presentation is a creative process. In this process first, humans visualize in their minds what they want to explain. Then, they have to be able to represent this knowledge in an understandable way. There exists a lot of commercial software that allows to create our own slide presentations but the creativity of the user is rather limited. In this article we present an application that allows the user to create and visualize a slide presentation from a sketch. A slide may be seen as a graphical document or a diagram where its elements are placed in a particular spatial arrangement. To describe and recognize slides a syntactic approach is proposed. This approach is based on an Adjacency Grammar and a parsing methodology to cope with this kind of grammars. The experimental evaluation shows the performance of our methodology from a qualitative and a quantitative point of view. Six different slides containing different number of symbols, from 4 to 7, have been given to the users and they have drawn them without restrictions in the order of the elements. The quantitative results give an idea on how suitable is our methodology to describe and recognize the different elements in a slide.
|
|
|
Pichao Wang, Wanqing Li, Philip Ogunbona, Jun Wan, & Sergio Escalera. (2018). RGB-D-based Human Motion Recognition with Deep Learning: A Survey. CVIU - Computer Vision and Image Understanding, 171, 118–139.
Abstract: Human motion recognition is one of the most important branches of human-centered research activities. In recent years, motion recognition based on RGB-D data has attracted much attention. Along with the development in artificial intelligence, deep learning techniques have gained remarkable success in computer vision. In particular, convolutional neural networks (CNN) have achieved great success for image-based tasks, and recurrent neural networks (RNN) are renowned for sequence-based problems. Specifically, deep learning methods based on the CNN and RNN architectures have been adopted for motion recognition using RGB-D data. In this paper, a detailed overview of recent advances in RGB-D-based motion recognition is presented. The reviewed methods are broadly categorized into four groups, depending on the modality adopted for recognition: RGB-based, depth-based, skeleton-based and RGB+D-based. As a survey focused on the application of deep learning to RGB-D-based motion recognition, we explicitly discuss the advantages and limitations of existing techniques. Particularly, we highlighted the methods of encoding spatial-temporal-structural information inherent in video sequence, and discuss potential directions for future research.
Keywords: Human motion recognition; RGB-D data; Deep learning; Survey
|
|
|
Thanh Nam Le, Muhammad Muzzamil Luqman, Anjan Dutta, Pierre Heroux, Christophe Rigaud, Clement Guerin, et al. (2018). Subgraph spotting in graph representations of comic book images. PRL - Pattern Recognition Letters, 112, 118–124.
Abstract: Graph-based representations are the most powerful data structures for extracting, representing and preserving the structural information of underlying data. Subgraph spotting is an interesting research problem, especially for studying and investigating the structural information based content-based image retrieval (CBIR) and query by example (QBE) in image databases. In this paper we address the problem of lack of freely available ground-truthed datasets for subgraph spotting and present a new dataset for subgraph spotting in graph representations of comic book images (SSGCI) with its ground-truth and evaluation protocol. Experimental results of two state-of-the-art methods of subgraph spotting are presented on the new SSGCI dataset.
Keywords: Attributed graph; Region adjacency graph; Graph matching; Graph isomorphism; Subgraph isomorphism; Subgraph spotting; Graph indexing; Graph retrieval; Query by example; Dataset and comic book images
|
|