|
Miquel Ferrer, Ernest Valveny, F. Serratosa, K. Riesen, & Horst Bunke. (2010). Generalized Median Graph Computation by Means of Graph Embedding in Vector Spaces. PR - Pattern Recognition, 43(4), 1642–1655.
Abstract: The median graph has been presented as a useful tool to represent a set of graphs. Nevertheless, its computation is very complex and the existing algorithms are restricted to limited amounts of data. In this paper we propose a new approach for the computation of the median graph based on graph embedding. Graphs are embedded into a vector space and the median is computed in the vector domain. We have designed a procedure based on the weighted mean of a pair of graphs to go from the vector domain back to the graph domain in order to obtain a final approximation of the median graph. Experiments on three different databases containing large graphs show that we succeed in computing good approximations of the median graph. We have also applied the median graph to perform some basic classification tasks, achieving reasonably good results. These experiments on real data open the door to the application of the median graph to a number of more complex machine learning algorithms where a representative of a set of graphs is needed.
Keywords: Graph matching; Weighted mean of graphs; Median graph; Graph embedding; Vector spaces
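Note: The step "the median is computed in the vector domain" can be illustrated with a small sketch. It assumes the graphs have already been embedded as plain coordinate tuples and uses a Weiszfeld-style iteration for the Euclidean (geometric) median; the abstract does not state which median algorithm the authors use, and the function name and sample data below are hypothetical.

```python
import math


def geometric_median(points, iterations=200, eps=1e-9):
    """Approximate the point minimising the sum of Euclidean distances to `points`.

    Weiszfeld-style iteration; an illustrative assumption, not the paper's code.
    """
    dim = len(points[0])
    # Start from the centroid (arithmetic mean) of the embedded graphs.
    current = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    for _ in range(iterations):
        weights = []
        for p in points:
            d = math.dist(current, p)
            if d < eps:
                # Simplification: stop if the estimate lands on a data point.
                return list(p)
            weights.append(1.0 / d)
        total = sum(weights)
        # Re-weight every point by the inverse of its distance to the estimate.
        updated = [
            sum(w * p[i] for w, p in zip(weights, points)) / total
            for i in range(dim)
        ]
        if math.dist(updated, current) < eps:
            return updated
        current = updated
    return current


# Hypothetical 2-D embeddings of four graphs.
embedded = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
median_vector = geometric_median(embedded)
```

The resulting vector generally does not correspond to any graph, which is why the abstract describes a separate procedure, based on the weighted mean of a pair of graphs, to return to the graph domain.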
|
|
|
Mathieu Nicolas Delalandre, Ernest Valveny, Tony Pridmore, & Dimosthenis Karatzas. (2010). Generation of Synthetic Documents for Performance Evaluation of Symbol Recognition & Spotting Systems. IJDAR - International Journal on Document Analysis and Recognition, 13(3), 187–207.
Abstract: This paper deals with the topic of performance evaluation of symbol recognition & spotting systems. We propose here a new approach to the generation of synthetic graphics documents containing non-isolated symbols in a real context. This approach is based on the definition of a set of constraints that permit us to place the symbols on a pre-defined background according to the properties of a particular domain (architecture, electronics, engineering, etc.). In this way, we can obtain a large number of images resembling real documents by simply defining the set of constraints and providing a few pre-defined backgrounds. As documents are synthetically generated, the groundtruth (the location and the label of every symbol) becomes automatically available. We have applied this approach to the generation of a large database of architectural drawings and electronic diagrams, which shows the flexibility of the system. Performance evaluation experiments on a symbol localization system show that our approach makes it possible to generate documents with different features, which are reflected in variations in the localization results.
|
|
|
Jose Manuel Alvarez, Felipe Lumbreras, Theo Gevers, & Antonio Lopez. (2010). Geographic Information for vision-based Road Detection. In IEEE Intelligent Vehicles Symposium (621–626).
Abstract: Road detection is a vital task for the development of autonomous vehicles. The knowledge of the free road surface ahead of the target vehicle can be used for autonomous driving, road departure warning, as well as to support advanced driver assistance systems like vehicle or pedestrian detection. Using vision to detect the road has several advantages over other sensors: richness of features, easy integration, low cost and low power consumption. Common vision-based road detection approaches use low-level features (such as color or texture) as visual cues to group pixels exhibiting similar properties. However, it is difficult to devise a perfect clustering algorithm, since roads lie in outdoor scenarios and are imaged from a mobile platform. In this paper, we propose a novel high-level approach to vision-based road detection based on geographical information. The key idea of the algorithm is to exploit geographical information to provide a rough detection of the road. Then, this segmentation is refined at low level using color information to provide the final result. The results presented show the validity of our approach.
Keywords: road detection
|
|
|
Jaume Gibert, & Ernest Valveny. (2010). Graph Embedding based on Nodes Attributes Representatives and a Graph of Words Representation. In I. Ulusoy and F. Escolano T. Windeatt R. C. W. In E.R. Hancock (Ed.), 13th International workshop on structural and syntactic pattern recognition and 8th international workshop on statistical pattern recognition (Vol. 6218, 223–232). LNCS. Springer Berlin Heidelberg.
Abstract: Although graph embedding has recently been used to extend statistical pattern recognition techniques to the graph domain, some existing embeddings are computationally expensive as they rely on classical graph-based operations. In this paper we present a new way to embed graphs into vector spaces by first encapsulating the information stored in the original graph under another graph representation, obtained by clustering the attributes of the graphs to be processed. This new representation makes the association of graphs to vectors an easy step by just arranging both node attributes and the adjacency matrix in the form of vectors. To test our method, we use two different databases of graphs whose node attributes are of different nature. A comparison with a reference method shows that this new embedding is better in terms of classification rates, while being much faster.
|
|
|
Jaume Gibert, Ernest Valveny, & Horst Bunke. (2010). Graph of Words Embedding for Molecular Structure-Activity Relationship Analysis. In 15th Iberoamerican Congress on Pattern Recognition (Vol. 6419, 30–37). LNCS.
Abstract: Structure-Activity relationship analysis aims at discovering chemical activity of molecular compounds based on their structure. In this article we make use of a particular graph representation of molecules and propose a new graph embedding procedure to solve the problem of structure-activity relationship analysis. The embedding is essentially an arrangement of a molecule in the form of a vector by considering frequencies of appearing atoms and frequencies of covalent bonds between them. Results on two benchmark databases show the effectiveness of the proposed technique in terms of recognition accuracy while avoiding high operational costs in the transformation.
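Note: The embedding described above (atom frequencies plus frequencies of bonds between atom types) can be sketched as follows. This is a minimal illustration under assumptions: the function name, the input format (a list of element symbols and a list of index pairs) and the fixed alphabet are hypothetical, and bond orders are ignored.

```python
from collections import Counter
from itertools import combinations_with_replacement


def graph_of_words_vector(atoms, bonds, alphabet):
    """Build a frequency vector from a molecular graph.

    atoms: element symbol of each node, e.g. ["C", "C", "O"]
    bonds: (i, j) index pairs into `atoms`
    alphabet: element symbols that define the vector layout (an assumption)
    """
    symbols = sorted(alphabet)
    # How often each atom type appears in the molecule.
    node_counts = Counter(atoms)
    # How often each unordered pair of atom types is joined by a bond.
    edge_counts = Counter(
        tuple(sorted((atoms[i], atoms[j]))) for i, j in bonds
    )
    pairs = list(combinations_with_replacement(symbols, 2))
    return [node_counts[s] for s in symbols] + [edge_counts[p] for p in pairs]


# Heavy-atom skeleton C-C-O (hydrogens omitted for brevity).
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]
vector = graph_of_words_vector(atoms, bonds, alphabet=["C", "N", "O"])
```

With the alphabet C, N, O the vector has three atom-count entries followed by six bond-count entries, one per unordered pair of atom types, so every molecule maps to a vector of the same length regardless of its size.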
|
|
|
Jean-Marc Ogier, Wenyin Liu, & Josep Llados (Eds.). (2010). Graphics Recognition: Achievements, Challenges, and Evolution (Vol. 6020). LNCS. Springer Link.
|
|
|
David Fernandez. (2010). Handwritten Word Spotting in Old Manuscript Images using Shape Descriptors (Vol. 161). Master's thesis.
|
|
|
Josep M. Gonfaus, Xavier Boix, Joost Van de Weijer, Andrew Bagdanov, Joan Serrat, & Jordi Gonzalez. (2010). Harmony Potentials for Joint Classification and Segmentation. In 23rd IEEE Conference on Computer Vision and Pattern Recognition (3280–3287).
Abstract: Hierarchical conditional random fields have been successfully applied to object segmentation. One reason is their ability to incorporate contextual information at different scales. However, these models do not allow multiple labels to be assigned to a single node. At higher scales in the image, this yields an oversimplified model, since multiple classes can be reasonably expected to appear within one region. This simplified model especially limits the impact that observations at larger scales may have on the CRF model. Neglecting the information at larger scales is undesirable since class-label estimates based on these scales are more reliable than at smaller, noisier scales. To address this problem, we propose a new potential, called harmony potential, which can encode any possible combination of class labels. We propose an effective sampling strategy that renders tractable the underlying optimization problem. Results show that our approach obtains state-of-the-art results on two challenging datasets: Pascal VOC 2009 and MSRC-21.
|
|
|
Ekain Artola. (2010). Human Attention Map Prediction Combining Visual Features (Vol. 160). Bachelor's thesis.
|
|
|
Francisco Javier Orozco. (2010). Human Emotion Evaluation on Facial Image Sequences (Jordi Gonzalez, & Xavier Roca, Eds.). Ph.D. thesis, Ediciones Graficas Rey.
Abstract: Psychological evidence has emphasized the importance of understanding affective behaviour, due to its high impact on present-day interaction between humans and computers. All types of affective and behavioural patterns, such as gestures, emotions and mental states, are largely displayed through the face, head and body. Therefore, this thesis focuses on analysing affective behaviours of the head and face. To this end, head and facial movements are encoded by using appearance-based tracking methods. Specifically, a wise combination of deformable models captures rigid and non-rigid movements of different kinematics; 3D head pose, eyebrows, mouth, eyelids and irises are taken into account as the basis for extracting features from databases of video sequences. This approach combines the strengths of adaptive appearance models, optimization methods and backtracking techniques.
For about thirty years, computer science has directed the investigation of human emotions towards the automatic recognition of six prototypic emotions suggested by Darwin and systematized by Paul Ekman in the seventies. The Facial Action Coding System (FACS) uses discrete movements of the face (called Action Units or AUs) to code the six facial emotions: anger, disgust, fear, happy-joy, sadness and surprise. However, human emotions are much more complex patterns that have not received the same attention from computer scientists.
Simon Baron-Cohen proposed a new taxonomy of emotions and mental states without a coding system for the facial actions. These 426 affective behaviours are more challenging for the understanding of human emotions. Beyond classically classifying the six basic facial expressions, more subtle gestures, facial actions and spontaneous emotions are considered here. By assessing confidence in the recognition results and exploring spatial and temporal relationships of the features, some methods are combined and enhanced to develop a new taxonomy of expressions and emotions.
The objective of this dissertation is to develop a computer vision system, including facial feature extraction, expression recognition and emotion understanding, by building a bottom-up reasoning process. Building a detailed taxonomy of human affective behaviours is an interesting challenge for head-face-based image analysis methods. In this paper, we exploit the strengths of Canonical Correlation Analysis (CCA) to enhance an on-line head-face tracker. A relationship between head pose and local facial movements is studied according to their cognitive interpretation in affective expressions and emotions. Active Shape Models are synthesized for AAMs based on CCA-regression. Head pose and facial actions are fused into a maximally correlated space in order to assess expressiveness, confidence and classification in a CBR system. The CBR solutions are also correlated to the cognitive features, which allows avoiding exhaustive search when recognizing new head-face features. Subsequently, Support Vector Machines (SVMs) and Bayesian Networks are applied for learning the spatial relationships of facial expressions. Similarly, the temporal evolution of facial expressions, emotions and mental states is analysed based on Factorized Dynamic Bayesian Networks (FaDBN).
As a result, the bottom-up system recognizes six facial expressions, six basic emotions and six mental states, and enhances this categorization with confidence assessment at each level, intensity of expressions and a complete taxonomy
|
|
|
O. Fors, J. Nuñez, Xavier Otazu, A. Prades, & Robert D. Cardinal. (2010). Improving the Ability of Image Sensors to Detect Faint Stars and Moving Objects Using Image Deconvolution Techniques. SENS - Sensors, 10(3), 1743–1752.
Abstract: In this paper we show how the techniques of image deconvolution can increase the ability of image sensors, such as CCD imagers, to detect faint stars or faint orbital objects (small satellites and space debris). In the case of faint stars, we show that this benefit is equivalent to doubling the quantum efficiency of the image sensor used, or to increasing the effective telescope aperture by more than 30%, without decreasing the astrometric precision or introducing artificial bias. In the case of orbital objects, the deconvolution technique can double the signal-to-noise ratio of the image, which helps to discover and control dangerous objects such as space debris or lost satellites. The benefits obtained using CCD detectors can be extrapolated to any kind of image sensor.
Keywords: image processing; image deconvolution; faint stars; space debris; wavelet transform
|
|
|
Oriol Ramos Terrades, Alejandro Hector Toselli, Nicolas Serrano, Veronica Romero, Enrique Vidal, & Alfons Juan. (2010). Interactive layout analysis and transcription systems for historic handwritten documents. In 10th ACM Symposium on Document Engineering (219–222).
Abstract: The amount of digitized legacy documents has been rising dramatically over recent years, due mainly to the increasing number of on-line digital libraries publishing this kind of document, waiting to be classified and finally transcribed into a textual electronic format (such as ASCII or PDF). Nevertheless, most of the available fully-automatic applications addressing this task are far from perfect, and heavy and inefficient human intervention is often required to check and correct the results of such systems. In contrast, multimodal interactive-predictive approaches may allow the users to participate in the process, helping the system to improve the overall performance. With this in mind, two sets of recent advances are introduced in this work: a novel interactive method for text block detection and two multimodal interactive handwritten text transcription systems which use active learning and interactive-predictive technologies in the recognition process.
Keywords: Handwriting recognition; Interactive predictive processing; Partial supervision; Interactive layout analysis
|
|
|
Oriol Ramos Terrades, N. Serrano, Albert Gordo, Ernest Valveny, & Alfons Juan-Ciscar. (2010). Interactive-predictive detection of handwritten text blocks. In 17th Document Recognition and Retrieval Conference, part of the IS&T-SPIE Electronic Imaging Symposium (Vol. 7534, 75340Q–75340Q–10).
Abstract: A method for text block detection is introduced for old handwritten documents. The proposed method takes advantage of sequential book structure, taking into account layout information from pages previously transcribed. This glance at the past is used to predict the position of text blocks in the current page with the help of conventional layout analysis methods. The method is integrated into the GIDOC prototype: a first attempt to provide integrated support for interactive-predictive page layout analysis, text line detection and handwritten text transcription. Results are given in a transcription task on a 764-page Spanish manuscript from 1891.
|
|
|
Carolina Malagelada, F. De Iorio, Fernando Azpiroz, Santiago Segui, Petia Radeva, Anna Accarino, et al. (2010). Intestinal Dysmotility in Patients with Functional Intestinal Disorders Demonstrated by Computer Vision Analysis of Capsule Endoscopy Images. In 18th United European Gastroenterology Week (Vol. 56, A19–20).
|
|
|
Fernando Vilariño, Panagiota Spyridonos, Fosca De Iorio, Jordi Vitria, Fernando Azpiroz, & Petia Radeva. (2010). Intestinal Motility Assessment With Video Capsule Endoscopy: Automatic Annotation of Phasic Intestinal Contractions. TMI - IEEE Transactions on Medical Imaging, 29(2), 246–259.
Abstract: Intestinal motility assessment with video capsule endoscopy arises as a novel and challenging clinical field. This technique is based on the analysis of the patterns of intestinal contractions shown in a video provided by an ingestible capsule with a wireless micro-camera. The manual labeling of all the motility events requires a large amount of time for offline screening in search of findings with low prevalence, which currently makes this procedure impractical. In this paper, we propose a machine learning system to automatically detect the phasic intestinal contractions in video capsule endoscopy, turning a useful but not feasible clinical routine into a feasible clinical procedure. Our proposal is based on a sequential design which involves the analysis of textural, color, and blob features together with SVM classifiers. Our approach tackles the reduction of the imbalance rate of data and allows the inclusion of domain knowledge as new stages in the cascade. We present a detailed analysis, both quantitative and qualitative, by providing several measures of performance and an assessment study of interobserver variability. Our system achieves 70% sensitivity for individual detection, whilst obtaining patterns equivalent to those of the experts for density of contractions.
|
|