|
Joan Mas, Gemma Sanchez, & Josep Llados. (2009). SSP: Sketching slide Presentations, a Syntactic Approach. In 8th IAPR International Workshop on Graphics Recognition.
Abstract: The design of a slide presentation is a creative process. In this process first, humans visualize in their minds what they want to explain. Then, they have to be able to represent this knowledge in an understandable way. There exists a lot of commercial software that allows to create our own slide presentations but the creativity of the user is rather limited. In this article we present an application that allows the user to create and visualize a slide presentation from a sketch. A slide may be seen as a graphical document or a diagram where its elements are placed in a particular spatial arrangement. To describe and recognize slides a syntactic approach is proposed. This approach is based on an Adjacency Grammar and a parsing methodology to cope with this kind of grammars. The experimental evaluation shows the performance of our methodology from a qualitative and a quantitative point of view. Six different slides containing different number of symbols, from 4 to 7, have been given to the users and they have drawn them without restrictions in the order of the elements. The quantitative results give an idea on how suitable is our methodology to describe and recognize the different elements in a slide.
|
|
|
Salim Jouili, Salvatore Tabbone, & Ernest Valveny. (2009). Comparing Graph Similarity Measures for Graphical Recognition. In 8th IAPR International Workshop on Graphics Recognition. LNCS. Springer.
Abstract: In this paper we evaluate four graph distance measures. The analysis is performed for document retrieval tasks. For this aim, different kind of documents are used including line drawings (symbols), ancient documents (ornamental letters), shapes and trademark-logos. The experimental results show that the performance of each graph distance measure depends on the kind of data and the graph representation technique.
|
|
|
Mathieu Nicolas Delalandre, Jean-Yves Ramel, Ernest Valveny, & Muhammad Muzzamil Luqman. (2009). A Performance Characterization Algorithm for Symbol Localization. In 8th IAPR International Workshop on Graphics Recognition (pp. 3–11). Springer.
Abstract: In this paper we present an algorithm for performance characterization of symbol localization systems. This algorithm is aimed to be a more “reliable” and “open” solution to characterize the performance. To achieve that, it exploits only single points as the result of localization and offers the possibility to reconsider the localization results provided by a system. We use the information about context in groundtruth, and overall localization results, to detect the ambiguous localization results. A probability score is computed for each matching between a localization point and a groundtruth region, depending on the spatial distribution of the other regions in the groundtruth. Final characterization is given with detection rate/probability score plots, describing the sets of possible interpretations of the localization results, according to a given confidence rate. We present experimentation details along with the results for the symbol localization system of [1], exploiting a synthetic dataset of architectural floorplans and electrical diagrams (composed of 200 images and 3861 symbols).
|
|
|
Marçal Rusiñol, K. Bertet, Jean-Marc Ogier, & Josep Llados. (2009). Symbol Recognition Using a Concept Lattice of Graphical Patterns. In 8th IAPR International Workshop on Graphics Recognition.
Abstract: In this paper we propose a new approach to recognize symbols by the use of a concept lattice. We propose to build a concept lattice in terms of graphical patterns. Each model symbol is decomposed in a set of composing graphical patterns taken as primitives. Each one of these primitives is described by boundary moment invariants. The obtained concept lattice relates which symbolic patterns compose a given graphical symbol. A Hasse diagram is derived from the context and is used to recognize symbols affected by noise. We present some preliminary results over a variation of the dataset of symbols from the GREC 2005 symbol recognition contest.
|
|
|
Partha Pratim Roy, Umapada Pal, & Josep Llados. (2009). Touching Text Character Localization in Graphical Documents using SIFT. In In proceedings 8th IAPR International Workshop on Graphics Recognition.
Abstract: Interpretation of graphical document images is a challenging task as it requires proper understanding of text/graphics symbols present in such documents. Difficulties arise in graphical document recognition when text and symbol overlapped/touched. Intersection of text and symbols with graphical lines and curves occur frequently in graphical documents and hence separation of such symbols is very difficult.
Several pattern recognition and classification techniques exist to recognize isolated text/symbol. But, the touching/overlapping text and symbol recognition has not yet been dealt successfully. An interesting technique, Scale Invariant Feature Transform (SIFT), originally devised for object recognition can take care of overlapping problems. Even if SIFT features have emerged as a very powerful object descriptors, their employment in graphical documents context has not been investigated much. In this paper we present the adaptation of the SIFT approach in the context of text character localization (spotting) in graphical documents. We evaluate the applicability of this technique in such documents and discuss the scope of improvement by combining some state-of-the-art approaches.
|
|
|
Carlo Gatta, Simone Balocco, Francesco Ciompi, R. Hemetsberger, Oriol Rodriguez-Leor, & Petia Radeva. (2010). Real-time gating of IVUS sequences based on motion blur analysis: Method and quantitative validation. In 13th international conference on Medical image computing and computer-assisted intervention (Vol. II, pp. 59–67). Springer-Verlag Berlin.
Abstract: Intravascular Ultrasound (IVUS) is an image-guiding technique for cardiovascular diagnostic, providing cross-sectional images of vessels. During the acquisition, the catheter is pulled back (pullback) at a constant speed in order to acquire spatially subsequent images of the artery. However, during this procedure, the heart twist produces a swinging fluctuation of the probe position along the vessel axis. In this paper we propose a real-time gating algorithm based on the analysis of motion blur variations during the IVUS sequence. Quantitative tests performed on an in-vitro ground truth data base shown that our method is superior to state of the art algorithms both in computational speed and accuracy.
|
|
|
Eloi Puertas, Sergio Escalera, & Oriol Pujol. (2010). Classifying Objects at Different Sizes with Multi-Scale Stacked Sequential Learning. In J. Aguilar A. M. R. Alquezar (Ed.), 13th International Conference of the Catalan Association for Artificial Intelligence (Vol. 220, 193–200).
Abstract: Sequential learning is that discipline of machine learning that deals with dependent data. In this paper, we use the Multi-scale Stacked Sequential Learning approach (MSSL) to solve the task of pixel-wise classification based on contextual information. The main contribution of this work is a shifting technique applied during the testing phase that makes possible, thanks to template images, to classify objects at different sizes. The results show that the proposed method robustly classifies such objects capturing their spatial relationships.
|
|
|
Xavier Otazu, C. Alejandro Parraga, & Maria Vanrell. (2010). Towards a unified chromatic inducction model. VSS - Journal of Vision, 10(12:5), 1–24.
Abstract: In a previous work (X. Otazu, M. Vanrell, & C. A. Párraga, 2008b), we showed how several brightness induction effects can be predicted using a simple multiresolution wavelet model (BIWaM). Here we present a new model for chromatic induction processes (termed Chromatic Induction Wavelet Model or CIWaM), which is also implemented on a multiresolution framework and based on similar assumptions related to the spatial frequency and the contrast surround energy of the stimulus. The CIWaM can be interpreted as a very simple extension of the BIWaM to the chromatic channels, which in our case are defined in the MacLeod-Boynton (lsY) color space. This new model allows us to unify both chromatic assimilation and chromatic contrast effects in a single mathematical formulation. The predictions of the CIWaM were tested by means of several color and brightness induction experiments, which showed an acceptable agreement between model predictions and psychophysical data.
Keywords: Visual system; Color induction; Wavelet transform
|
|
|
Jose Manuel Alvarez, Theo Gevers, & Antonio Lopez. (2010). Learning photometric invariance for object detection. IJCV - International Journal of Computer Vision, 90(1), 45–61.
Abstract: Impact factor: 3.508 (the last available from JCR2009SCI). Position 4/103 in the category Computer Science, Artificial Intelligence. Quartile
Color is a powerful visual cue in many computer vision applications such as image segmentation and object recognition. However, most of the existing color models depend on the imaging conditions that negatively affect the performance of the task at hand. Often, a reflection model (e.g., Lambertian or dichromatic reflectance) is used to derive color invariant models. However, this approach may be too restricted to model real-world scenes in which different reflectance mechanisms can hold simultaneously.
Therefore, in this paper, we aim to derive color invariance by learning from color models to obtain diversified color invariant ensembles. First, a photometrical orthogonal and non-redundant color model set is computed composed of both color variants and invariants. Then, the proposed method combines these color models to arrive at a diversified color ensemble yielding a proper balance between invariance (repeatability) and discriminative power (distinctiveness). To achieve this, our fusion method uses a multi-view approach to minimize the estimation error. In this way, the proposed method is robust to data uncertainty and produces properly diversified color invariant ensembles. Further, the proposed method is extended to deal with temporal data by predicting the evolution of observations over time.
Experiments are conducted on three different image datasets to validate the proposed method. Both the theoretical and experimental results show that the method is robust against severe variations in imaging conditions. The method is not restricted to a certain reflection model or parameter tuning, and outperforms state-of-the-art detection techniques in the field of object, skin and road recognition. Considering sequential data, the proposed method (extended to deal with future observations) outperforms the other methods
Keywords: road detection
|
|
|
Sergio Escalera, Oriol Pujol, Eric Laciar, Jordi Vitria, Esther Pueyo, & Petia Radeva. (2010). Classification of Coronary Damage in Chronic Chagasic Patients. In M. H.(eds) V. Sgurev (Ed.), Intelligent Systems – From Theory to Practice. Studies in Computational Intelligence (Vol. 299, pp. 461–478). Springer-Verlag.
Abstract: Post Conference IEEE-IS 2008
The Chagas’ disease is endemic in all Latin America, affecting millions of people in the continent. In order to diagnose and treat the chagas’ disease, it is important to detect and measure the coronary damage of the patient. In this paper,
we analyze and categorize patients into different groups based on the coronary damage produced by the disease. Based on the features of the heart cycle extracted using high resolution ECG, a multi-class scheme of Error-Correcting Output Codes (ECOC)is formulated and successfully applied. The results show that the proposed scheme obtains significant performance improvements compared to previous works and state-of-the-art ECOC designs.
Keywords: Chagas disease; Error-Correcting Output Codes; High resolution ECG; Decoding
|
|
|
Francesco Ciompi, Oriol Pujol, E Fernandez-Nofrerias, J. Mauri, & Petia Radeva. (2010). Conditional Random Fields for image segmentation in Intravascular Ultrasound. In Medical Image Computing in Catalunya: Graduate Student Workshop (13–14).
Abstract: We present a Conditional Random Fields based approach for segmenting Intravascular Ultrasond (IVUS) images. The presented method uses a contextual discriminative graphical model to deal with the presence of distorsions and artifacts in IVUS images, that turns the segmentation of interesting regions into a difficult task. An accurate lumen segmentation on IVUS longitudinal images is achieved.
|
|
|
Jose Manuel Alvarez. (2010). Combining Context and Appearance for Road Detection (Antonio Lopez, & Theo Gevers, Eds.). Ph.D. thesis, Ediciones Graficas Rey, .
Abstract: Road traffic crashes have become a major cause of death and injury throughout the world.
Hence, in order to improve road safety, the automobile manufacture is moving towards the
development of vehicles with autonomous functionalities such as keeping in the right lane, safe distance keeping between vehicles or regulating the speed of the vehicle according to the traffic conditions. A key component of these systems is vision–based road detection that aims to detect the free road surface ahead the moving vehicle. Detecting the road using a monocular vision system is very challenging since the road is an outdoor scenario imaged from a mobile platform. Hence, the detection algorithm must be able to deal with continuously changing imaging conditions such as the presence ofdifferent objects (vehicles, pedestrians), different environments (urban, highways, off–road), different road types (shape, color), and different imaging conditions (varying illumination, different viewpoints and changing weather conditions). Therefore, in this thesis, we focus on vision–based road detection using a single color camera. More precisely, we first focus on analyzing and grouping pixels according to their low–level properties. In this way, two different approaches are presented to exploit
color and photometric invariance. Then, we focus the research of the thesis on exploiting context information. This information provides relevant knowledge about the road not using pixel features from road regions but semantic information from the analysis of the scene.
In this way, we present two different approaches to infer the geometry of the road ahead
the moving vehicle. Finally, we focus on combining these context and appearance (color)
approaches to improve the overall performance of road detection algorithms. The qualitative and quantitative results presented in this thesis on real–world driving sequences show that the proposed method is robust to varying imaging conditions, road types and scenarios going beyond the state–of–the–art.
|
|
|
Partha Pratim Roy. (2010). Multi-Oriented and Multi-Scaled Text Character Analysis and Recognition in Graphical Documents and their Applications to Document Image Retrieval (Josep Llados, & Umapada Pal, Eds.). Ph.D. thesis, Ediciones Graficas Rey, .
Abstract: With the advent research of Document Image Analysis and Recognition (DIAR), an
important line of research is explored on indexing and retrieval of graphics rich documents. It aims at finding relevant documents relying on segmentation and recognition
of text and graphics components underlying in non-standard layout where commercial
OCRs can not be applied due to complexity. This thesis is focused towards text information extraction approaches in graphical documents and retrieval of such documents
using text information.
Automatic text recognition in graphical documents (map, engineering drawing,
etc.) involves many challenges because text characters are usually printed in multioriented and multi-scale way along with different graphical objects. Text characters
are used to annotate the graphical curve lines and hence, many times they follow
curvi-linear paths too. For OCR of such documents, individual text lines and their
corresponding words/characters need to be extracted.
For recognition of multi-font, multi-scale and multi-oriented characters, we have
proposed a feature descriptor for character shape using angular information from contour pixels to take care of the invariance nature. To improve the efficiency of OCR, an
approach towards the segmentation of multi-oriented touching strings into individual
characters is also discussed. Convex hull based background information is used to
segment a touching string into possible primitive segments and later these primitive
segments are merged to get optimum segmentation using dynamic programming. To
overcome the touching/overlapping problem of text with graphical lines, a character
spotting approach using SIFT and skeleton information is included. Afterwards, we
propose a novel method to extract individual curvi-linear text lines using the foreground and background information of the characters of the text and a water reservoir
concept is used to utilize the background information.
We have also formulated the methodologies for graphical document retrieval applications using query words and seals. The retrieval approaches are performed using
recognition results of individual components in the document. Given a query text,
the system extracts positional knowledge from the query word and uses the same to
generate hypothetical locations in the document. Indexing of documents is also performed based on automatic detection of seals from documents containing cluttered
background. A seal is characterized by scale and rotation invariant spatial feature
descriptors computed from labelled text characters and a concept based on the Generalized Hough Transform is used to locate the seal in documents.
|
|
|
Jose Manuel Alvarez, & Antonio Lopez. (2011). Road Detection Based on Illuminant Invariance. TITS - IEEE Transactions on Intelligent Transportation Systems, 12(1), 184–193.
Abstract: By using an onboard camera, it is possible to detect the free road surface ahead of the ego-vehicle. Road detection is of high relevance for autonomous driving, road departure warning, and supporting driver-assistance systems such as vehicle and pedestrian detection. The key for vision-based road detection is the ability to classify image pixels as belonging or not to the road surface. Identifying road pixels is a major challenge due to the intraclass variability caused by lighting conditions. A particularly difficult scenario appears when the road surface has both shadowed and nonshadowed areas. Accordingly, we propose a novel approach to vision-based road detection that is robust to shadows. The novelty of our approach relies on using a shadow-invariant feature space combined with a model-based classifier. The model is built online to improve the adaptability of the algorithm to the current lighting and the presence of other vehicles in the scene. The proposed algorithm works in still images and does not depend on either road shape or temporal restrictions. Quantitative and qualitative experiments on real-world road sequences with heavy traffic and shadows show that the method is robust to shadows and lighting variations. Moreover, the proposed method provides the highest performance when compared with hue-saturation-intensity (HSI)-based algorithms.
Keywords: road detection
|
|
|
Cesar Isaza, Joaquin Salas, & Bogdan Raducanu. (2010). Toward the Detection of Urban Infrastructures Edge Shadows. In eds. Blanc–Talon et al (Ed.), 12th International Conference on Advanced Concepts for Intelligent Vision Systems (Vol. 6474, 30–37). LNCS. Springer Berlin Heidelberg.
Abstract: In this paper, we propose a novel technique to detect the shadows cast by urban infrastructure, such as buildings, billboards, and traffic signs, using a sequence of images taken from a fixed camera. In our approach, we compute two different background models in parallel: one for the edges and one for the reflected light intensity. An algorithm is proposed to train the system to distinguish between moving edges in general and edges that belong to static objects, creating an edge background model. Then, during operation, a background intensity model allow us to separate between moving and static objects. Those edges included in the moving objects and those that belong to the edge background model are subtracted from the current image edges. The remaining edges are the ones cast by urban infrastructure. Our method is tested on a typical crossroad scene and the results show that the approach is sound and promising.
|
|