|
Mathieu Nicolas Delalandre, Jean-Yves Ramel, Ernest Valveny, & Muhammad Muzzamil Luqman. (2010). A Performance Characterization Algorithm for Symbol Localization. In Graphics Recognition. Achievements, Challenges, and Evolution. 8th International Workshop, GREC 2009. Selected Papers (Vol. 6020, 260–271). LNCS. Springer Berlin Heidelberg.
Abstract: In this paper we present an algorithm for performance characterization of symbol localization systems. This algorithm is aimed to be a more “reliable” and “open” solution to characterize the performance. To achieve that, it exploits only single points as the result of localization and offers the possibility to reconsider the localization results provided by a system. We use the information about context in groundtruth, and overall localization results, to detect the ambiguous localization results. A probability score is computed for each matching between a localization point and a groundtruth region, depending on the spatial distribution of the other regions in the groundtruth. Final characterization is given with detection rate/probability score plots, describing the sets of possible interpretations of the localization results, according to a given confidence rate. We present experimentation details along with the results for the symbol localization system of [1], exploiting a synthetic dataset of architectural floorplans and electrical diagrams (composed of 200 images and 3861 symbols).
|
|
|
Marçal Rusiñol, K. Bertet, Jean-Marc Ogier, & Josep Llados. (2010). Symbol Recognition Using a Concept Lattice of Graphical Patterns. In Graphics Recognition. Achievements, Challenges, and Evolution. 8th International Workshop, GREC 2009. Selected Papers (Vol. 6020, pp. 187–198). LNCS. Springer Berlin Heidelberg.
Abstract: In this paper we propose a new approach to recognize symbols by the use of a concept lattice. We propose to build a concept lattice in terms of graphical patterns. Each model symbol is decomposed in a set of composing graphical patterns taken as primitives. Each one of these primitives is described by boundary moment invariants. The obtained concept lattice relates which symbolic patterns compose a given graphical symbol. A Hasse diagram is derived from the context and is used to recognize symbols affected by noise. We present some preliminary results over a variation of the dataset of symbols from the GREC 2005 symbol recognition contest.
|
|
|
Partha Pratim Roy, Umapada Pal, & Josep Llados. (2010). Touching Text Character Localization in Graphical Documents using SIFT. In Graphics Recognition. Achievements, Challenges, and Evolution. 8th International Workshop, GREC 2009. Selected Papers (Vol. 6020, pp. 199–211). LNCS. Springer Berlin Heidelberg.
Abstract: Interpretation of graphical document images is a challenging task as it requires proper understanding of text/graphics symbols present in such documents. Difficulties arise in graphical document recognition when text and symbol overlapped/touched. Intersection of text and symbols with graphical lines and curves occur frequently in graphical documents and hence separation of such symbols is very difficult.
Several pattern recognition and classification techniques exist to recognize isolated text/symbol. But, the touching/overlapping text and symbol recognition has not yet been dealt successfully. An interesting technique, Scale Invariant Feature Transform (SIFT), originally devised for object recognition can take care of overlapping problems. Even if SIFT features have emerged as a very powerful object descriptors, their employment in graphical documents context has not been investigated much. In this paper we present the adaptation of the SIFT approach in the context of text character localization (spotting) in graphical documents. We evaluate the applicability of this technique in such documents and discuss the scope of improvement by combining some state-of-the-art approaches.
Keywords: Support Vector Machine; Text Component; Graphical Line; Document Image; Scale Invariant Feature Transform
|
|
|
Bhaskar Chakraborty, Jordi Gonzalez, & Xavier Roca. (2013). Large scale continuous visual event recognition using max-margin Hough transformation framework. CVIU - Computer Vision and Image Understanding, 117(10), 1356–1368.
Abstract: In this paper we propose a novel method for continuous visual event recognition (CVER) on a large scale video dataset using max-margin Hough transformation framework. Due to high scalability, diverse real environmental state and wide scene variability direct application of action recognition/detection methods such as spatio-temporal interest point (STIP)-local feature based technique, on the whole dataset is practically infeasible. To address this problem, we apply a motion region extraction technique which is based on motion segmentation and region clustering to identify possible candidate “event of interest” as a preprocessing step. On these candidate regions a STIP detector is applied and local motion features are computed. For activity representation we use generalized Hough transform framework where each feature point casts a weighted vote for possible activity class centre. A max-margin frame work is applied to learn the feature codebook weight. For activity detection, peaks in the Hough voting space are taken into account and initial event hypothesis is generated using the spatio-temporal information of the participating STIPs. For event recognition a verification Support Vector Machine is used. An extensive evaluation on benchmark large scale video surveillance dataset (VIRAT) and as well on a small scale benchmark dataset (MSR) shows that the proposed method is applicable on a wide range of continuous visual event recognition applications having extremely challenging conditions.
|
|
|
German Ros. (2012). Visual SLAM for Driverless Cars: An Initial Survey (Vol. 170). Master's thesis, , .
|
|
|
Xu Hu. (2012). Real-Time Part Based Models for Object Detection (Vol. 171). Master's thesis, , .
|
|
|
Nuria Cirera. (2012). Recognition of Handwritten Historical Documents (Vol. 174). Master's thesis, , .
|
|
|
Ferran Poveda. (2013). Computer Graphics and Vision Techniques for the Study of the Muscular Fiber Architecture of the Myocardium (Debora Gil, & Enric Marti, Eds.). Ph.D. thesis, , .
|
|
|
T.Chauhan, E.Perales, Kaida Xiao, E.Hird, Dimosthenis Karatzas, & Sophie Wuerger. (2014). The achromatic locus: Effect of navigation direction in color space. VSS - Journal of Vision, 14 (1)(25), 1–11.
Abstract: 5Y Impact Factor: 2.99 / 1st (Ophthalmology)
An achromatic stimulus is defined as a patch of light that is devoid of any hue. This is usually achieved by asking observers to adjust the stimulus such that it looks neither red nor green and at the same time neither yellow nor blue. Despite the theoretical and practical importance of the achromatic locus, little is known about the variability in these settings. The main purpose of the current study was to evaluate whether achromatic settings were dependent on the task of the observers, namely the navigation direction in color space. Observers could either adjust the test patch along the two chromatic axes in the CIE u*v* diagram or, alternatively, navigate along the unique-hue lines. Our main result is that the navigation method affects the reliability of these achromatic settings. Observers are able to make more reliable achromatic settings when adjusting the test patch along the directions defined by the four unique hues as opposed to navigating along the main axes in the commonly used CIE u*v* chromaticity plane. This result holds across different ambient viewing conditions (Dark, Daylight, Cool White Fluorescent) and different test luminance levels (5, 20, and 50 cd/m2). The reduced variability in the achromatic settings is consistent with the idea that internal color representations are more aligned with the unique-hue lines than the u* and v* axes.
Keywords: achromatic; unique hues; color constancy; luminance; color space
|
|
|
Antonio Clavelli, Dimosthenis Karatzas, Josep Llados, Mario Ferraro, & Giuseppe Boccignone. (2014). Modelling task-dependent eye guidance to objects in pictures. CoCom - Cognitive Computation, 6(3), 558–584.
Abstract: 5Y Impact Factor: 1.14 / 3rd (Computer Science, Artificial Intelligence)
We introduce a model of attentional eye guidance based on the rationale that the deployment of gaze is to be considered in the context of a general action-perception loop relying on two strictly intertwined processes: sensory processing, depending on current gaze position, identifies sources of information that are most valuable under the given task; motor processing links such information with the oculomotor act by sampling the next gaze position and thus performing the gaze shift. In such a framework, the choice of where to look next is task-dependent and oriented to classes of objects embedded within pictures of complex scenes. The dependence on task is taken into account by exploiting the value and the payoff of gazing at certain image patches or proto-objects that provide a sparse representation of the scene objects. The different levels of the action-perception loop are represented in probabilistic form and eventually give rise to a stochastic process that generates the gaze sequence. This way the model also accounts for statistical properties of gaze shifts such as individual scan path variability. Results of the simulations are compared either with experimental data derived from publicly available datasets and from our own experiments.
Keywords: Visual attention; Gaze guidance; Value; Payoff; Stochastic fixation prediction
|
|
|
Fernando Vilariño, Stephan Ameling, Gerard Lacey, Stephen Patchett, & Hugh Mulcahy. (2009). Eye Tracking Search Patterns in Expert and Trainee Colonoscopists: A Novel Method of Assessing Endoscopic Competency? GI - Gastrointestinal Endoscopy, 69(5), 370.
|
|
|
Miquel Ferrer, I. Bardaji, Ernest Valveny, Dimosthenis Karatzas, & Horst Bunke. (2013). Median Graph Computation by Means of Graph Embedding into Vector Spaces. In Yun Fu, & Yungian Ma (Eds.), Graph Embedding for Pattern Analysis (pp. 45–72). Springer New York.
Abstract: In pattern recognition [8, 14], a key issue to be addressed when designing a system is how to represent input patterns. Feature vectors is a common option. That is, a set of numerical features describing relevant properties of the pattern are computed and arranged in a vector form. The main advantages of this kind of representation are computational simplicity and a well sound mathematical foundation. Thus, a large number of operations are available to work with vectors and a large repository of algorithms for pattern analysis and classification exist. However, the simple structure of feature vectors might not be the best option for complex patterns where nonnumerical features or relations between different parts of the pattern become relevant.
|
|
|
Rozenn Dhayot, Fernando Vilariño, & Gerard Lacey. (2008). Improving the Quality of Color Colonoscopy Videos. EURASIP JIVP - EURASIP Journal on Image and Video Processing, 139429(1), 1–9.
|
|
|
Mirko Arnold, Anarta Ghosh, Stephen Ameling, & G Lacey. (2010). Automatic segmentation and inpainting of specular highlights for endoscopic imaging. EURASIP JIVP - EURASIP Journal on Image and Video Processing, 2010(9).
|
|
|
Mirko Arnold, Anarta Ghosh, Gerard Lacey, Stephen Patchett, & Hugh Mulcahy. (2009). Indistinct frame detection in colonoscopy videos. In Machine Vision and Image Processing Conference (pp. 47–52).
|
|