|
Lluis Pere de las Heras and Gemma Sanchez. 2011. And-Or Graph Grammar for Architectural Floorplan Representation, Learning and Recognition. A Semantic, Structural and Hierarchical Model. 5th Iberian Conference on Pattern Recognition and Image Analysis. 17–24.
Abstract: This paper presents a syntactic model for architectural floor plan interpretation. A stochastic image grammar over an And-Or graph is inferred to represent the hierarchical, structural and semantic relations between elements of all possible floor plans. This grammar is augmented with three different probabilistic models, learnt from a training set, to account for the frequency of those relations. A Bottom-Up/Top-Down parser with a pruning strategy is then used for floor plan recognition. For a given input, the parser generates the most probable parse graph for that document. This graph contains not only the structural and semantic relations of its elements but also its hierarchical composition, which allows the floor plan to be interpreted at different levels of abstraction.
|
|
|
Kaida Xiao, Chenyang Fu, Dimosthenis Karatzas and Sophie Wuerger. 2011. Visual Gamma Correction for LCD Displays. DIS, 32(1), 17–23.
Abstract: An improved method for visual gamma correction is developed for LCD displays to increase the accuracy of digital colour reproduction. Rather than utilising a photometric measurement device, we use observers’ visual luminance judgements for gamma correction. Eight half-tone patterns were designed to generate relative luminances from 1/9 to 8/9 for each colour channel. A psychophysical experiment was conducted on an LCD display to find the digital signals corresponding to each relative luminance by visually matching the half-tone background to a uniform colour patch. Both inter- and intra-observer variability for the eight luminance matches in each channel were assessed and the luminance matches proved to be consistent across observers (ΔE00 < 3.5) and repeatable (ΔE00 < 2.2). Based on the individual observer judgements, the display opto-electronic transfer function (OETF) was estimated by using either a 3rd order polynomial regression or linear interpolation for each colour channel. The performance of the proposed method is evaluated by predicting the CIE tristimulus values of a set of coloured patches (using the observer-based OETFs) and comparing them to the expected CIE tristimulus values (using the OETF obtained from spectro-radiometric luminance measurements). The resulting colour differences range from 2 to 4.6 ΔE00. We conclude that this observer-based method of visual gamma correction is useful to estimate the OETF for LCD displays. Its major advantage is that no particular functional relationship between digital inputs and luminance outputs has to be assumed.
Keywords: Display calibration; Psychophysics; Perceptual; Visual gamma correction; Luminance matching; Observer-based calibration
|
|
|
Ernest Valveny, Oriol Ramos Terrades, Joan Mas and Marçal Rusiñol. 2013. Interactive Document Retrieval and Classification. In Angel Sappa and Jordi Vitria, eds. Multimodal Interaction in Image and Video Applications. Springer Berlin Heidelberg, 17–30.
Abstract: In this chapter we describe a system for document retrieval and classification following the interactive-predictive framework. In particular, the system addresses two different scenarios of document analysis: document classification based on visual appearance and logo detection. These two classical problems of document analysis are formulated following the interactive-predictive model, taking user interaction into account to ease the process of annotating and labelling the documents. A system implementing this model in a real scenario is presented and analyzed. This system also takes advantage of active learning techniques to speed up the task of labelling the documents.
|
|
|
Oriol Ramos Terrades, Ernest Valveny and Salvatore Tabbone. 2007. On the Combination of Ridgelets Descriptors for Symbol Recognition. Seventh IAPR International Workshop on Graphics Recognition. 18–20.
|
|
|
Antonio Clavelli, Dimosthenis Karatzas and Josep Llados. 2010. A framework for the assessment of text extraction algorithms on complex colour images. 9th IAPR International Workshop on Document Analysis Systems. 19–26.
Abstract: The availability of open, ground-truthed datasets and clear performance metrics is a crucial factor in the development of an application domain. The domain of colour text image analysis (real scenes, Web and spam images, scanned colour documents) has traditionally suffered from the lack of a comprehensive performance evaluation framework. Such a framework is extremely difficult to specify, and the corresponding pixel-level accurate information is tedious to define. In this paper we discuss the challenges and technical issues associated with developing such a framework. Then, we describe a complete framework for the evaluation of text extraction methods at multiple levels, provide a detailed ground-truth specification and present a case study on how this framework can be used in a real-life situation.
|
|
|
Alicia Fornes, Josep Llados, Oriol Ramos Terrades and Marçal Rusiñol. 2016. La Visió per Computador com a Eina per a la Interpretació Automàtica de Fonts Documentals.
|
|
|
Ariel Amato, Angel Sappa, Alicia Fornes, Felipe Lumbreras and Josep Llados. 2013. Divide and Conquer: Atomizing and Parallelizing A Task in A Mobile Crowdsourcing Platform. 2nd International ACM Workshop on Crowdsourcing for Multimedia. 21–22.
Abstract: In this paper we present some conclusions about the advantages of an efficient task formulation when a crowdsourcing platform is used. In particular, we show how task atomization and distribution can help to obtain results efficiently. Our proposal is based on a recursive splitting of the original task into a set of smaller and simpler tasks. As a result, both more accurate and faster solutions are obtained. Our evaluation is performed on a set of ancient documents that need to be digitized.
|
|
|
Jaume Gibert, Ernest Valveny and Horst Bunke. 2011. Dimensionality Reduction for Graph of Words Embedding. In Xiaoyi Jiang, Miquel Ferrer and Andrea Torsello, eds. 8th IAPR-TC-15 International Workshop. Graph-Based Representations in Pattern Recognition. 22–31. (LNCS.)
Abstract: The Graph of Words Embedding consists in mapping every graph of a given dataset to a feature vector by counting unary and binary relations between node attributes of the graph. While it shows good properties in classification problems, it suffers from high dimensionality and sparsity. These two issues are addressed in this article. Two well-known techniques for dimensionality reduction, kernel principal component analysis (kPCA) and independent component analysis (ICA), are applied to the embedded graphs. We discuss their performance compared to the classification of the original vectors on three different public databases of graphs.
|
|
|
Kaida Xiao, Chenyang Fu, D. Mylonas, Dimosthenis Karatzas and S. Wuerger. 2013. Unique Hue Data for Colour Appearance Models. Part II: Chromatic Adaptation Transform. CRA, 38(1), 22–29.
Abstract: Unique hue settings of 185 observers under three room-lighting conditions were used to evaluate the accuracy of full and mixed chromatic adaptation transform models of CIECAM02 in terms of unique hue reproduction. Perceptual hue shifts in CIECAM02 were evaluated for both models with no clear difference using the current Commission Internationale de l'Éclairage (CIE) recommendation for mixed chromatic adaptation ratio. Using our large dataset of unique hue data as a benchmark, an optimised parameter is proposed for chromatic adaptation under mixed illumination conditions that produces more accurate results in unique hue reproduction.
|
|
|
Partha Pratim Roy, Umapada Pal and Josep Llados. 2010. Seal Object Detection in Document Images using GHT of Local Component Shapes. 10th ACM Symposium On Applied Computing. 23–27.
Abstract: Seal (stamp) object detection is a difficult challenge because of noise, overlapping text and signatures, and the multi-oriented nature of seals. This paper deals with the automatic detection of seals in documents with cluttered backgrounds. Here, a seal object is characterized by scale- and rotation-invariant spatial feature descriptors (distance and angular position) computed from the recognition results of individual connected components (characters). Recognition of multi-scale and multi-oriented components is done using a Support Vector Machine classifier. The Generalized Hough Transform (GHT) is used to detect the seal: votes are cast for possible locations of the seal object in a document based on the spatial feature descriptors of component pairs. The peak of votes in the GHT accumulator validates the hypothesis and locates the seal object in the document. Experimental results show that the method is efficient at locating seal instances of arbitrary shape and orientation in documents.
|
|