|
Miquel Ferrer, Ernest Valveny, F. Serratosa and Horst Bunke. 2008. Exact Median Graph Computation via Graph Embedding. 12th International Workshop on Structural and Syntactic Pattern Recognition. 15–24. (LNCS.)
|
|
|
Lluis Pere de las Heras, Oriol Ramos Terrades, Sergi Robles and Gemma Sanchez. 2015. CVC-FP and SGT: a new database for structural floor plan analysis and its groundtruthing tool. IJDAR, 18(1), 15–30.
Abstract: Recent results on structured learning methods have shown the impact of structural information in a wide range of pattern recognition tasks. In the field of document image analysis, there is long experience with structural methods for the analysis and information extraction of multiple types of documents. Yet the lack of conveniently annotated, freely accessible databases has hindered progress in some areas, such as technical drawing understanding. In this paper, we present a floor plan database, named CVC-FP, that is annotated for the architectural objects and their structural relations. To construct this database, we have implemented a groundtruthing tool, the SGT tool, that allows this sort of information to be specified in a natural manner. This tool has been made for general-purpose groundtruthing: it allows users to define their own object classes and properties, offers multiple labeling options, supports cooperative work, and provides user and version control. Finally, we have collected some of the recent work on floor plan interpretation and present a quantitative benchmark for this database. Both the CVC-FP database and the SGT tool are freely released to the research community to ease comparisons between methods and boost reproducible research.
|
|
|
Lluis Pere de las Heras and Gemma Sanchez. 2011. And-Or Graph Grammar for Architectural Floorplan Representation, Learning and Recognition. A Semantic, Structural and Hierarchical Model. 5th Iberian Conference on Pattern Recognition and Image Analysis. 17–24.
Abstract: This paper presents a syntactic model for architectural floor plan interpretation. A stochastic image grammar over an And-Or graph is inferred to represent the hierarchical, structural and semantic relations between elements of all possible floor plans. This grammar is augmented with three different probabilistic models, learnt from a training set, to account for the frequency of those relations. Then, a Bottom-Up/Top-Down parser with a pruning strategy is used for floor plan recognition. For a given input, the parser generates the most probable parse graph for that document. This graph contains not only the structural and semantic relations of its elements but also its hierarchical composition, which allows the floor plan to be interpreted at different levels of abstraction.
|
|
|
Kaida Xiao, Chenyang Fu, Dimosthenis Karatzas and Sophie Wuerger. 2011. Visual Gamma Correction for LCD Displays. DIS, 32(1), 17–23.
Abstract: An improved method for visual gamma correction is developed for LCD displays to increase the accuracy of digital colour reproduction. Rather than utilising a photometric measurement device, we use observers’ visual luminance judgements for gamma correction. Eight half-tone patterns were designed to generate relative luminances from 1/9 to 8/9 for each colour channel. A psychophysical experiment was conducted on an LCD display to find the digital signals corresponding to each relative luminance by visually matching the half-tone background to a uniform colour patch. Both inter- and intra-observer variability for the eight luminance matches in each channel were assessed and the luminance matches proved to be consistent across observers (DE00 < 3.5) and repeatable (DE00 < 2.2). Based on the individual observer judgements, the display opto-electronic transfer function (OETF) was estimated by using either a 3rd order polynomial regression or linear interpolation for each colour channel. The performance of the proposed method is evaluated by predicting the CIE tristimulus values of a set of coloured patches (using the observer-based OETFs) and comparing them to the expected CIE tristimulus values (using the OETF obtained from spectro-radiometric luminance measurements). The resulting colour differences range from 2 to 4.6 DE00. We conclude that this observer-based method of visual gamma correction is useful to estimate the OETF for LCD displays. Its major advantage is that no particular functional relationship between digital inputs and luminance outputs has to be assumed.
Keywords: Display calibration; Psychophysics; Perceptual; Visual gamma correction; Luminance matching; Observer-based calibration
|
|
|
Ernest Valveny, Oriol Ramos Terrades, Joan Mas and Marçal Rusiñol. 2013. Interactive Document Retrieval and Classification. In Angel Sappa and Jordi Vitria, eds. Multimodal Interaction in Image and Video Applications. Springer Berlin Heidelberg, 17–30.
Abstract: In this chapter we describe a system for document retrieval and classification following the interactive-predictive framework. In particular, the system addresses two different scenarios of document analysis: document classification based on visual appearance and logo detection. These two classical problems of document analysis are formulated following the interactive-predictive model, taking the user interaction into account to make the process of annotating and labelling the documents easier. A system implementing this model in a real scenario is presented and analyzed. This system also takes advantage of active learning techniques to speed up the task of labelling the documents.
|
|
|
Oriol Ramos Terrades, Ernest Valveny and Salvatore Tabbone. 2007. On the Combination of Ridgelets Descriptors for Symbol Recognition. Seventh IAPR International Workshop on Graphics Recognition. 18–20.
|
|
|
Antonio Clavelli, Dimosthenis Karatzas and Josep Llados. 2010. A framework for the assessment of text extraction algorithms on complex colour images. 9th IAPR International Workshop on Document Analysis Systems. 19–26.
Abstract: The availability of open, ground-truthed datasets and clear performance metrics is a crucial factor in the development of an application domain. The domain of colour text image analysis (real scenes, Web and spam images, scanned colour documents) has traditionally suffered from a lack of a comprehensive performance evaluation framework. Such a framework is extremely difficult to specify, and corresponding pixel-level accurate information tedious to define. In this paper we discuss the challenges and technical issues associated with developing such a framework. Then, we describe a complete framework for the evaluation of text extraction methods at multiple levels, provide a detailed ground-truth specification and present a case study on how this framework can be used in a real-life situation.
|
|
|
Alicia Fornes, Josep Llados, Oriol Ramos Terrades and Marçal Rusiñol. 2016. La Visió per Computador com a Eina per a la Interpretació Automàtica de Fonts Documentals.
|
|
|
Ariel Amato, Angel Sappa, Alicia Fornes, Felipe Lumbreras and Josep Llados. 2013. Divide and Conquer: Atomizing and Parallelizing A Task in A Mobile Crowdsourcing Platform. 2nd International ACM Workshop on Crowdsourcing for Multimedia. 21–22.
Abstract: In this paper we present some conclusions about the advantages of having an efficient task formulation when a crowdsourcing platform is used. In particular we show how the task atomization and distribution can help to obtain results in an efficient way. Our proposal is based on a recursive splitting of the original task into a set of smaller and simpler tasks. As a result both more accurate and faster solutions are obtained. Our evaluation is performed on a set of ancient documents that need to be digitized.
|
|
|
Jaume Gibert, Ernest Valveny and Horst Bunke. 2011. Dimensionality Reduction for Graph of Words Embedding. In Xiaoyi Jiang, Miquel Ferrer and Andrea Torsello, eds. 8th IAPR-TC-15 International Workshop. Graph-Based Representations in Pattern Recognition. 22–31. (LNCS.)
Abstract: The Graph of Words Embedding consists in mapping every graph of a given dataset to a feature vector by counting unary and binary relations between node attributes of the graph. While it shows good properties in classification problems, it suffers from high dimensionality and sparsity. These two issues are addressed in this article. Two well-known techniques for dimensionality reduction, kernel principal component analysis (kPCA) and independent component analysis (ICA), are applied to the embedded graphs. We discuss their performance compared to the classification of the original vectors on three different public databases of graphs.
|
|