|
Lluis Pere de las Heras, Ernest Valveny and Gemma Sanchez. 2014. Unsupervised and Notation-Independent Wall Segmentation in Floor Plans Using a Combination of Statistical and Structural Strategies. Graphics Recognition. Current Trends and Challenges. Springer Berlin Heidelberg, 109–121. (LNCS.)
Abstract: In this paper we present a wall segmentation approach in floor plans that is able to work independently of the graphical notation, does not need any pre-annotated data for learning, and is able to segment multiple-shaped walls such as beams and curved walls. This method results from the combination of the wall segmentation approaches [3, 5] presented recently by the authors. Firstly, potential straight wall segments are extracted in an unsupervised way similar to [3], but restricting even further the wall candidates considered in the original approach. Then, based on [5], these segments are used to learn the texture pattern of walls and spot the lost instances. The presented combination of both methods has been tested on 4 available datasets with different notations and compared qualitatively and quantitatively to the state-of-the-art applied on these collections. Additionally, some qualitative results on floor plans directly downloaded from the Internet are reported in the paper. The overall performance of the method demonstrates its adaptability both to different wall notations and shapes and to document qualities and resolutions.
Keywords: Graphics recognition; Floor plan analysis; Object segmentation
|
|
|
Marçal Rusiñol, Dimosthenis Karatzas and Josep Llados. 2013. Spotting Graphical Symbols in Camera-Acquired Documents in Real Time. 10th IAPR International Workshop on Graphics Recognition.
Abstract: In this paper we present a system devoted to spotting graphical symbols in camera-acquired document images. The system is based on the extraction and further matching of ORB compact local features computed over interest key-points. Then, the FLANN indexing framework based on approximate nearest neighbor search allows efficient matching of local descriptors between the captured scene and the graphical models. Finally, the RANSAC algorithm is used in order to compute the homography between the spotted symbol and its appearance in the document image. The proposed approach is efficient and is able to work in real time.
|
|
|
Marçal Rusiñol, Dimosthenis Karatzas and Josep Llados. 2014. Spotting Graphical Symbols in Camera-Acquired Documents in Real Time. In Bart Lamiroy and Jean-Marc Ogier, eds. Graphics Recognition. Current Trends and Challenges. Springer Berlin Heidelberg, 3–10. (LNCS.)
Abstract: In this paper we present a system devoted to spotting graphical symbols in camera-acquired document images. The system is based on the extraction and further matching of ORB compact local features computed over interest key-points. Then, the FLANN indexing framework based on approximate nearest neighbor search allows efficient matching of local descriptors between the captured scene and the graphical models. Finally, the RANSAC algorithm is used in order to compute the homography between the spotted symbol and its appearance in the document image. The proposed approach is efficient and is able to work in real time.
|
|
|
Lluis Gomez, Marçal Rusiñol and Dimosthenis Karatzas. 2018. Cutting Sayre's Knot: Reading Scene Text without Segmentation. Application to Utility Meters. 13th IAPR International Workshop on Document Analysis Systems. 97–102.
Abstract: In this paper we present a segmentation-free system for reading text in natural scenes. A CNN architecture is trained in an end-to-end manner, and is able to directly output readings without any explicit text localization step. In order to validate our proposal, we focus on the specific case of reading utility meters. We present our results in a large dataset of images acquired by different users and devices, so text appears in any location, with different sizes, fonts and lengths, and the images present several distortions such as dirt, illumination highlights or blur.
Keywords: Robust Reading; End-to-end Systems; CNN; Utility Meters
|
|
|
Hongxing Gao, Marçal Rusiñol, Dimosthenis Karatzas, Apostolos Antonacopoulos and Josep Llados. 2013. An interactive appearance-based document retrieval system for historical newspapers. Proceedings of the International Conference on Computer Vision Theory and Applications. 84–87.
Abstract: In this paper we present a retrieval-based application aimed at assisting a user to semi-automatically segment an incoming flow of historical newspaper images by automatically detecting a particular type of page based on its appearance. A visual descriptor is used to assess page similarity while a relevance feedback process allows the results to be refined iteratively. The application is tested on a large dataset of digitised historic newspapers.
|
|
|
Marçal Rusiñol, Dimosthenis Karatzas, Andrew Bagdanov and Josep Llados. 2012. Multipage Document Retrieval by Textual and Visual Representations. 21st International Conference on Pattern Recognition. 521–524.
Abstract: In this paper we present a multipage administrative document image retrieval system based on textual and visual representations of document pages. Individual pages are represented by textual or visual information using a bag-of-words framework. Different fusion strategies are evaluated which allow the system to perform multipage document retrieval on the basis of a single page retrieval system. Results are reported on a large dataset of document images sampled from a banking workflow.
|
|
|
Francisco Cruz and Oriol Ramos Terrades. 2014. EM-Based Layout Analysis Method for Structured Documents. 22nd International Conference on Pattern Recognition. 315–320.
Abstract: In this paper we present a method to perform layout analysis in structured documents. We propose an EM-based algorithm to fit a set of Gaussian mixtures to the different regions according to the logical distribution along the page. After convergence, we estimate the final shape of the regions according to the parameters computed for each component of the mixture. We evaluated our method on the task of record detection in a collection of historical structured documents and performed a comparison with previous works on this task.
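As a rough illustration of the general EM idea mentioned in this abstract, the following sketch fits a one-dimensional Gaussian mixture with plain expectation-maximization. It is not the authors' algorithm: the function names, the toy data (hypothetical vertical positions of text lines belonging to two records) and the initialization are all assumptions made for the example.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(points, init_means, n_iter=50):
    """Fit a 1-D Gaussian mixture with EM (illustrative sketch only).

    points     -- list of floats, e.g. hypothetical y-coordinates of text lines
    init_means -- starting mean for each component (an assumption of this sketch)
    """
    k = len(init_means)
    means = list(init_means)
    # Start every component with the overall data variance and equal weight.
    overall_mean = sum(points) / len(points)
    overall_var = sum((x - overall_mean) ** 2 for x in points) / len(points)
    variances = [overall_var] * k
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in points:
            likes = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(likes)
            resp.append([like / total for like in likes])

        # M-step: re-estimate weight, mean and variance of each component.
        for c in range(k):
            nk = sum(r[c] for r in resp)
            weights[c] = nk / len(points)
            means[c] = sum(r[c] * x for r, x in zip(resp, points)) / nk
            var = sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, points)) / nk
            variances[c] = max(var, 1e-6)  # floor to avoid a degenerate component

    return weights, means, variances


# Toy data: two well-separated groups of line positions (made up for the example).
lines_y = [10.0, 11.0, 12.0, 13.0, 50.0, 51.0, 52.0, 53.0]
weights, means, variances = fit_gmm_1d(lines_y, init_means=[min(lines_y), max(lines_y)])
```

In this toy setting each fitted component covers one group of lines, and its mean and variance could then be used to delimit the extent of a region, which is the kind of step the abstract describes at a high level.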
|
|
|
Nuria Cirera, Alicia Fornes and Josep Llados. 2015. Hidden Markov model topology optimization for handwriting recognition. 13th International Conference on Document Analysis and Recognition ICDAR2015. 626–630.
Abstract: In this paper we present a method to optimize the topology of linear left-to-right hidden Markov models. These models are very popular for sequential signal modeling on tasks such as handwriting recognition. Many topology definition methods select the number of states for a character model based on character length. This can be a drawback when characters are shorter than the minimum allowed by the model, since they cannot be properly trained or recognized. The proposed method optimizes the number of states per model by automatically including convenient skip-state transitions and therefore it avoids the aforementioned problem. We discuss and compare our method with other character length-based methods such as the Fixed, Bakis and Quantile methods. Our proposal performs well on the off-line handwriting recognition task.
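To make the role of skip-state transitions concrete, here is a small sketch that builds the transition matrix of a linear left-to-right HMM with and without skips, and computes the shortest observation sequence each topology can traverse. It does not reproduce the optimization procedure of the paper: the function names, the uniform placeholder probabilities and the convention of one frame per visited state are assumptions made for illustration.

```python
from collections import deque


def left_to_right_transitions(n_states, skip=False):
    """Row-stochastic transition matrix of a linear left-to-right HMM.

    Each state may loop on itself or move to the next state; with skip=True
    it may also jump over one state. Probabilities are uniform placeholders
    (an assumption); in practice they would be estimated during training.
    """
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        targets = [i, i + 1] + ([i + 2] if skip else [])
        targets = [j for j in targets if j < n_states]
        for j in targets:
            matrix[i][j] = 1.0 / len(targets)
    return matrix


def min_observation_length(matrix):
    """Fewest frames needed to go from the first to the last state.

    Assumes one observation frame is emitted per visited state, so the
    answer is the number of states on the shortest path (breadth-first search).
    """
    n = len(matrix)
    frames = {0: 1}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j in range(n):
            if matrix[i][j] > 0 and j not in frames:
                frames[j] = frames[i] + 1
                queue.append(j)
    return frames[n - 1]


plain = left_to_right_transitions(6)
with_skips = left_to_right_transitions(6, skip=True)
shortest_plain = min_observation_length(plain)
shortest_with_skips = min_observation_length(with_skips)
```

Under these assumptions, the six-state model without skips needs at least six frames, while the variant with skips can be traversed in four, which is why skip transitions help with characters shorter than the nominal model length.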
|
|
|
Albert Gordo, Marçal Rusiñol, Dimosthenis Karatzas and Andrew Bagdanov. 2013. Document Classification and Page Stream Segmentation for Digital Mailroom Applications. 12th International Conference on Document Analysis and Recognition. 621–625.
Abstract: In this paper we present a method for the segmentation of continuous page streams into multipage documents and the simultaneous classification of the resulting documents. We first present an approach to combine the multiple pages of a document into a single feature vector that represents the whole document. Despite its simplicity and low computational cost, the proposed representation yields results comparable to more complex methods in multipage document classification tasks. We then exploit this representation in the context of page stream segmentation. The most plausible segmentation of a page stream into a sequence of multipage documents is obtained by optimizing a statistical model that represents the probability of each segmented multipage document belonging to a particular class. Experimental results are reported on a large sample of real administrative multipage documents.
|
|
|
Marçal Rusiñol, Farshad Nourbakhsh, Dimosthenis Karatzas, Ernest Valveny and Josep Llados. 2010. Perceptual Image Retrieval by Adding Color Information to the Shape Context Descriptor. 20th International Conference on Pattern Recognition. 1594–1597.
Abstract: In this paper we present a method for the retrieval of images in terms of perceptual similarity. Local color information is added to the shape context descriptor in order to obtain an object description integrating both shape and color as visual cues. We use a color naming algorithm in order to represent the color information from a perceptual point of view. The proposed method has been tested in two different applications, an object retrieval scenario based on color sketch queries and a color trademark retrieval problem. Experimental results show that the addition of the color information significantly outperforms the sole use of the shape context descriptor.
|
|