2015 |
|
Hongxing Gao, Marçal Rusiñol, Dimosthenis Karatzas, Josep Llados, R. Jain and D. Doermann. 2015. Novel Line Verification for Multiple Instance Focused Retrieval in Document Collections. 13th International Conference on Document Analysis and Recognition ICDAR2015.481–485.
|
|
|
J. Chazalon, Marçal Rusiñol and Jean-Marc Ogier. 2015. Improving Document Matching Performance by Local Descriptor Filtering. 6th IAPR International Workshop on Camera Based Document Analysis and Recognition CBDAR2015.1216–1220.
Abstract: In this paper we propose an effective method aimed at reducing the amount of local descriptors to be indexed in a document matching framework. In an off-line training stage, the matching between the model document and incoming images is computed retaining the local descriptors from the model that steadily produce good matches. We have evaluated this approach by using the ICDAR2015 SmartDOC dataset containing near 25 000 images from documents to be captured by a mobile device. We have tested the performance of this filtering step by using ORB and SIFT local detectors and descriptors. The results show an important gain both in quality of the final matching as well as in time and space requirements.
|
|
|
J. Chazalon, Marçal Rusiñol, Jean-Marc Ogier and Josep Llados. 2015. A Semi-Automatic Groundtruthing Tool for Mobile-Captured Document Segmentation. 13th International Conference on Document Analysis and Recognition ICDAR2015.621–625.
Abstract: This paper presents a novel way to generate groundtruth data for the evaluation of mobile document capture systems, focusing on the first stage of the image processing pipeline involved: document object detection and segmentation in low-quality preview frames. We introduce and describe a simple, robust and fast technique based on color markers which enables a semi-automated annotation of page corners. We also detail a technique for marker removal. Methods and tools presented in the paper were successfully used to annotate, in few hours, 24889 frames in 150 video files for the smartDOC competition at ICDAR 2015.
|
|
|
J. Kuhn and 10 others. 2015. Advancing Physics Learning Through Traversing a Multi-Modal Experimentation Space. Workshop Proceedings on the 11th International Conference on Intelligent Environments.373–380.
Abstract: Translating conceptual knowledge into real world experiences presents a significant educational challenge. This position paper presents an approach that supports learners in moving seamlessly between conceptual learning and their application in the real world by bringing physical and virtual experiments into everyday settings. Learners are empowered in conducting these situated experiments in a variety of physical settings by leveraging state of the art mobile, augmented reality, and virtual reality technology. A blend of mobile-based multi-sensory physical experiments, augmented reality and enabling virtual environments can allow learners to bridge their conceptual learning with tangible experiences in a completely novel manner. This approach focuses on the learner by applying self-regulated personalised learning techniques, underpinned by innovative pedagogical approaches and adaptation techniques, to ensure that the needs and preferences of each learner are catered for individually.
|
|
|
Jean-Christophe Burie and 9 others. 2015. ICDAR2015 Competition on Smartphone Document Capture and OCR (SmartDoc). 13th International Conference on Document Analysis and Recognition ICDAR2015.1161–1165.
Abstract: Smartphones are enabling new ways of capture, hence arises the need for seamless and reliable acquisition and digitization of documents, in order to convert them to editable, searchable and a more human-readable format. Current state-of-the-art works lack databases and baseline benchmarks for digitizing mobile captured documents. We have organized a competition for mobile document capture and OCR in order to address this issue. The competition is structured into two independent challenges: smartphone document capture, and smartphone OCR. This report describes the datasets for both challenges along with their ground truth, details the performance evaluation protocols which we used, and presents the final results of the participating methods. In total, we received 13 submissions: 8 for challenge-1, and 5 for challenge-2.
|
|
|
Juan Ignacio Toledo, Jordi Cucurull, Jordi Puiggali, Alicia Fornes and Josep Llados. 2015. Document Analysis Techniques for Automatic Electoral Document Processing: A Survey. E-Voting and Identity, Proceedings of 5th international conference, VoteID 2015.139–141. (LNCS.)
Abstract: In this paper, we will discuss the most common challenges in electoral document processing and study the different solutions from the document analysis community that can be applied in each case. We will cover Optical Mark Recognition techniques to detect voter selections in the Australian Ballot, handwritten number recognition for preferential elections and handwriting recognition for write-in areas. We will also propose some particular adjustments that can be made to those general techniques in the specific context of electoral documents.
Keywords: Document image analysis; Computer vision; Paper ballots; Paper based elections; Optical scan; Tally
|
|
|
Lluis Gomez and Dimosthenis Karatzas. 2015. Object Proposals for Text Extraction in the Wild. 13th International Conference on Document Analysis and Recognition ICDAR2015.206–210.
Abstract: Object Proposals is a recent computer vision technique receiving increasing interest from the research community. Its main objective is to generate a relatively small set of bounding box proposals that are most likely to contain objects of interest. The use of Object Proposals techniques in the scene text understanding field is innovative. Motivated by the success of powerful while expensive techniques to recognize words in a holistic way, Object Proposals techniques emerge as an alternative to the traditional text detectors. In this paper we study to what extent the existing generic Object Proposals methods may be useful for scene text understanding. Also, we propose a new Object Proposals algorithm that is specifically designed for text and compare it with other generic methods in the state of the art. Experiments show that our proposal is superior in its ability of producing good quality word proposals in an efficient way. The source code of our method is made publicly available.
|
|
|
Lluis Pere de las Heras, Oriol Ramos Terrades and Josep Llados. 2015. Attributed Graph Grammar for floor plan analysis. 13th International Conference on Document Analysis and Recognition ICDAR2015.726–730.
Abstract: In this paper, we propose the use of an Attributed Graph Grammar as a unique framework to model and recognize the structure of floor plans. This grammar represents a building as a hierarchical composition of structurally and semantically related elements, where common representations are learned stochastically from annotated data. Given an input image, the parsing consists in constructing the graph representation that best agrees with the probabilistic model defined by the grammar. The proposed method provides several advantages with respect to the traditional floor plan analysis techniques. It uses an unsupervised statistical approach for detecting walls that adapts to different graphical notations and relaxes strong structural assumptions such as straightness and orthogonality. Moreover, the independence between the knowledge model and the parsing implementation allows the method to learn automatically different building configurations and thus to cope with the existing variability. These advantages are clearly demonstrated by comparing it with the most recent floor plan interpretation techniques on 4 datasets of real floor plans with different notations.
|
|
|
Lluis Pere de las Heras, Oriol Ramos Terrades, Josep Llados, David Fernandez and Cristina Cañero. 2015. Use case visual Bag-of-Words techniques for camera based identity document classification. 13th International Conference on Document Analysis and Recognition ICDAR2015.721–725.
Abstract: Nowadays, automatic identity document recognition, including passport and driving license recognition, is at the core of many applications within the administrative and service sectors, such as police, hospitality, car renting, etc. In former years, the document information was manually extracted whereas today this data is recognized automatically from images obtained by flat-bed scanners. Yet, since these scanners tend to be expensive and voluminous, companies in the sector have recently turned their attention to cheaper, small and yet computationally powerful scanners: the mobile devices. The document identity recognition from mobile images enclose several new difficulties w.r.t traditional scanned images, such as the loss of a controlled background, perspective, blurring, etc. In this paper we present a real application for identity document classification of images taken from mobile devices. This classification process is of extreme importance since a prior knowledge of the document type and origin strongly facilitates the subsequent information extraction. The proposed method is based on a traditional Bag-of-Words in which we have taken into consideration several key aspects to enhance recognition rate. The method performance has been studied on three datasets containing more than 2000 images from 129 different document classes.
|
|
|
Lluis Pere de las Heras, Oriol Ramos Terrades, Sergi Robles and Gemma Sanchez. 2015. CVC-FP and SGT: a new database for structural floor plan analysis and its groundtruthing tool. IJDAR, 18(1), 15–30.
Abstract: Recent results on structured learning methods have shown the impact of structural information in a wide range of pattern recognition tasks. In the field of document image analysis, there is a long experience on structural methods for the analysis and information extraction of multiple types of documents. Yet, the lack of conveniently annotated and free access databases has not benefited the progress in some areas such as technical drawing understanding. In this paper, we present a floor plan database, named CVC-FP, that is annotated for the architectural objects and their structural relations. To construct this database, we have implemented a groundtruthing tool, the SGT tool, that allows to make specific this sort of information in a natural manner. This tool has been made for general purpose groundtruthing: It allows to define own object classes and properties, multiple labeling options are possible, grants the cooperative work, and provides user and version control. We finally have collected some of the recent work on floor plan interpretation and present a quantitative benchmark for this database. Both CVC-FP database and the SGT tool are freely released to the research community to ease comparisons between methods and boost reproducible research.
|
|