Records | |||||
---|---|---|---|---|---|
Author | Patricia Marquez; Debora Gil; Aura Hernandez-Sabate | ||||
Title | Evaluation of the Capabilities of Confidence Measures for Assessing Optical Flow Quality | Type | Conference Article | ||
Year | 2013 | Publication | ICCV Workshop on Computer Vision in Vehicle Technology: From Earth to Mars | Abbreviated Journal | |
Volume | Issue | Pages | 624-631 | ||
Keywords | |||||
Abstract | Assessing Optical Flow (OF) quality is essential for its further use in reliable decision support systems. The absence of ground truth in such situations leads to the computation of OF Confidence Measures (CM) obtained from either input or output data. A fair comparison across the capabilities of the different CM for bounding OF error is required in order to choose the best OF-CM pair for discarding points where OF computation is not reliable. This paper presents a statistical probabilistic framework for assessing the quality of a given CM. Our quality measure is given in terms of the percentage of pixels whose OF error bound cannot be determined by CM values. We also provide statistical tools for the computation of CM values that ensure a given accuracy of the flow field. | ||||
Address | Sydney; Australia; December 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVTT:E2M | ||
Notes | IAM; ADAS; 600.044; 600.057; 601.145 | Approved | no | ||
Call Number | Admin @ si @ MGH2013b | Serial | 2351 | ||
Permanent link to this record | |||||
Author | Javier Marin; David Vazquez; Antonio Lopez; Jaume Amores; Bastian Leibe | ||||
Title | Random Forests of Local Experts for Pedestrian Detection | Type | Conference Article | ||
Year | 2013 | Publication | 15th IEEE International Conference on Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 2592 - 2599 | ||
Keywords | ADAS; Random Forest; Pedestrian Detection | ||||
Abstract | Pedestrian detection is one of the most challenging tasks in computer vision, and has received a lot of attention in recent years. Recently, some authors have shown the advantages of using combinations of part/patch-based detectors in order to cope with the large variability of poses and the existence of partial occlusions. In this paper, we propose a pedestrian detection method that efficiently combines multiple local experts by means of a Random Forest ensemble. The proposed method works with rich block-based representations such as HOG and LBP, in such a way that the same features are reused by the multiple local experts, so that no extra computational cost is needed with respect to a holistic method. Furthermore, we demonstrate how to integrate the proposed approach with a cascaded architecture in order to achieve not only high accuracy but also acceptable efficiency. In particular, the resulting detector operates at five frames per second on a laptop. We tested the proposed method on well-known challenging datasets such as Caltech, ETH, Daimler, and INRIA. The method proposed in this work consistently ranks among the top performers on all the datasets, being either the best method or within a small margin of the best one. | ||||
Address | Sydney; Australia; December 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | IEEE | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1550-5499 | ISBN | Medium | ||
Area | Expedition | Conference | ICCV | ||
Notes | ADAS; 600.057; 600.054 | Approved | no | ||
Call Number | ADAS @ adas @ MVL2013 | Serial | 2333 | ||
Permanent link to this record | |||||
Author | Jon Almazan; Albert Gordo; Alicia Fornes; Ernest Valveny | ||||
Title | Handwritten Word Spotting with Corrected Attributes | Type | Conference Article | ||
Year | 2013 | Publication | 15th IEEE International Conference on Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 1017-1024 | ||
Keywords | |||||
Abstract | We propose an approach to multi-writer word spotting, where the goal is to find a query word in a dataset comprised of document images. We propose an attributes-based approach that leads to a low-dimensional, fixed-length representation of the word images that is fast to compute and, especially, fast to compare. This approach naturally leads to a unified representation of word images and strings, which seamlessly allows one to perform either query-by-example, where the query is an image, or query-by-string, where the query is a string. We also propose a calibration scheme, based on Canonical Correlation Analysis, to correct the attribute scores, which greatly improves the results on a challenging dataset. We test our approach on two public datasets, showing state-of-the-art results. | ||||
Address | Sydney; Australia; December 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1550-5499 | ISBN | Medium | ||
Area | Expedition | Conference | ICCV | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ AGF2013 | Serial | 2327 | ||
Permanent link to this record | |||||
Author | Gemma Roig; Xavier Boix; R. de Nijs; Sebastian Ramos; K. Kühnlenz; Luc Van Gool | ||||
Title | Active MAP Inference in CRFs for Efficient Semantic Segmentation | Type | Conference Article | ||
Year | 2013 | Publication | 15th IEEE International Conference on Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 2312 - 2319 | ||
Keywords | Semantic Segmentation | ||||
Abstract | Most MAP inference algorithms for CRFs optimize an energy function knowing all the potentials. In this paper, we focus on CRFs where the computational cost of instantiating the potentials is orders of magnitude higher than MAP inference. This is often the case in semantic image segmentation, where most potentials are instantiated by slow classifiers fed with costly features. We introduce Active MAP inference 1) to select on the fly a subset of potentials to be instantiated in the energy function, leaving the rest of the parameters of the potentials unknown, and 2) to estimate the MAP labeling from such an incomplete energy function. Results for semantic segmentation benchmarks, namely PASCAL VOC 2010 [5] and MSRC-21 [19], show that Active MAP inference achieves similar levels of accuracy but with major efficiency gains. | ||||
Address | Sydney; Australia; December 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1550-5499 | ISBN | Medium | ||
Area | Expedition | Conference | ICCV | ||
Notes | ADAS; 600.057 | Approved | no | ||
Call Number | ADAS @ adas @ RBN2013 | Serial | 2377 | ||
Permanent link to this record | |||||
Author | Raul Gomez; Ali Furkan Biten; Lluis Gomez; Jaume Gibert; Marçal Rusiñol; Dimosthenis Karatzas | ||||
Title | Selective Style Transfer for Text | Type | Conference Article | ||
Year | 2019 | Publication | 15th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 805-812 | ||
Keywords | transfer; text style transfer; data augmentation; scene text detection | ||||
Abstract | This paper explores the possibilities of image style transfer applied to text, maintaining the original transcriptions. Results on different text domains (scene text, machine-printed text and handwritten text) and cross-modal results demonstrate that this is feasible and open up different research lines. Furthermore, two architectures for selective style transfer, i.e. transferring style only to desired image pixels, are proposed. Finally, scene text selective style transfer is evaluated as a data augmentation technique to expand scene text detection datasets, resulting in a boost in text detector performance. Our implementation of the described models is publicly available. | ||||
Address | Sydney; Australia; September 2019 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.129; 600.135; 601.338; 601.310; 600.121 | Approved | no | ||
Call Number | GBG2019 | Serial | 3265 | ||
Permanent link to this record | |||||
Author | Ali Furkan Biten; R. Tito; Andres Mafla; Lluis Gomez; Marçal Rusiñol; M. Mathew; C.V. Jawahar; Ernest Valveny; Dimosthenis Karatzas | ||||
Title | ICDAR 2019 Competition on Scene Text Visual Question Answering | Type | Conference Article | ||
Year | 2019 | Publication | 3rd Workshop on Closing the Loop Between Vision and Language, in conjunction with ICCV2019 | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | This paper presents the final results of the ICDAR 2019 Scene Text Visual Question Answering competition (ST-VQA). ST-VQA introduces an important aspect that is not addressed by any Visual Question Answering system to date, namely the incorporation of scene text to answer questions asked about an image. The competition introduces a new dataset comprising 23,038 images annotated with 31,791 question/answer pairs, where the answer is always grounded on text instances present in the image. The images are taken from 7 different public computer vision datasets, covering a wide range of scenarios. The competition was structured in three tasks of increasing difficulty, which require reading the text in a scene and understanding it in the context of the scene to correctly answer a given question. A novel evaluation metric is presented, which elegantly assesses both key capabilities expected from an optimal model: text recognition and image understanding. A detailed analysis of results from different participants is showcased, which provides insight into the current capabilities of VQA systems that can read. We firmly believe the dataset proposed in this challenge will be an important milestone on the path towards more robust and general models that can exploit scene text to achieve holistic image understanding. | ||||
Address | Seoul; Korea; October 2019 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CLVL | ||
Notes | DAG; 600.129; 601.338; 600.135; 600.121 | Approved | no | ||
Call Number | Admin @ si @ BTM2019a | Serial | 3284 | ||
Permanent link to this record | |||||
Author | Ali Furkan Biten; R. Tito; Andres Mafla; Lluis Gomez; Marçal Rusiñol; M. Mathew; C.V. Jawahar; Ernest Valveny; Dimosthenis Karatzas | ||||
Title | ICDAR 2019 Competition on Scene Text Visual Question Answering | Type | Conference Article | ||
Year | 2019 | Publication | 15th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 1563-1570 | ||
Keywords | |||||
Abstract | This paper presents the final results of the ICDAR 2019 Scene Text Visual Question Answering competition (ST-VQA). ST-VQA introduces an important aspect that is not addressed by any Visual Question Answering system to date, namely the incorporation of scene text to answer questions asked about an image. The competition introduces a new dataset comprising 23,038 images annotated with 31,791 question/answer pairs, where the answer is always grounded on text instances present in the image. The images are taken from 7 different public computer vision datasets, covering a wide range of scenarios. The competition was structured in three tasks of increasing difficulty, which require reading the text in a scene and understanding it in the context of the scene to correctly answer a given question. A novel evaluation metric is presented, which elegantly assesses both key capabilities expected from an optimal model: text recognition and image understanding. A detailed analysis of results from different participants is showcased, which provides insight into the current capabilities of VQA systems that can read. We firmly believe the dataset proposed in this challenge will be an important milestone on the path towards more robust and general models that can exploit scene text to achieve holistic image understanding. | ||||
Address | Sydney; Australia; September 2019 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.129; 601.338; 600.121 | Approved | no | ||
Call Number | Admin @ si @ BTM2019c | Serial | 3286 | ||
Permanent link to this record | |||||
Author | Rui Zhang; Yongsheng Zhou; Qianyi Jiang; Qi Song; Nan Li; Kai Zhou; Lei Wang; Dong Wang; Minghui Liao; Mingkun Yang; Xiang Bai; Baoguang Shi; Dimosthenis Karatzas; Shijian Lu; CV Jawahar | ||||
Title | ICDAR 2019 Robust Reading Challenge on Reading Chinese Text on Signboard | Type | Conference Article | ||
Year | 2019 | Publication | 15th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 1577-1581 | ||
Keywords | |||||
Abstract | Chinese scene text reading is one of the most challenging problems in computer vision and has attracted great interest. Different from English text, Chinese has more than 6000 commonly used characters, and Chinese characters can be arranged in various layouts with numerous fonts. The Chinese signboards in street view are a good choice for Chinese scene text images since they have different backgrounds, fonts and layouts. We organized a competition called ICDAR2019-ReCTS, which mainly focuses on reading Chinese text on signboards. This report presents the final results of the competition. A large-scale dataset of 25,000 annotated signboard images, in which all the text lines and characters are annotated with locations and transcriptions, was released. Four tasks, namely character recognition, text line recognition, text line detection and end-to-end recognition, were set up. Besides, considering the Chinese text ambiguity issue, we proposed a multi ground truth (multi-GT) evaluation method to make evaluation fairer. The competition started on March 1, 2019 and ended on April 30, 2019. 262 submissions from 46 teams were received. Most of the participants came from universities, research institutes, and tech companies in China. There were also some participants from the United States, Australia, Singapore, and Korea. 21 teams submitted results for Task 1, 23 teams for Task 2, 24 teams for Task 3, and 13 teams for Task 4. | ||||
Address | Sydney; Australia; September 2019 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.129; 600.121 | Approved | no | ||
Call Number | Admin @ si @ LZZ2019 | Serial | 3335 | ||
Permanent link to this record | |||||
Author | Helena Muñoz; Fernando Vilariño; Dimosthenis Karatzas | ||||
Title | Eye-Movements During Information Extraction from Administrative Documents | Type | Conference Article | ||
Year | 2019 | Publication | International Conference on Document Analysis and Recognition Workshops | Abbreviated Journal | |
Volume | Issue | Pages | 6-9 | ||
Keywords | |||||
Abstract | A key aspect of digital mailroom processes is the extraction of relevant information from administrative documents. More often than not, the extraction process cannot be fully automated, and there is instead an important amount of manual intervention. In this work we study the human process of information extraction from invoice document images. We explore whether the gaze of human annotators during a manual information extraction process could be exploited towards reducing the manual effort and automating the process. To this end, we perform an eye-tracking experiment replicating real-life interfaces for information extraction. Through this pilot study we demonstrate that relevant areas in the document can be identified reliably through automatic fixation classification, and the obtained models generalize well to new subjects. Our findings indicate that it is in principle possible to integrate the human in the document image analysis loop, making use of the scanpath to automate the extraction process or verify extracted information. | ||||
Address | Sydney; Australia; September 2019 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDARW | ||
Notes | DAG; 600.140; 600.121; 600.129;SIAI | Approved | no | ||
Call Number | Admin @ si @ MVK2019 | Serial | 3336 | ||
Permanent link to this record | |||||
Author | Mohammed Al Rawi; Ernest Valveny; Dimosthenis Karatzas | ||||
Title | Can One Deep Learning Model Learn Script-Independent Multilingual Word-Spotting? | Type | Conference Article | ||
Year | 2019 | Publication | 15th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 260-267 | ||
Keywords | |||||
Abstract | Word spotting has gained increased attention lately as it can be used to extract textual information from handwritten documents and scene-text images. Current word spotting approaches are designed to work on a single language and/or script. Building intelligent models that learn script-independent multilingual word spotting is challenging due to the large variability of multilingual alphabets and symbols. We used ResNet-152 and the Pyramidal Histogram of Characters (PHOC) embedding to build a single-model, script-independent multilingual word-spotting system, and tested it on the Latin, Arabic, and Bangla (Indian) languages. The single model we propose performs on par with multi-model, language-specific word-spotting systems, and thus reduces the number of models needed for each script and/or language. | ||||
Address | Sydney; Australia; September 2019 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.129; 600.121 | Approved | no | ||
Call Number | Admin @ si @ RVK2019 | Serial | 3337 | ||
Permanent link to this record | |||||
Author | Zheng Huang; Kai Chen; Jianhua He; Xiang Bai; Dimosthenis Karatzas; Shijian Lu; CV Jawahar | ||||
Title | ICDAR2019 Competition on Scanned Receipt OCR and Information Extraction | Type | Conference Article | ||
Year | 2019 | Publication | 15th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 1516-1520 | ||
Keywords | |||||
Abstract | The ICDAR 2019 Challenge on “Scanned receipts OCR and key information extraction” (SROIE) covers important aspects related to the automated analysis of scanned receipts. The SROIE tasks play a key role in many document analysis systems and hold significant commercial potential. Although a lot of work has been published over the years on administrative document analysis, the community has advanced relatively slowly, as most datasets have been kept private. One of the key contributions of SROIE to the document analysis community is to offer a first, standardized dataset of 1000 whole scanned receipt images and annotations, as well as an evaluation procedure for such tasks. The Challenge is structured around three tasks, namely Scanned Receipt Text Localization (Task 1), Scanned Receipt OCR (Task 2) and Key Information Extraction from Scanned Receipts (Task 3). The competition opened on 10th February, 2019 and closed on 5th May, 2019. We received 29, 24 and 18 valid submissions for the three competition tasks, respectively. This report presents the competition datasets, defines the tasks and the evaluation protocols, and offers detailed submission statistics as well as an analysis of the submitted performance. While the tasks of text localization and recognition seem to be relatively easy to tackle, it is interesting to observe the variety of ideas and approaches proposed for the information extraction task. According to the submissions' performance, we believe there is still margin for improving information extraction performance, although the current dataset would have to grow substantially in following editions. Given the success of the SROIE competition, evidenced by the wide interest generated and the healthy number of submissions from academia, research institutes and industry across different countries, we consider that the SROIE competition can evolve into a useful resource for the community, drawing further attention and promoting research and development efforts in this field. | ||||
Address | Sydney; Australia; September 2019 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.129 | Approved | no | ||
Call Number | Admin @ si @ HCH2019 | Serial | 3338 | ||
Permanent link to this record | |||||
Author | Yipeng Sun; Zihan Ni; Chee-Kheng Chng; Yuliang Liu; Canjie Luo; Chun Chet Ng; Junyu Han; Errui Ding; Jingtuo Liu; Dimosthenis Karatzas; Chee Seng Chan; Lianwen Jin | ||||
Title | ICDAR 2019 Competition on Large-Scale Street View Text with Partial Labeling – RRC-LSVT | Type | Conference Article | ||
Year | 2019 | Publication | 15th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 1557-1562 | ||
Keywords | |||||
Abstract | Robust text reading from street view images provides valuable information for various applications. Performance improvement of existing methods in such a challenging scenario heavily relies on the amount of fully annotated training data, which is costly and inefficient to obtain. To scale up the amount of training data while keeping the labeling procedure cost-effective, this competition introduces a new challenge on Large-scale Street View Text with Partial Labeling (LSVT), providing 50,000 and 400,000 images with full and weak annotations, respectively. This competition aims to explore the abilities of state-of-the-art methods to detect and recognize text instances from large-scale street view images, closing the gap between research benchmarks and real applications. During the competition period, a total of 41 teams participated in the two proposed tasks, i.e., text detection and end-to-end text spotting, with 132 valid submissions. This paper includes the dataset descriptions, task definitions, evaluation protocols and results summaries of the ICDAR 2019-LSVT challenge. | ||||
Address | Sydney; Australia; September 2019 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.129; 600.121 | Approved | no | ||
Call Number | Admin @ si @ SNC2019 | Serial | 3339 | ||
Permanent link to this record | |||||
Author | Chee-Kheng Chng; Yuliang Liu; Yipeng Sun; Chun Chet Ng; Canjie Luo; Zihan Ni; ChuanMing Fang; Shuaitao Zhang; Junyu Han; Errui Ding; Jingtuo Liu; Dimosthenis Karatzas; Chee Seng Chan; Lianwen Jin | ||||
Title | ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text – RRC-ArT | Type | Conference Article | ||
Year | 2019 | Publication | 15th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 1571-1576 | ||
Keywords | |||||
Abstract | This paper reports the ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text – RRC-ArT that consists of three major challenges: i) scene text detection, ii) scene text recognition, and iii) scene text spotting. A total of 78 submissions from 46 unique teams/individuals were received for this competition. The top performing score of each challenge is as follows: i) T1 – 82.65%, ii) T2.1 – 74.3%, iii) T2.2 – 85.32%, iv) T3.1 – 53.86%, and v) T3.2 – 54.91%. Apart from the results, this paper also details the ArT dataset, tasks description, evaluation metrics and participants' methods. The dataset, the evaluation kit as well as the results are publicly available at the challenge website. | ||||
Address | Sydney; Australia; September 2019 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.121; 600.129 | Approved | no | ||
Call Number | Admin @ si @ CLS2019 | Serial | 3340 | ||
Permanent link to this record | |||||
Author | Nibal Nayef; Yash Patel; Michal Busta; Pinaki Nath Chowdhury; Dimosthenis Karatzas; Wafa Khlif; Jiri Matas; Umapada Pal; Jean-Christophe Burie; Cheng-lin Liu; Jean-Marc Ogier | ||||
Title | ICDAR2019 Robust Reading Challenge on Multi-lingual Scene Text Detection and Recognition — RRC-MLT-2019 | Type | Conference Article | ||
Year | 2019 | Publication | 15th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 1582-1587 | ||
Keywords | |||||
Abstract | With the growing cosmopolitan culture of modern cities, the need for robust Multi-Lingual scene Text (MLT) detection and recognition systems has never been greater. With the goal of systematically benchmarking and pushing the state of the art forward, the proposed competition builds on top of RRC-MLT-2017 with an additional end-to-end task, an additional language in the real images dataset, a large-scale multi-lingual synthetic dataset to assist the training, and a baseline end-to-end recognition method. The real dataset consists of 20,000 images containing text from 10 languages. The challenge has 4 tasks covering various aspects of multi-lingual scene text: (a) text detection, (b) cropped word script classification, (c) joint text detection and script classification and (d) end-to-end detection and recognition. In total, the competition received 60 submissions from the research and industrial communities. This paper presents the dataset, the tasks and the findings of the presented RRC-MLT-2019 challenge. | ||||
Address | Sydney; Australia; September 2019 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.121; 600.129 | Approved | no | ||
Call Number | Admin @ si @ NPB2019 | Serial | 3341 | ||
Permanent link to this record | |||||
Author | Veronica Romero; Emilio Granell; Alicia Fornes; Enrique Vidal; Joan Andreu Sanchez | ||||
Title | Information Extraction in Handwritten Marriage Licenses Books | Type | Conference Article | ||
Year | 2019 | Publication | 5th International Workshop on Historical Document Imaging and Processing | Abbreviated Journal | |
Volume | Issue | Pages | 66-71 | ||
Keywords | |||||
Abstract | Handwritten marriage licenses books are characterized by a simple structure of the text in the records, with an evolutionary vocabulary mainly composed of proper names that change over time. This distinct vocabulary makes automatic transcription and semantic information extraction difficult tasks. Previous works have shown that the use of category-based language models and a Grammatical Inference technique known as MGGI can improve the accuracy of these tasks. However, the application of the MGGI algorithm requires a priori knowledge to label the words of the training strings, which is not always easy to obtain. In this paper we study how to automatically obtain the information required by the MGGI algorithm using a technique based on Confusion Networks. Using the resulting language model, full handwritten text recognition and information extraction experiments have been carried out, with results supporting the proposed approach. | ||||
Address | Sydney; Australia; September 2019 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | HIP | ||
Notes | DAG; 600.140; 600.121 | Approved | no | ||
Call Number | Admin @ si @ RGF2019 | Serial | 3352 | ||
Permanent link to this record |