toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author Jose Antonio Rodriguez; Florent Perronnin; Gemma Sanchez; Josep Llados edit  doi
openurl 
  Title Unsupervised writer style adaptation for handwritten word spotting Type Conference Article
  Year 2008 Publication Pattern Recognition. 19th International Conference on, IBM Best Student Paper Award. Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address (down) Tampa, USA  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes DAG Approved no  
  Call Number DAG @ dag @ RPS2008 Serial 1077  
Permanent link to this record
 

 
Author H. Chouaib; Oriol Ramos Terrades; Salvatore Tabbone; F. Cloppet; N. Vincent edit  doi
openurl 
  Title Feature Selection Combining Genetic Algorithm and Adaboost Classifiers Type Conference Article
  Year 2008 Publication 19th International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 1-4  
  Keywords  
  Abstract  
  Address (down) Tampa, Florida  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes DAG Approved no  
  Call Number Admin @ si @ CRT2008 Serial 1872  
Permanent link to this record
 

 
Author Salvatore Tabbone; Oriol Ramos Terrades; S. Barrat edit  doi
openurl 
  Title Histogram of radon transform. A useful descriptor for shape retrieval Type Conference Article
  Year 2008 Publication 19th International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 1-4  
  Keywords  
  Abstract  
  Address (down) Tampa, Florida  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes DAG Approved no  
  Call Number Admin @ si @ TRB2008 Serial 1876  
Permanent link to this record
 

 
Author Partha Pratim Roy; Umapada Pal; Josep Llados; F. Kimura edit  doi
openurl 
  Title Convex Hull based Approach for Multi-oriented Character Recognition form Graphical Documents Type Conference Article
  Year 2008 Publication 19th International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address (down) Tampa (Florida)  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes DAG Approved no  
  Call Number DAG @ dag @ RPL2008d Serial 1073  
Permanent link to this record
 

 
Author Raul Gomez; Ali Furkan Biten; Lluis Gomez; Jaume Gibert; Marçal Rusiñol; Dimosthenis Karatzas edit   pdf
url  doi
openurl 
  Title Selective Style Transfer for Text Type Conference Article
  Year 2019 Publication 15th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages 805-812  
  Keywords transfer; text style transfer; data augmentation; scene text detection  
  Abstract This paper explores the possibilities of image style transfer applied to text maintaining the original transcriptions. Results on different text domains (scene text, machine printed text and handwritten text) and cross-modal results demonstrate that this is feasible, and open different research lines. Furthermore, two architectures for selective style transfer, which means
transferring style to only desired image pixels, are proposed. Finally, scene text selective style transfer is evaluated as a data augmentation technique to expand scene text detection datasets, resulting in a boost of text detectors performance. Our implementation of the described models is publicly available.
 
  Address (down) Sydney; Australia; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.129; 600.135; 601.338; 601.310; 600.121 Approved no  
  Call Number GBG2019 Serial 3265  
Permanent link to this record
 

 
Author Ali Furkan Biten; Ruben Tito; Andres Mafla; Lluis Gomez; Marçal Rusiñol; M. Mathew; C.V. Jawahar; Ernest Valveny; Dimosthenis Karatzas edit   pdf
url  openurl
  Title ICDAR 2019 Competition on Scene Text Visual Question Answering Type Conference Article
  Year 2019 Publication 3rd Workshop on Closing the Loop Between Vision and Language, in conjunction with ICCV2019 Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract This paper presents final results of ICDAR 2019 Scene Text Visual Question Answering competition (ST-VQA). ST-VQA introduces an important aspect that is not addressed
by any Visual Question Answering system up to date, namely the incorporation of scene text to answer questions asked about an image. The competition introduces a new dataset comprising 23, 038 images annotated with 31, 791 question / answer pairs where the answer is always grounded on text instances present in the image. The images are taken from 7 different public computer vision datasets, covering a wide range of scenarios.
The competition was structured in three tasks of increasing difficulty, that require reading the text in a scene and understanding it in the context of the scene, to correctly answer a given question. A novel evaluation metric is presented, which elegantly assesses both key capabilities expected from an optimal model: text recognition and image understanding. A detailed analysis of results from different participants is showcased, which provides insight into the current capabilities of VQA systems that can read. We firmly believe the dataset proposed in this challenge will be an important milestone to consider towards a path of more robust and general models that
can exploit scene text to achieve holistic image understanding.
 
  Address (down) Sydney; Australia; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CLVL  
  Notes DAG; 600.129; 601.338; 600.135; 600.121 Approved no  
  Call Number Admin @ si @ BTM2019a Serial 3284  
Permanent link to this record
 

 
Author Ali Furkan Biten; Ruben Tito; Andres Mafla; Lluis Gomez; Marçal Rusiñol; M. Mathew; C.V. Jawahar; Ernest Valveny; Dimosthenis Karatzas edit   pdf
url  doi
openurl 
  Title ICDAR 2019 Competition on Scene Text Visual Question Answering Type Conference Article
  Year 2019 Publication 15th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages 1563-1570  
  Keywords  
  Abstract This paper presents final results of ICDAR 2019 Scene Text Visual Question Answering competition (ST-VQA). ST-VQA introduces an important aspect that is not addressed by any Visual Question Answering system up to date, namely the incorporation of scene text to answer questions asked about an image. The competition introduces a new dataset comprising 23,038 images annotated with 31,791 question / answer pairs where the answer is always grounded on text instances present in the image. The images are taken from 7 different public computer vision datasets, covering a wide range of scenarios. The competition was structured in three tasks of increasing difficulty, that require reading the text in a scene and understanding it in the context of the scene, to correctly answer a given question. A novel evaluation metric is presented, which elegantly assesses both key capabilities expected from an optimal model: text recognition and image understanding. A detailed analysis of results from different participants is showcased, which provides insight into the current capabilities of VQA systems that can read. We firmly believe the dataset proposed in this challenge will be an important milestone to consider towards a path of more robust and general models that can exploit scene text to achieve holistic image understanding.  
  Address (down) Sydney; Australia; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.129; 601.338; 600.121 Approved no  
  Call Number Admin @ si @ BTM2019c Serial 3286  
Permanent link to this record
 

 
Author Rui Zhang; Yongsheng Zhou; Qianyi Jiang; Qi Song; Nan Li; Kai Zhou; Lei Wang; Dong Wang; Minghui Liao; Mingkun Yang; Xiang Bai; Baoguang Shi; Dimosthenis Karatzas; Shijian Lu; CV Jawahar edit   pdf
url  doi
openurl 
  Title ICDAR 2019 Robust Reading Challenge on Reading Chinese Text on Signboard Type Conference Article
  Year 2019 Publication 15th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages 1577-1581  
  Keywords  
  Abstract Chinese scene text reading is one of the most challenging problems in computer vision and has attracted great interest. Different from English text, Chinese has more than 6000 commonly used characters and Chinesecharacters can be arranged in various layouts with numerous fonts. The Chinese signboards in street view are a good choice for Chinese scene text images since they have different backgrounds, fonts and layouts. We organized a competition called ICDAR2019-ReCTS, which mainly focuses on reading Chinese text on signboard. This report presents the final results of the competition. A large-scale dataset of 25,000 annotated signboard images, in which all the text lines and characters are annotated with locations and transcriptions, were released. Four tasks, namely character recognition, text line recognition, text line detection and end-to-end recognition were set up. Besides, considering the Chinese text ambiguity issue, we proposed a multi ground truth (multi-GT) evaluation method to make evaluation fairer. The competition started on March 1, 2019 and ended on April 30, 2019. 262 submissions from 46 teams are received. Most of the participants come from universities, research institutes, and tech companies in China. There are also some participants from the United States, Australia, Singapore, and Korea. 21 teams submit results for Task 1, 23 teams submit results for Task 2, 24 teams submit results for Task 3, and 13 teams submit results for Task 4.  
  Address (down) Sydney; Australia; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.129; 600.121 Approved no  
  Call Number Admin @ si @ LZZ2019 Serial 3335  
Permanent link to this record
 

 
Author Helena Muñoz; Fernando Vilariño; Dimosthenis Karatzas edit  url
doi  openurl
  Title Eye-Movements During Information Extraction from Administrative Documents Type Conference Article
  Year 2019 Publication International Conference on Document Analysis and Recognition Workshops Abbreviated Journal  
  Volume Issue Pages 6-9  
  Keywords  
  Abstract A key aspect of digital mailroom processes is the extraction of relevant information from administrative documents. More often than not, the extraction process cannot be fully automated, and there is instead an important amount of manual intervention. In this work we study the human process of information extraction from invoice document images. We explore whether the gaze of human annotators during an manual information extraction process could be exploited towards reducing the manual effort and automating the process. To this end, we perform an eye-tracking experiment replicating real-life interfaces for information extraction. Through this pilot study we demonstrate that relevant areas in the document can be identified reliably through automatic fixation classification, and the obtained models generalize well to new subjects. Our findings indicate that it is in principle possible to integrate the human in the document image analysis loop, making use of the scanpath to automate the extraction process or verify extracted information.  
  Address (down) Sydney; Australia; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDARW  
  Notes DAG; 600.140; 600.121; 600.129;SIAI Approved no  
  Call Number Admin @ si @ MVK2019 Serial 3336  
Permanent link to this record
 

 
Author Mohammed Al Rawi; Ernest Valveny; Dimosthenis Karatzas edit   pdf
url  doi
openurl 
  Title Can One Deep Learning Model Learn Script-Independent Multilingual Word-Spotting? Type Conference Article
  Year 2019 Publication 15th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages 260-267  
  Keywords  
  Abstract Word spotting has gained increased attention lately as it can be used to extract textual information from handwritten documents and scene-text images. Current word spotting approaches are designed to work on a single language and/or script. Building intelligent models that learn script-independent multilingual word-spotting is challenging due to the large variability of multilingual alphabets and symbols. We used ResNet-152 and the Pyramidal Histogram of Characters (PHOC) embedding to build a one-model script-independent multilingual word-spotting and we tested it on Latin, Arabic, and Bangla (Indian) languages. The one-model we propose performs on par with the multi-model language-specific word-spotting system, and thus, reduces the number of models needed for each script and/or language.  
  Address (down) Sydney; Australia; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.129; 600.121 Approved no  
  Call Number Admin @ si @ RVK2019 Serial 3337  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: