toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author George Tom; Minesh Mathew; Sergi Garcia Bordils; Dimosthenis Karatzas; CV Jawahar edit  url
openurl 
  Title ICDAR 2023 Competition on RoadText Video Text Detection, Tracking and Recognition Type Conference Article
  Year (down) 2023 Publication 17th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume 14188 Issue Pages 577–586  
  Keywords  
  Abstract In this report, we present the final results of the ICDAR 2023 Competition on RoadText Video Text Detection, Tracking and Recognition. The RoadText challenge is based on the RoadText-1K dataset and aims to assess and enhance current methods for scene text detection, recognition, and tracking in videos. The RoadText-1K dataset contains 1000 dash cam videos with annotations for text bounding boxes and transcriptions in every frame. The competition features an end-to-end task, requiring systems to accurately detect, track, and recognize text in dash cam videos. The paper presents a comprehensive review of the submitted methods along with a detailed analysis of the results obtained by the methods. The analysis provides valuable insights into the current capabilities and limitations of video text detection, tracking, and recognition systems for dashcam videos.  
  Address San Jose; CA; USA; August 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG Approved no  
  Call Number Admin @ si @ TMG2023 Serial 3905  
Permanent link to this record
 

 
Author George Tom; Minesh Mathew; Sergi Garcia Bordils; Dimosthenis Karatzas; CV Jawahar edit  url
openurl 
  Title Reading Between the Lanes: Text VideoQA on the Road Type Conference Article
  Year (down) 2023 Publication 17th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume 14192 Issue Pages 137–154  
  Keywords VideoQA; scene text; driving videos  
  Abstract Text and signs around roads provide crucial information for drivers, vital for safe navigation and situational awareness. Scene text recognition in motion is a challenging problem, while textual cues typically appear for a short time span, and early detection at a distance is necessary. Systems that exploit such information to assist the driver should not only extract and incorporate visual and textual cues from the video stream but also reason over time. To address this issue, we introduce RoadTextVQA, a new dataset for the task of video question answering (VideoQA) in the context of driver assistance. RoadTextVQA consists of 3, 222 driving videos collected from multiple countries, annotated with 10, 500 questions, all based on text or road signs present in the driving videos. We assess the performance of state-of-the-art video question answering models on our RoadTextVQA dataset, highlighting the significant potential for improvement in this domain and the usefulness of the dataset in advancing research on in-vehicle support systems and text-aware multimodal question answering. The dataset is available at http://cvit.iiit.ac.in/research/projects/cvit-projects/roadtextvqa.  
  Address San Jose; CA; USA; August 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG Approved no  
  Call Number Admin @ si @ TMG2023 Serial 3906  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: