Record |
Author |
Sergi Garcia Bordils; Dimosthenis Karatzas; Marçal Rusiñol |
Title |
Accelerating Transformer-Based Scene Text Detection and Recognition via Token Pruning |
Type |
Conference Article |
Year |
2023 |
Publication |
17th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
Volume |
14192 |
Issue |
|
Pages |
106-121 |
Keywords |
Scene Text Detection; Scene Text Recognition; Transformer Acceleration |
Abstract |
Scene text detection and recognition is a crucial task in computer vision with numerous real-world applications. Transformer-based approaches are behind all current state-of-the-art models and have achieved excellent performance. However, the computational requirements of the transformer architecture makes training these methods slow and resource heavy. In this paper, we introduce a new token pruning strategy that significantly decreases training and inference times without sacrificing performance, striking a balance between accuracy and speed. We have applied this pruning technique to our own end-to-end transformer-based scene text understanding architecture. Our method uses a separate detection branch to guide the pruning of uninformative image features, which significantly reduces the number of tokens at the input of the transformer. Experimental results show how our network is able to obtain competitive results on multiple public benchmarks while running at significantly higher speeds. |
Address |
San Jose; CA; USA; August 2023 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICDAR |
Notes |
DAG |
Approved |
no |
Call Number |
Admin @ si @ GKR2023a |
Serial |
3907 |
Permanent link to this record |