Lluis Gomez, Ali Furkan Biten, Ruben Tito, Andres Mafla, Marçal Rusiñol, Ernest Valveny, et al. (2021). Multimodal grid features and cell pointers for scene text visual question answering. PRL - Pattern Recognition Letters, 150, 242–249.
Arka Ujal Dey, Suman Ghosh, Ernest Valveny, & Gaurav Harit. (2021). Beyond Visual Semantics: Exploring the Role of Scene Text in Image Understanding. PRL - Pattern Recognition Letters, 149, 164–171.
Ruben Tito, Dimosthenis Karatzas, & Ernest Valveny. (2023). Hierarchical multimodal transformers for Multi-Page DocVQA. PR - Pattern Recognition, 144, 109834.
Souhail Bakkali, Zuheng Ming, Mickael Coustaty, Marçal Rusiñol, & Oriol Ramos Terrades. (2023). VLCDoC: Vision-Language Contrastive Pre-Training Model for Cross-Modal Document Classification. PR - Pattern Recognition, 139, 109419.
Manuel Carbonell, Alicia Fornes, Mauricio Villegas, & Josep Llados. (2020). A Neural Model for Text Localization, Transcription and Named Entity Recognition in Full Pages. PRL - Pattern Recognition Letters, 136, 219–227.