Soumya Jahagirdar, Minesh Mathew, Dimosthenis Karatzas, & CV Jawahar. (2023). Understanding Video Scenes Through Text: Insights from Text-Based Video Question Answering. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops.
Jordy Van Landeghem, Ruben Tito, Lukasz Borchmann, Michal Pietruszka, Pawel Joziak, Rafal Powalski, et al. (2023). Document Understanding Dataset and Evaluation (DUDE). In 20th IEEE International Conference on Computer Vision (pp. 19528–19540).
Ruben Perez Tito. (2023). Exploring the role of Text in Visual Question Answering on Natural Scenes and Documents (Ernest Valveny, Ed.). Ph.D. thesis, IMPRIMA.
Subhajit Maity, Sanket Biswas, Siladittya Manna, Ayan Banerjee, Josep Llados, Saumik Bhattacharya, et al. (2023). SelfDocSeg: A Self-Supervised vision-based Approach towards Document Segmentation. In 17th International Conference on Document Analysis and Recognition (Vol. 14187, pp. 342–360).
Souhail Bakkali, Sanket Biswas, Zuheng Ming, Mickael Coustaty, Marçal Rusiñol, Oriol Ramos Terrades, et al. (2023). TransferDoc: A Self-Supervised Transferable Document Representation Learning Model Unifying Vision and Language.
Ruben Tito, Khanh Nguyen, Marlon Tobaben, Raouf Kerkouche, Mohamed Ali Souibgui, Kangsoo Jung, et al. (2023). Privacy-Aware Document Visual Question Answering.
Mohamed Ali Souibgui, Asma Bensalah, Jialuo Chen, Alicia Fornes, & Michelle Waldispühl. (2023). A User Perspective on HTR methods for the Automatic Transcription of Rare Scripts: The Case of Codex Runicus. JOCCH - ACM Journal on Computing and Cultural Heritage, 15(4), 1–18.
Souhail Bakkali, Zuheng Ming, Mickael Coustaty, Marçal Rusiñol, & Oriol Ramos Terrades. (2023). VLCDoC: Vision-Language Contrastive Pre-Training Model for Cross-Modal Document Classification. PR - Pattern Recognition, 139, 109419.
Ruben Tito, Dimosthenis Karatzas, & Ernest Valveny. (2023). Hierarchical multimodal transformers for Multi-Page DocVQA. PR - Pattern Recognition, 144, 109834.
Reuben Dorent, Aaron Kujawa, Marina Ivory, Spyridon Bakas, Nikola Rieke, Samuel Joutard, et al. (2023). CrossMoDA 2021 challenge: Benchmark of Cross-Modality Domain Adaptation techniques for Vestibular Schwannoma and Cochlea Segmentation. MIA - Medical Image Analysis, 83, 102628.
David Pujol Perich, Albert Clapes, & Sergio Escalera. (2023). SADA: Semantic adversarial unsupervised domain adaptation for Temporal Action Localization.
Lei Li, Fuping Wu, Sihan Wang, Xinzhe Luo, Carlos Martin-Isla, Shuwei Zhai, et al. (2023). MyoPS: A benchmark of myocardial pathology segmentation combining three-sequence cardiac magnetic resonance images. MIA - Medical Image Analysis, 87, 102808.
Razieh Rastgoo, Kourosh Kiani, & Sergio Escalera. (2023). ZS-GR: zero-shot gesture recognition from RGB-D videos. MTAP - Multimedia Tools and Applications, 82, 43781–43796.
Carlos Martin-Isla, Victor M Campello, Cristian Izquierdo, Kaisar Kushibar, Carla Sendra Balcells, Polyxeni Gkontra, et al. (2023). Deep Learning Segmentation of the Right Ventricle in Cardiac MRI: The M&Ms Challenge. JBHI - IEEE Journal of Biomedical and Health Informatics, 27(7), 3302–3313.
Razieh Rastgoo, Kourosh Kiani, & Sergio Escalera. (2023). A deep co-attentive hand-based video question answering framework using multi-view skeleton. MTAP - Multimedia Tools and Applications, 82, 1401–1429.