Records | |||||
---|---|---|---|---|---|
Author | Andreea Glavan; Alina Matei; Petia Radeva; Estefania Talavera | ||||
Title | Does our social life influence our nutritional behaviour? Understanding nutritional habits from egocentric photo-streams | Type | Journal Article | ||
Year | 2021 | Publication | Expert Systems with Applications | Abbreviated Journal | ESWA |
Volume | 171 | Issue | Pages | 114506 | |
Keywords | |||||
Abstract | Nutrition and social interactions are both key aspects of the daily lives of humans. In this work, we propose a system to evaluate the influence of social interaction in the nutritional habits of a person from a first-person perspective. In order to detect the routine of an individual, we construct a nutritional behaviour pattern discovery model, which outputs routines over a number of days. Our method evaluates similarity of routines with respect to visited food-related scenes over the collected days, making use of Dynamic Time Warping, as well as considering social engagement and its correlation with food-related activities. The nutritional and social descriptors of the collected days are evaluated and encoded using an LSTM Autoencoder. Later, the obtained latent space is clustered to find similar days unaffected by outliers using the Isolation Forest method. Moreover, we introduce a new score metric to evaluate the performance of the proposed algorithm. We validate our method on 104 days and more than 100 k egocentric images gathered by 7 users. Several different visualizations are evaluated for the understanding of the findings. Our results demonstrate good performance and applicability of our proposed model for social-related nutritional behaviour understanding. At the end, relevant applications of the model are discussed by analysing the discovered routine of particular individuals. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; no proj | Approved | no | ||
Call Number | Admin @ si @ GMR2021 | Serial | 3634 | ||
Permanent link to this record | |||||
Author | Md. Mostafa Kamal Sarker; Hatem A. Rashwan; Farhan Akram; Vivek Kumar Singh; Syeda Furruka Banu; Forhad U H Chowdhury; Kabir Ahmed Choudhury; Sylvie Chambon; Petia Radeva; Domenec Puig; Mohamed Abdel-Nasser | ||||
Title | SLSNet: Skin lesion segmentation using a lightweight generative adversarial network | Type | Journal Article | ||
Year | 2021 | Publication | Expert Systems with Applications | Abbreviated Journal | ESWA |
Volume | 183 | Issue | Pages | 115433 | |
Keywords | |||||
Abstract | The determination of precise skin lesion boundaries in dermoscopic images using automated methods faces many challenges, most importantly, the presence of hair, inconspicuous lesion edges and low contrast in dermoscopic images, and variability in the color, texture and shapes of skin lesions. Existing deep learning-based skin lesion segmentation algorithms are expensive in terms of computational time and memory. Consequently, running such segmentation algorithms requires a powerful GPU and high bandwidth memory, which are not available in dermoscopy devices. Thus, this article aims to achieve precise skin lesion segmentation with minimum resources: a lightweight, efficient generative adversarial network (GAN) model called SLSNet, which combines 1-D kernel factorized networks, position and channel attention, and multiscale aggregation mechanisms with a GAN model. The 1-D kernel factorized network reduces the computational cost of 2D filtering. The position and channel attention modules enhance the discriminative ability between the lesion and non-lesion feature representations in spatial and channel dimensions, respectively. A multiscale block is also used to aggregate the coarse-to-fine features of input skin images and reduce the effect of the artifacts. SLSNet is evaluated on two publicly available datasets: ISBI 2017 and the ISIC 2018. Although SLSNet has only 2.35 million parameters, the experimental results demonstrate that it achieves segmentation results on a par with the state-of-the-art skin lesion segmentation methods with an accuracy of 97.61%, and Dice and Jaccard similarity coefficients of 90.63% and 81.98%, respectively. SLSNet can run at more than 110 frames per second (FPS) in a single GTX1080Ti GPU, which is faster than well-known deep learning-based image segmentation models, such as FCN. Therefore, SLSNet can be used for practical dermoscopic applications. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; no proj | Approved | no | ||
Call Number | Admin @ si @ SRA2021 | Serial | 3633 | ||
Permanent link to this record | |||||
Author | Henry Velesaca; Patricia Suarez; Raul Mira; Angel Sappa | ||||
Title | Computer Vision based Food Grain Classification: a Comprehensive Survey | Type | Journal Article | ||
Year | 2021 | Publication | Computers and Electronics in Agriculture | Abbreviated Journal | CEA |
Volume | 187 | Issue | Pages | 106287 | |
Keywords | |||||
Abstract | This manuscript presents a comprehensive survey on recent computer vision based food grain classification techniques. It includes state-of-the-art approaches intended for different grain varieties. The approaches proposed in the literature are analyzed according to the processing stages considered in the classification pipeline, making it easier to identify common techniques and comparisons. Additionally, the type of images considered by each approach (i.e., images from the: visible, infrared, multispectral, hyperspectral bands) together with the strategy used to generate ground truth data (i.e., real and synthetic images) are reviewed. Finally, conclusions highlighting future needs and challenges are presented. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MSIAU; 600.130; 600.122 | Approved | no | ||
Call Number | Admin @ si @ VSM2021 | Serial | 3576 | ||
Permanent link to this record | |||||
Author | Giuseppe Pezzano; Vicent Ribas Ripoll; Petia Radeva | ||||
Title | CoLe-CNN: Context-learning convolutional neural network with adaptive loss function for lung nodule segmentation | Type | Journal Article | ||
Year | 2021 | Publication | Computer Methods and Programs in Biomedicine | Abbreviated Journal | CMPB |
Volume | 198 | Issue | Pages | 105792 | |
Keywords | |||||
Abstract | Background and objective: An accurate segmentation of lung nodules in computed tomography images is a crucial step for the physical characterization of the tumour. Being often completely manually accomplished, nodule segmentation turns out to be a tedious and time-consuming procedure and this represents a high obstacle in clinical practice. In this paper, we propose a novel Convolutional Neural Network for nodule segmentation that combines a light and efficient architecture with an innovative loss function and segmentation strategy. Methods: In contrast to most of the standard end-to-end architectures for nodule segmentation, our network learns the context of the nodules by producing two masks representing all the background and secondary-important elements in the Computed Tomography scan. The nodule is detected by subtracting the context from the original scan image. Additionally, we introduce an asymmetric loss function that automatically compensates for potential errors in the nodule annotations. We trained and tested our Neural Network on the public LIDC-IDRI database, compared it with the state of the art and ran a pseudo-Turing test between four radiologists and the network. Results: The results proved that the behaviour of the algorithm is very near to the human performance and its segmentation masks are almost indistinguishable from the ones made by the radiologists. Our method clearly outperforms the state of the art on CT nodule segmentation in terms of F1 score and IoU of and respectively. Conclusions: The main structure of the network ensures all the properties of the UNet architecture, while the Multi Convolutional Layers give a more accurate pattern recognition. The newly adopted solutions also increase the details on the border of the nodule, even under the noisiest conditions. This method can be applied now for single CT slice nodule segmentation and it represents a starting point for the future development of a fully automatic 3D segmentation software. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; no proj | Approved | no | ||
Call Number | Admin @ si @ PRR2021 | Serial | 3530 | ||
Permanent link to this record | |||||
Author | Yaxing Wang; Abel Gonzalez-Garcia; Luis Herranz; Joost Van de Weijer | ||||
Title | Controlling biases and diversity in diverse image-to-image translation | Type | Journal Article | ||
Year | 2021 | Publication | Computer Vision and Image Understanding | Abbreviated Journal | CVIU |
Volume | 202 | Issue | Pages | 103082 | |
Keywords | |||||
Abstract | JCR 2019 Q2, IF=3.121. The task of unpaired image-to-image translation is highly challenging due to the lack of explicit cross-domain pairs of instances. We consider here diverse image translation (DIT), an even more challenging setting in which an image can have multiple plausible translations. This is normally achieved by explicitly disentangling content and style in the latent representation and sampling different style codes while maintaining the image content. Despite the success of current DIT models, they are prone to suffer from bias. In this paper, we study the problem of bias in image-to-image translation. Biased datasets may add undesired changes (e.g. change gender or race in face images) to the output translations as a consequence of the particular underlying visual distribution in the target domain. In order to alleviate the effects of this problem we propose the use of semantic constraints that enforce the preservation of desired image properties. Our proposed model is a step towards unbiased diverse image-to-image translation (UDIT), and results in fewer unwanted changes in the translated images while still performing the wanted transformation. Experiments on several heavily biased datasets show the effectiveness of the proposed techniques in different domains such as faces, objects, and scenes. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP; 600.141; 600.109; 600.147 | Approved | no | ||
Call Number | Admin @ si @ WGH2021 | Serial | 3464 | ||
Permanent link to this record | |||||
Author | Marta Ligero; Alonso Garcia Ruiz; Cristina Viaplana; Guillermo Villacampa; Maria V Raciti; Jaid Landa; Ignacio Matos; Juan Martin Liberal; Maria Ochoa de Olza; Cinta Hierro; Joaquin Mateo; Macarena Gonzalez; Rafael Morales Barrera; Cristina Suarez; Jordi Rodon; Elena Elez; Irene Braña; Eva Muñoz-Couselo; Ana Oaknin; Roberta Fasani; Paolo Nuciforo; Debora Gil; Carlota Rubio Perez; Joan Seoane; Enriqueta Felip; Manuel Escobar; Josep Tabernero; Joan Carles; Rodrigo Dienstmann; Elena Garralda; Raquel Perez Lopez | ||||
Title | A CT-based radiomics signature is associated with response to immune checkpoint inhibitors in advanced solid tumors | Type | Journal Article | ||
Year | 2021 | Publication | Radiology | Abbreviated Journal | |
Volume | 299 | Issue | 1 | Pages | 109-119 |
Keywords | |||||
Abstract | Background Reliable predictive imaging markers of response to immune checkpoint inhibitors are needed. Purpose To develop and validate a pretreatment CT-based radiomics signature to predict response to immune checkpoint inhibitors in advanced solid tumors. Materials and Methods In this retrospective study, a radiomics signature was developed in patients with advanced solid tumors (including breast, cervix, gastrointestinal) treated with anti-programmed cell death-1 or programmed cell death ligand-1 monotherapy from August 2012 to May 2018 (cohort 1). This was tested in patients with bladder and lung cancer (cohorts 2 and 3). Radiomics variables were extracted from all metastases delineated at pretreatment CT and selected by using an elastic-net model. A regression model combined radiomics and clinical variables with response as the end point. Biologic validation of the radiomics score with RNA profiling of cytotoxic cells (cohort 4) was assessed with Mann-Whitney analysis. Results The radiomics signature was developed in 85 patients (cohort 1: mean age, 58 years ± 13 [standard deviation]; 43 men) and tested on 46 patients (cohort 2: mean age, 70 years ± 12; 37 men) and 47 patients (cohort 3: mean age, 64 years ± 11; 40 men). Biologic validation was performed in a further cohort of 20 patients (cohort 4: mean age, 60 years ± 13; 14 men). The radiomics signature was associated with clinical response to immune checkpoint inhibitors (area under the curve [AUC], 0.70; 95% CI: 0.64, 0.77; P < .001). In cohorts 2 and 3, the AUC was 0.67 (95% CI: 0.58, 0.76) and 0.67 (95% CI: 0.56, 0.77; P < .001), respectively. A radiomics-clinical signature (including baseline albumin level and lymphocyte count) improved on radiomics-only performance (AUC, 0.74 [95% CI: 0.63, 0.84; P < .001]; Akaike information criterion, 107.00 and 109.90, respectively). Conclusion A pretreatment CT-based radiomics signature is associated with response to immune checkpoint inhibitors, likely reflecting the tumor immunophenotype. © RSNA, 2021 Online supplemental material is available for this article. See also the editorial by Summers in this issue. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | IAM; 600.145 | Approved | no | ||
Call Number | Admin @ si @ LGV2021 | Serial | 3593 | ||
Permanent link to this record | |||||
Author | Jose Elias Yauri; Aura Hernandez-Sabate; Pau Folch; Debora Gil | ||||
Title | Mental Workload Detection Based on EEG Analysis | Type | Conference Article | ||
Year | 2021 | Publication | Artificial Intelligence Research and Development. Proceedings 23rd International Conference of the Catalan Association for Artificial Intelligence. | Abbreviated Journal | |
Volume | 339 | Issue | Pages | 268-277 | |
Keywords | Cognitive states; Mental workload; EEG analysis; Neural Networks. | ||||
Abstract | The study of mental workload is essential for human work efficiency, health conditions and accident avoidance, since workload compromises both performance and awareness. Although workload has been widely studied using several physiological measures, minimising the sensor network as much as possible remains both a challenge and a requirement. Electroencephalogram (EEG) signals have shown a high correlation to specific cognitive and mental states like workload. However, there is not enough evidence in the literature to validate how well models generalize to new subjects performing tasks of a workload similar to the ones included during the model’s training. In this paper we propose a binary neural network to classify EEG features across different mental workloads. Two workloads, low and medium, are induced using two variants of the N-Back Test. The proposed model was validated on a dataset collected from 16 subjects and showed a high level of generalization capability: the model reported an average recall of 81.81% in a leave-one-subject-out evaluation. | ||||
Address | Virtual; October 20-22 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CCIA | ||
Notes | IAM; 600.139; 600.118; 600.145 | Approved | no | ||
Call Number | Admin @ si @ | Serial | 3723 | ||
Permanent link to this record | |||||
Author | Joan Codina-Filba; Sergio Escalera; Joan Escudero; Coen Antens; Pau Buch-Cardona; Mireia Farrus | ||||
Title | Mobile eHealth Platform for Home Monitoring of Bipolar Disorder | Type | Conference Article | ||
Year | 2021 | Publication | 27th ACM International Conference on Multimedia Modeling | Abbreviated Journal | |
Volume | 12573 | Issue | Pages | 330-341 | |
Keywords | |||||
Abstract | People suffering from Bipolar Disorder (BD) experience changes in mood, with depressive or manic episodes and normal periods in between. BD is a chronic disease with a high level of non-adherence to medication that requires continuous monitoring of patients to detect when they relapse into an episode, so that physicians can take care of them. Here we present MoodRecord, an easy-to-use, non-intrusive, multilingual, robust and scalable platform suitable for home monitoring of patients with BD, which allows physicians and relatives to track the patient's state and receive alarms when abnormalities occur. MoodRecord takes advantage of the capabilities of smartphones as communication and recording devices to monitor patients continuously. It automatically records user activity, and asks the user to answer some questions or to record themselves on video, according to a predefined plan designed by physicians. The video is analysed: the mood status is recognised from images, and bipolar assessment scores are extracted from speech parameters. The data obtained from the different sources are merged periodically to detect whether a relapse may be starting and, if so, raise the corresponding alarm. The application received a positive evaluation in a pilot with users from three different countries. During the pilot, the predictions of the voice and image modules showed a coherent correlation with the diagnosis performed by clinicians. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | MMM | ||
Notes | HUPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ CEE2021 | Serial | 3659 | ||
Permanent link to this record | |||||
Author | Bartlomiej Twardowski; Pawel Zawistowski; Szymon Zaborowski | ||||
Title | Metric Learning for Session-Based Recommendations | Type | Conference Article | ||
Year | 2021 | Publication | 43rd edition of the annual BCS-IRSG European Conference on Information Retrieval | Abbreviated Journal | |
Volume | 12656 | Issue | Pages | 650-665 | |
Keywords | Session-based recommendations; Deep metric learning; Learning to rank | ||||
Abstract | Session-based recommenders, used for making predictions out of users’ uninterrupted sequences of actions, are attractive for many applications. Here, for this task we propose using metric learning, where a common embedding space for sessions and items is created, and distance measures dissimilarity between the provided sequence of users’ events and the next action. We discuss and compare metric learning approaches to commonly used learning-to-rank methods, where some synergies exist. We propose a simple architecture for problem analysis and demonstrate that neither extensively big nor deep architectures are necessary in order to outperform existing methods. The experimental results against strong baselines on four datasets are provided with an ablation study. | ||||
Address | Virtual; March 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ECIR | ||
Notes | LAMP; 600.120 | Approved | no | ||
Call Number | Admin @ si @ TZZ2021 | Serial | 3586 | ||
Permanent link to this record | |||||
Author | Josep Llados; Daniel Lopresti; Seiichi Uchida (eds) | ||||
Title | 16th International Conference, 2021, Proceedings, Part I | Type | Book Whole | ||
Year | 2021 | Publication | Document Analysis and Recognition – ICDAR 2021 | Abbreviated Journal | |
Volume | 12821 | Issue | Pages | ||
Keywords | |||||
Abstract | This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports. The papers are organized into the following topical sections: historical document analysis, document analysis systems, handwriting recognition, scene text detection and recognition, document image processing, natural language processing (NLP) for document understanding, and graphics, diagram and math recognition. | ||||
Address | Lausanne, Switzerland, September 5-10, 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Springer Cham | Place of Publication | Editor | Josep Llados; Daniel Lopresti; Seiichi Uchida | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-3-030-86548-1 | Medium | ||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ | Serial | 3725 | ||
Permanent link to this record | |||||
Author | Adria Molina; Pau Riba; Lluis Gomez; Oriol Ramos Terrades; Josep Llados | ||||
Title | Date Estimation in the Wild of Scanned Historical Photos: An Image Retrieval Approach | Type | Conference Article | ||
Year | 2021 | Publication | 16th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | 12822 | Issue | Pages | 306-320 | |
Keywords | |||||
Abstract | This paper presents a novel method for date estimation of historical photographs from archival sources. The main contribution is to formulate the date estimation as a retrieval task, where given a query, the retrieved images are ranked in terms of the estimated date similarity. The closer their embedded representations, the closer their dates. Contrary to the traditional models that design a neural network that learns a classifier or a regressor, we propose a learning objective based on the nDCG ranking metric. We have experimentally evaluated the performance of the method in two different tasks: date estimation and date-sensitive image retrieval, using the DEW public database, outperforming the baseline methods. | ||||
Address | Lausanne; Switzerland; September 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.121; 600.140; 110.312 | Approved | no | ||
Call Number | Admin @ si @ MRG2021b | Serial | 3571 | ||
Permanent link to this record | |||||
Author | Pau Riba; Adria Molina; Lluis Gomez; Oriol Ramos Terrades; Josep Llados | ||||
Title | Learning to Rank Words: Optimizing Ranking Metrics for Word Spotting | Type | Conference Article | ||
Year | 2021 | Publication | 16th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | 12822 | Issue | Pages | 381-395 | |
Keywords | |||||
Abstract | In this paper, we explore and evaluate the use of ranking-based objective functions for simultaneously learning a word string encoder and a word image encoder. We consider retrieval frameworks in which the user expects a retrieval list ranked according to a defined relevance score. In the context of a word spotting problem, the relevance score has been set according to the string edit distance from the query string. We experimentally demonstrate the competitive performance of the proposed model on query-by-string word spotting for both handwritten and real scene word images. We also provide the results for query-by-example word spotting, although it is not the main focus of this work. | ||||
Address | Lausanne; Switzerland; September 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.121; 600.140; 110.312 | Approved | no | ||
Call Number | Admin @ si @ RMG2021 | Serial | 3572 | ||
Permanent link to this record | |||||
Author | Josep Llados; Daniel Lopresti; Seiichi Uchida (eds) | ||||
Title | 16th International Conference, 2021, Proceedings, Part II | Type | Book Whole | ||
Year | 2021 | Publication | Document Analysis and Recognition – ICDAR 2021 | Abbreviated Journal | |
Volume | 12822 | Issue | Pages | ||
Keywords | |||||
Abstract | This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports. The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding. | ||||
Address | Lausanne, Switzerland, September 5-10, 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Springer Cham | Place of Publication | Editor | Josep Llados; Daniel Lopresti; Seiichi Uchida | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-3-030-86330-2 | Medium | ||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ | Serial | 3726 | ||
Permanent link to this record | |||||
Author | Ruben Tito; Dimosthenis Karatzas; Ernest Valveny | ||||
Title | Document Collection Visual Question Answering | Type | Conference Article | ||
Year | 2021 | Publication | 16th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | 12822 | Issue | Pages | 778-792 | |
Keywords | Document collection; Visual Question Answering | ||||
Abstract | Current tasks and methods in Document Understanding aim to process documents as single elements. However, documents are usually organized in collections (historical records, purchase invoices) that provide context useful for their interpretation. To address this problem, we introduce Document Collection Visual Question Answering (DocCVQA), a new dataset and related task, where questions are posed over a whole collection of document images and the goal is not only to provide the answer to the given question, but also to retrieve the set of documents that contain the information needed to infer the answer. Along with the dataset we propose a new evaluation metric and baselines which provide further insights into the new dataset and task. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.121 | Approved | no | ||
Call Number | Admin @ si @ TKV2021 | Serial | 3622 | ||
Permanent link to this record | |||||
Author | Josep Llados; Daniel Lopresti; Seiichi Uchida (eds) | ||||
Title | 16th International Conference, 2021, Proceedings, Part III | Type | Book Whole | ||
Year | 2021 | Publication | Document Analysis and Recognition – ICDAR 2021 | Abbreviated Journal | |
Volume | 12823 | Issue | Pages | ||
Keywords | |||||
Abstract | This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports. The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding. | ||||
Address | Lausanne, Switzerland, September 5-10, 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Springer Cham | Place of Publication | Editor | Josep Llados; Daniel Lopresti; Seiichi Uchida | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-3-030-86333-3 | Medium | ||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ | Serial | 3727 | ||
Permanent link to this record |