Home | << 1 2 3 4 5 6 7 8 9 10 >> |
Records | |||||
---|---|---|---|---|---|
Author | Minesh Mathew; Viraj Bagal; Ruben Tito; Dimosthenis Karatzas; Ernest Valveny; C.V. Jawahar | ||||
Title | InfographicVQA | Type | Conference Article | ||
Year | 2022 | Publication | Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 1697-1706 | ||
Keywords | Document Analysis Datasets; Evaluation and Comparison of Vision Algorithms; Vision and Languages | ||||
Abstract | Infographics communicate information using a combination of textual, graphical and visual elements. This work explores the automatic understanding of infographic images by using a Visual Question Answering technique. To this end, we present InfographicVQA, a new dataset comprising a diverse collection of infographics and question-answer annotations. The questions require methods that jointly reason over the document layout, textual content, graphical elements, and data visualizations. We curate the dataset with an emphasis on questions that require elementary reasoning and basic arithmetic skills. For VQA on the dataset, we evaluate two Transformer-based strong baselines. Both the baselines yield unsatisfactory results compared to near perfect human performance on the dataset. The results suggest that VQA on infographics--images that are designed to communicate information quickly and clearly to human brain--is ideal for benchmarking machine understanding of complex document images. The dataset is available for download at docvqa. org | ||||
Address | Virtual; Waikoloa; Hawai; USA; January 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | DAG; 600.155 | Approved | no | ||
Call Number | MBT2022 | Serial | 3625 | ||
Permanent link to this record | |||||
Author | Guillermo Torres; Sonia Baeza; Carles Sanchez; Ignasi Guasch; Antoni Rosell; Debora Gil | ||||
Title | An Intelligent Radiomic Approach for Lung Cancer Screening | Type | Journal Article | ||
Year | 2022 | Publication | Applied Sciences | Abbreviated Journal | APPLSCI |
Volume | 12 | Issue | 3 | Pages | 1568 |
Keywords | Lung cancer; Early diagnosis; Screening; Neural networks; Image embedding; Architecture optimization | ||||
Abstract | The efficiency of lung cancer screening for reducing mortality is hindered by the high rate of false positives. Artificial intelligence applied to radiomics could help to early discard benign cases from the analysis of CT scans. The available amount of data and the fact that benign cases are a minority, constitutes a main challenge for the successful use of state of the art methods (like deep learning), which can be biased, over-fitted and lack of clinical reproducibility. We present an hybrid approach combining the potential of radiomic features to characterize nodules in CT scans and the generalization of the feed forward networks. In order to obtain maximal reproducibility with minimal training data, we propose an embedding of nodules based on the statistical significance of radiomic features for malignancy detection. This representation space of lesions is the input to a feed
forward network, which architecture and hyperparameters are optimized using own-defined metrics of the diagnostic power of the whole system. Results of the best model on an independent set of patients achieve 100% of sensitivity and 83% of specificity (AUC = 0.94) for malignancy detection. |
||||
Address | Jan 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | IAM; 600.139; 600.145 | Approved | no | ||
Call Number | Admin @ si @ TBS2022 | Serial | 3699 | ||
Permanent link to this record | |||||
Author | Saad Minhas; Aura Hernandez-Sabate; Shoaib Ehsan; Klaus McDonald Maier | ||||
Title | Effects of Non-Driving Related Tasks during Self-Driving mode | Type | Journal Article | ||
Year | 2022 | Publication | IEEE Transactions on Intelligent Transportation Systems | Abbreviated Journal | TITS |
Volume | 23 | Issue | 2 | Pages | 1391-1399 |
Keywords | |||||
Abstract | Perception reaction time and mental workload have proven to be crucial in manual driving. Moreover, in highly automated cars, where most of the research is focusing on Level 4 Autonomous driving, take-over performance is also a key factor when taking road safety into account. This study aims to investigate how the immersion in non-driving related tasks affects the take-over performance of drivers in given scenarios. The paper also highlights the use of virtual simulators to gather efficient data that can be crucial in easing the transition between manual and autonomous driving scenarios. The use of Computer Aided Simulations is of absolute importance in this day and age since the automotive industry is rapidly moving towards Autonomous technology. An experiment comprising of 40 subjects was performed to examine the reaction times of driver and the influence of other variables in the success of take-over performance in highly automated driving under different circumstances within a highway virtual environment. The results reflect the relationship between reaction times under different scenarios that the drivers might face under the circumstances stated above as well as the importance of variables such as velocity in the success on regaining car control after automated driving. The implications of the results acquired are important for understanding the criteria needed for designing Human Machine Interfaces specifically aimed towards automated driving conditions. Understanding the need to keep drivers in the loop during automation, whilst allowing drivers to safely engage in other non-driving related tasks is an important research area which can be aided by the proposed study. | ||||
Address | Feb. 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | IAM; 600.139; 600.145 | Approved | no | ||
Call Number | Admin @ si @ MHE2022 | Serial | 3468 | ||
Permanent link to this record | |||||
Author | Ali Furkan Biten; Andres Mafla; Lluis Gomez; Dimosthenis Karatzas | ||||
Title | Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching | Type | Conference Article | ||
Year | 2022 | Publication | Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 1391-1400 | ||
Keywords | Measurement; Training; Integrated circuits; Annotations; Semantics; Training data; Semisupervised learning | ||||
Abstract | The task of image-text matching aims to map representations from different modalities into a common joint visual-textual embedding. However, the most widely used datasets for this task, MSCOCO and Flickr30K, are actually image captioning datasets that offer a very limited set of relationships between images and sentences in their ground-truth annotations. This limited ground truth information forces us to use evaluation metrics based on binary relevance: given a sentence query we consider only one image as relevant. However, many other relevant images or captions may be present in the dataset. In this work, we propose two metrics that evaluate the degree of semantic relevance of retrieved items, independently of their annotated binary relevance. Additionally, we incorporate a novel strategy that uses an image captioning metric, CIDEr, to define a Semantic Adaptive Margin (SAM) to be optimized in a standard triplet loss. By incorporating our formulation to existing models, a large improvement is obtained in scenarios where available training data is limited. We also demonstrate that the performance on the annotated image-caption pairs is maintained while improving on other non-annotated relevant items when employing the full training set. The code for our new metric can be found at github. com/furkanbiten/ncsmetric and the model implementation at github. com/andrespmd/semanticadaptive_margin. | ||||
Address | Virtual; Waikoloa; Hawai; USA; January 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | DAG; 600.155; 302.105; | Approved | no | ||
Call Number | Admin @ si @ BMG2022 | Serial | 3663 | ||
Permanent link to this record | |||||
Author | Ali Furkan Biten; Lluis Gomez; Dimosthenis Karatzas | ||||
Title | Let there be a clock on the beach: Reducing Object Hallucination in Image Captioning | Type | Conference Article | ||
Year | 2022 | Publication | Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 1381-1390 | ||
Keywords | Measurement; Training; Visualization; Analytical models; Computer vision; Computational modeling; Training data | ||||
Abstract | Explaining an image with missing or non-existent objects is known as object bias (hallucination) in image captioning. This behaviour is quite common in the state-of-the-art captioning models which is not desirable by humans. To decrease the object hallucination in captioning, we propose three simple yet efficient training augmentation method for sentences which requires no new training data or increase
in the model size. By extensive analysis, we show that the proposed methods can significantly diminish our models’ object bias on hallucination metrics. Moreover, we experimentally demonstrate that our methods decrease the dependency on the visual features. All of our code, configuration files and model weights are available online. |
||||
Address | Virtual; Waikoloa; Hawai; USA; January 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | DAG; 600.155; 302.105 | Approved | no | ||
Call Number | Admin @ si @ BGK2022 | Serial | 3662 | ||
Permanent link to this record | |||||
Author | Ana Garcia Rodriguez; Yael Tudela; Henry Cordova; S. Carballal; I. Ordas; L. Moreira; E. Vaquero; O. Ortiz; L. Rivero; F. Javier Sanchez; Miriam Cuatrecasas; Maria Pellise; Jorge Bernal; Gloria Fernandez Esparrach | ||||
Title | In vivo computer-aided diagnosis of colorectal polyps using white light endoscopy | Type | Journal Article | ||
Year | 2022 | Publication | Endoscopy International Open | Abbreviated Journal | ENDIO |
Volume | 10 | Issue | 9 | Pages | E1201-E1207 |
Keywords | |||||
Abstract | Background and study aims Artificial intelligence is currently able to accurately predict the histology of colorectal polyps. However, systems developed to date use complex optical technologies and have not been tested in vivo. The objective of this study was to evaluate the efficacy of a new deep learning-based optical diagnosis system, ATENEA, in a real clinical setting using only high-definition white light endoscopy (WLE) and to compare its performance with endoscopists. Methods ATENEA was prospectively tested in real life on consecutive polyps detected in colorectal cancer screening colonoscopies at Hospital Clínic. No images were discarded, and only WLE was used. The in vivo ATENEA's prediction (adenoma vs non-adenoma) was compared with the prediction of four staff endoscopists without specific training in optical diagnosis for the study purposes. Endoscopists were blind to the ATENEA output. Histology was the gold standard. Results Ninety polyps (median size: 5 mm, range: 2-25) from 31 patients were included of which 69 (76.7 %) were adenomas. ATENEA correctly predicted the histology in 63 of 69 (91.3 %, 95 % CI: 82 %-97 %) adenomas and 12 of 21 (57.1 %, 95 % CI: 34 %-78 %) non-adenomas while endoscopists made correct predictions in 52 of 69 (75.4 %, 95 % CI: 60 %-85 %) and 20 of 21 (95.2 %, 95 % CI: 76 %-100 %), respectively. The global accuracy was 83.3 % (95 % CI: 74%-90 %) and 80 % (95 % CI: 70 %-88 %) for ATENEA and endoscopists, respectively. Conclusion ATENEA can accurately be used for in vivo characterization of colorectal polyps, enabling the endoscopist to make direct decisions. ATENEA showed a global accuracy similar to that of endoscopists despite an unsatisfactory performance for non-adenomatous lesions. | ||||
Address | 2022 Sep 14 | ||||
Corporate Author | Thesis | ||||
Publisher | PMID | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE; 600.157 | Approved | no | ||
Call Number | Admin @ si @ GTC2022b | Serial | 3752 | ||
Permanent link to this record | |||||
Author | Mohamed Ali Souibgui; Y.Kessentini | ||||
Title | DE-GAN: A Conditional Generative Adversarial Network for Document Enhancement | Type | Journal Article | ||
Year | 2022 | Publication | IEEE Transactions on Pattern Analysis and Machine Intelligence | Abbreviated Journal | TPAMI |
Volume | 44 | Issue | 3 | Pages | 1180-1191 |
Keywords | |||||
Abstract | Documents often exhibit various forms of degradation, which make it hard to be read and substantially deteriorate the performance of an OCR system. In this paper, we propose an effective end-to-end framework named Document Enhancement Generative Adversarial Networks (DE-GAN) that uses the conditional GANs (cGANs) to restore severely degraded document images. To the best of our knowledge, this practice has not been studied within the context of generative adversarial deep networks. We demonstrate that, in different tasks (document clean up, binarization, deblurring and watermark removal), DE-GAN can produce an enhanced version of the degraded document with a high quality. In addition, our approach provides consistent improvements compared to state-of-the-art methods over the widely used DIBCO 2013, DIBCO 2017 and H-DIBCO 2018 datasets, proving its ability to restore a degraded document image to its ideal condition. The obtained results on a wide variety of degradation reveal the flexibility of the proposed model to be exploited in other document enhancement problems. | ||||
Address | 1 March 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 602.230; 600.121; 600.140 | Approved | no | ||
Call Number | Admin @ si @ SoK2022 | Serial | 3454 | ||
Permanent link to this record | |||||
Author | Hugo Jair Escalante; Heysem Kaya; Albert Ali Salah; Sergio Escalera; Yagmur Gucluturk; Umut Guçlu; Xavier Baro; Isabelle Guyon; Julio C. S. Jacques Junior; Meysam Madadi; Stephane Ayache; Evelyne Viegas; Furkan Gurpinar; Achmadnoer Sukma Wicaksana; Cynthia Liem; Marcel A. J. Van Gerven; Rob Van Lier | ||||
Title | Modeling, Recognizing, and Explaining Apparent Personality from Videos | Type | Journal Article | ||
Year | 2022 | Publication | IEEE Transactions on Affective Computing | Abbreviated Journal | TAC |
Volume | 13 | Issue | 2 | Pages | 894-911 |
Keywords | |||||
Abstract | Explainability and interpretability are two critical aspects of decision support systems. Despite their importance, it is only recently that researchers are starting to explore these aspects. This paper provides an introduction to explainability and interpretability in the context of apparent personality recognition. To the best of our knowledge, this is the first effort in this direction. We describe a challenge we organized on explainability in first impressions analysis from video. We analyze in detail the newly introduced data set, evaluation protocol, proposed solutions and summarize the results of the challenge. We investigate the issue of bias in detail. Finally, derived from our study, we outline research opportunities that we foresee will be relevant in this area in the near future. | ||||
Address | 1 April-June 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA; no menciona | Approved | no | ||
Call Number | Admin @ si @ EKS2022 | Serial | 3406 | ||
Permanent link to this record | |||||
Author | Jorge Charco; Angel Sappa; Boris X. Vintimilla | ||||
Title | Human Pose Estimation through a Novel Multi-view Scheme | Type | Conference Article | ||
Year | 2022 | Publication | 17th International Conference on Computer Vision Theory and Applications (VISAPP 2022) | Abbreviated Journal | |
Volume | 5 | Issue | Pages | 855-862 | |
Keywords | Multi-view Scheme; Human Pose Estimation; Relative Camera Pose; Monocular Approach | ||||
Abstract | This paper presents a multi-view scheme to tackle the challenging problem of the self-occlusion in human pose estimation problem. The proposed approach first obtains the human body joints of a set of images, which are captured from different views at the same time. Then, it enhances the obtained joints by using a
multi-view scheme. Basically, the joints from a given view are used to enhance poorly estimated joints from another view, especially intended to tackle the self occlusions cases. A network architecture initially proposed for the monocular case is adapted to be used in the proposed multi-view scheme. Experimental results and comparisons with the state-of-the-art approaches on Human3.6m dataset are presented showing improvements in the accuracy of body joints estimations. |
||||
Address | On line; Feb 6, 2022 – Feb 8, 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 2184-4321 | ISBN | 978-989-758-555-5 | Medium | |
Area | Expedition | Conference | VISAPP | ||
Notes | MSIAU; 600.160 | Approved | no | ||
Call Number | Admin @ si @ CSV2022 | Serial | 3689 | ||
Permanent link to this record | |||||
Author | Sergi Garcia Bordils; George Tom; Sangeeth Reddy; Minesh Mathew; Marçal Rusiñol; C.V. Jawahar; Dimosthenis Karatzas | ||||
Title | Read While You Drive-Multilingual Text Tracking on the Road | Type | Conference Article | ||
Year | 2022 | Publication | 15th IAPR International workshop on document analysis systems | Abbreviated Journal | |
Volume | 13237 | Issue | Pages | 756–770 | |
Keywords | |||||
Abstract | Visual data obtained during driving scenarios usually contain large amounts of text that conveys semantic information necessary to analyse the urban environment and is integral to the traffic control plan. Yet, research on autonomous driving or driver assistance systems typically ignores this information. To advance research in this direction, we present RoadText-3K, a large driving video dataset with fully annotated text. RoadText-3K is three times bigger than its predecessor and contains data from varied geographical locations, unconstrained driving conditions and multiple languages and scripts. We offer a comprehensive analysis of tracking by detection and detection by tracking methods exploring the limits of state-of-the-art text detection. Finally, we propose a new end-to-end trainable tracking model that yields state-of-the-art results on this challenging dataset. Our experiments demonstrate the complexity and variability of RoadText-3K and establish a new, realistic benchmark for scene text tracking in the wild. | ||||
Address | La Rochelle; France; May 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-3-031-06554-5 | Medium | ||
Area | Expedition | Conference | DAS | ||
Notes | DAG; 600.155; 611.022; 611.004 | Approved | no | ||
Call Number | Admin @ si @ GTR2022 | Serial | 3783 | ||
Permanent link to this record | |||||
Author | Bhalaji Nagarajan; Ricardo Marques; Marcos Mejia; Petia Radeva | ||||
Title | Class-conditional Importance Weighting for Deep Learning with Noisy Labels | Type | Conference Article | ||
Year | 2022 | Publication | 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications | Abbreviated Journal | |
Volume | 5 | Issue | Pages | 679-686 | |
Keywords | Noisy Labeling; Loss Correction; Class-conditional Importance Weighting; Learning with Noisy Labels | ||||
Abstract | Large-scale accurate labels are very important to the Deep Neural Networks to train them and assure high performance. However, it is very expensive to create a clean dataset since usually it relies on human interaction. To this purpose, the labelling process is made cheap with a trade-off of having noisy labels. Learning with Noisy Labels is an active area of research being at the same time very challenging. The recent advances in Self-supervised learning and robust loss functions have helped in advancing noisy label research. In this paper, we propose a loss correction method that relies on dynamic weights computed based on the model training. We extend the existing Contrast to Divide algorithm coupled with DivideMix using a new class-conditional weighted scheme. We validate the method using the standard noise experiments and achieved encouraging results. | ||||
Address | Virtual; February 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | VISAPP | ||
Notes | MILAB; no menciona | Approved | no | ||
Call Number | Admin @ si @ NMM2022 | Serial | 3798 | ||
Permanent link to this record | |||||
Author | Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla | ||||
Title | Multi-Image Super-Resolution for Thermal Images | Type | Conference Article | ||
Year | 2022 | Publication | 17th International Conference on Computer Vision Theory and Applications (VISAPP 2022) | Abbreviated Journal | |
Volume | 4 | Issue | Pages | 635-642 | |
Keywords | Thermal Images; Multi-view; Multi-frame; Super-Resolution; Deep Learning; Attention Block | ||||
Abstract | This paper proposes a novel CNN architecture for the multi-thermal image super-resolution problem. In the proposed scheme, the multi-images are synthetically generated by downsampling and slightly shifting the given image; noise is also added to each of these synthesized images. The proposed architecture uses two
attention blocks paths to extract high-frequency details taking advantage of the large information extracted from multiple images of the same scene. Experimental results are provided, showing the proposed scheme has overcome the state-of-the-art approaches. |
||||
Address | Online; Feb 6-8, 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | VISAPP | ||
Notes | MSIAU; 601.349 | Approved | no | ||
Call Number | Admin @ si @ RSV2022a | Serial | 3690 | ||
Permanent link to this record | |||||
Author | Razieh Rastgoo; Kourosh Kiani; Sergio Escalera | ||||
Title | Real-time Isolated Hand Sign Language RecognitioN Using Deep Networks and SVD | Type | Journal | ||
Year | 2022 | Publication | Journal of Ambient Intelligence and Humanized Computing | Abbreviated Journal | |
Volume | 13 | Issue | Pages | 591–611 | |
Keywords | |||||
Abstract | One of the challenges in computer vision models, especially sign language, is real-time recognition. In this work, we present a simple yet low-complex and efficient model, comprising single shot detector, 2D convolutional neural network, singular value decomposition (SVD), and long short term memory, to real-time isolated hand sign language recognition (IHSLR) from RGB video. We employ the SVD method as an efficient, compact, and discriminative feature extractor from the estimated 3D hand keypoints coordinators. Despite the previous works that employ the estimated 3D hand keypoints coordinates as raw features, we propose a novel and revolutionary way to apply the SVD to the estimated 3D hand keypoints coordinates to get more discriminative features. SVD method is also applied to the geometric relations between the consecutive segments of each finger in each hand and also the angles between these sections. We perform a detailed analysis of recognition time and accuracy. One of our contributions is that this is the first time that the SVD method is applied to the hand pose parameters. Results on four datasets, RKS-PERSIANSIGN (99.5±0.04), First-Person (91±0.06), ASVID (93±0.05), and isoGD (86.1±0.04), confirm the efficiency of our method in both accuracy (mean+std) and time recognition. Furthermore, our model outperforms or gets competitive results with the state-of-the-art alternatives in IHSLR and hand action recognition. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ RKE2022a | Serial | 3660 | ||
Permanent link to this record | |||||
Author | Adria Molina; Lluis Gomez; Oriol Ramos Terrades; Josep Llados | ||||
Title | A Generic Image Retrieval Method for Date Estimation of Historical Document Collections | Type | Conference Article | ||
Year | 2022 | Publication | Document Analysis Systems.15th IAPR International Workshop, (DAS2022) | Abbreviated Journal | |
Volume | 13237 | Issue | Pages | 583–597 | |
Keywords | Date estimation; Document retrieval; Image retrieval; Ranking loss; Smooth-nDCG | ||||
Abstract | Date estimation of historical document images is a challenging problem, with several contributions in the literature that lack of the ability to generalize from one dataset to others. This paper presents a robust date estimation system based in a retrieval approach that generalizes well in front of heterogeneous collections. We use a ranking loss function named smooth-nDCG to train a Convolutional Neural Network that learns an ordination of documents for each problem. One of the main usages of the presented approach is as a tool for historical contextual retrieval. It means that scholars could perform comparative analysis of historical images from big datasets in terms of the period where they were produced. We provide experimental evaluation on different types of documents from real datasets of manuscript and newspaper images. | ||||
Address | La Rochelle, France; May 22–25, 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | DAS | ||
Notes | DAG; 600.140; 600.121 | Approved | no | ||
Call Number | Admin @ si @ MGR2022 | Serial | 3694 | ||
Permanent link to this record | |||||
Author | Smriti Joshi; Richard Osuala; Carlos Martin-Isla; Victor M.Campello; Carla Sendra-Balcells; Karim Lekadir; Sergio Escalera | ||||
Title | nn-UNet Training on CycleGAN-Translated Images for Cross-modal Domain Adaptation in Biomedical Imaging | Type | Conference Article | ||
Year | 2022 | Publication | International MICCAI Brainlesion Workshop | Abbreviated Journal | |
Volume | 12963 | Issue | Pages | 540–551 | |
Keywords | Domain adaptation; Vestibular schwannoma (VS); Deep learning; nn-UNet; CycleGAN | ||||
Abstract | In recent years, deep learning models have considerably advanced the performance of segmentation tasks on Brain Magnetic Resonance Imaging (MRI). However, these models show a considerable performance drop when they are evaluated on unseen data from a different distribution. Since annotation is often a hard and costly task requiring expert supervision, it is necessary to develop ways in which existing models can be adapted to the unseen domains without any additional labelled information. In this work, we explore one such technique which extends the CycleGAN [2] architecture to generate label-preserving data in the target domain. The synthetic target domain data is used to train the nn-UNet [3] framework for the task of multi-label segmentation. The experiments are conducted and evaluated on the dataset [1] provided in the ‘Cross-Modality Domain Adaptation for Medical Image Segmentation’ challenge [23] for segmentation of vestibular schwannoma (VS) tumour and cochlea on contrast enhanced (ceT1) and high resolution (hrT2) MRI scans. In the proposed approach, our model obtains dice scores (DSC) 0.73 and 0.49 for tumour and cochlea respectively on the validation set of the dataset. This indicates the applicability of the proposed technique to real-world problems where data may be obtained by different acquisition protocols as in [1] where hrT2 images are more reliable, safer, and lower-cost alternative to ceT1. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | MICCAIW | ||
Notes | HUPBA; no menciona | Approved | no | ||
Call Number | Admin @ si @ JOM2022 | Serial | 3800 | ||
Permanent link to this record |