toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author Bonifaz Stuhr edit  isbn
openurl 
  Title Towards Unsupervised Representation Learning: Learning, Evaluating and Transferring Visual Representations Type Book Whole
  Year 2023 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (down) Unsupervised representation learning aims at finding methods that learn representations from data without annotation-based signals. Abstaining from annotations not only leads to economic benefits but may – and to some extent already does – result in advantages regarding the representation’s structure, robustness, and generalizability to different tasks. In the long run, unsupervised methods are expected to surpass their supervised counterparts due to the reduction of human intervention and the inherently more general setup that does not bias the optimization towards an objective originating from specific annotation-based signals. While major advantages of unsupervised representation learning have been recently observed in natural language processing, supervised methods still dominate in vision domains for most tasks. In this dissertation, we contribute to the field of unsupervised (visual) representation learning from three perspectives: (i) Learning representations: We design unsupervised, backpropagation-free Convolutional Self-Organizing Neural Networks (CSNNs) that utilize self-organization- and Hebbian-based learning rules to learn convolutional kernels and masks to achieve deeper backpropagation-free models. Thereby, we observe that backpropagation-based and -free methods can suffer from an objective function mismatch between the unsupervised pretext task and the target task. This mismatch can lead to performance decreases for the target task. (ii) Evaluating representations: We build upon the widely used (non-)linear evaluation protocol to define pretext- and target-objective-independent metrics for measuring the objective function mismatch. With these metrics, we evaluate various pretext and target tasks and disclose dependencies of the objective function mismatch concerning different parts of the training and model setup. (iii) Transferring representations: We contribute CARLANE, the first 3-way sim-to-real domain adaptation benchmark for 2D lane detection. We adopt several well-known unsupervised domain adaptation methods as baselines and propose a method based on prototypical cross-domain self-supervised learning. Finally, we focus on pixel-based unsupervised domain adaptation and contribute a content-consistent unpaired image-to-image translation method that utilizes masks, global and local discriminators, and similarity sampling to mitigate content inconsistencies, as well as feature-attentive denormalization to fuse content-based statistics into the generator stream. In addition, we propose the cKVD metric to incorporate class-specific content inconsistencies into perceptual metrics for measuring translation quality.  
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher IMPRIA Place of Publication Editor Jordi Gonzalez;Jurgen Brauer  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-126409-6-0 Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number Admin @ si @ Stu2023 Serial 3966  
Permanent link to this record
 

 
Author Eduardo Aguilar; Bogdan Raducanu; Petia Radeva; Joost Van de Weijer edit  url
openurl 
  Title Continual Evidential Deep Learning for Out-of-Distribution Detection Type Conference Article
  Year 2023 Publication Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops Abbreviated Journal  
  Volume Issue Pages 3444-3454  
  Keywords  
  Abstract (down) Uncertainty-based deep learning models have attracted a great deal of interest for their ability to provide accurate and reliable predictions. Evidential deep learning stands out achieving remarkable performance in detecting out-ofdistribution (OOD) data with a single deterministic neural network. Motivated by this fact, in this paper we propose the integration of an evidential deep learning method into a continual learning framework in order to perform simultaneously incremental object classification and OOD detection. Moreover, we analyze the ability of vacuity and dissonance to differentiate between in-distribution data belonging to old classes and OOD data. The proposed method 1, called CEDL, is evaluated on CIFAR-100 considering two settings consisting of 5 and 10 tasks, respectively. From the obtained results, we could appreciate that the proposed method, in addition to provide comparable results in object classification with respect to the baseline, largely outperforms OOD detection compared to several posthoc methods on three evaluation metrics: AUROC, AUPR and FPR95.  
  Address Paris; France; October 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICCVW  
  Notes LAMP; MILAB Approved no  
  Call Number Admin @ si @ ARR2023 Serial 3974  
Permanent link to this record
 

 
Author Eduardo Aguilar; Bogdan Raducanu; Petia Radeva; Joost Van de Weijer edit   pdf
url  doi
openurl 
  Title Continual Evidential Deep Learning for Out-of-Distribution Detection Type Conference Article
  Year 2023 Publication IEEE/CVF International Conference on Computer Vision (ICCV) Workshops -Visual Continual Learning workshop Abbreviated Journal  
  Volume Issue Pages 3444-3454  
  Keywords  
  Abstract (down) Uncertainty-based deep learning models have attracted a great deal of interest for their ability to provide accurate and reliable predictions. Evidential deep learning stands out achieving remarkable performance in detecting out-of-distribution (OOD) data with a single deterministic neural network. Motivated by this fact, in this paper we propose the integration of an evidential deep learning method into a continual learning framework in order to perform simultaneously incremental object classification and OOD detection. Moreover, we analyze the ability of vacuity and dissonance to differentiate between in-distribution data belonging to old classes and OOD data. The proposed method, called CEDL, is evaluated on CIFAR-100 considering two settings consisting of 5 and 10 tasks, respectively. From the obtained results, we could appreciate that the proposed method, in addition to provide comparable results in object classification with respect to the baseline, largely outperforms OOD detection compared to several posthoc methods on three evaluation metrics: AUROC, AUPR and FPR95.  
  Address Paris; France; October 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICCVW  
  Notes LAMP; MILAB Approved no  
  Call Number Admin @ si @ ARR2023 Serial 3841  
Permanent link to this record
 

 
Author Javier Selva; Anders S. Johansen; Sergio Escalera; Kamal Nasrollahi; Thomas B. Moeslund; Albert Clapes edit  doi
openurl 
  Title Video transformers: A survey Type Journal Article
  Year 2023 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI  
  Volume 45 Issue 11 Pages 12922-12943  
  Keywords Artificial Intelligence; Computer Vision; Self-Attention; Transformers; Video Representations  
  Abstract (down) Transformer models have shown great success handling long-range interactions, making them a promising tool for modeling video. However, they lack inductive biases and scale quadratically with input length. These limitations are further exacerbated when dealing with the high dimensionality introduced by the temporal dimension. While there are surveys analyzing the advances of Transformers for vision, none focus on an in-depth analysis of video-specific designs. In this survey, we analyze the main contributions and trends of works leveraging Transformers to model video. Specifically, we delve into how videos are handled at the input level first. Then, we study the architectural changes made to deal with video more efficiently, reduce redundancy, re-introduce useful inductive biases, and capture long-term temporal dynamics. In addition, we provide an overview of different training regimes and explore effective self-supervised learning strategies for video. Finally, we conduct a performance comparison on the most common benchmark for Video Transformers (i.e., action classification), finding them to outperform 3D ConvNets even with less computational complexity.  
  Address 1 Nov. 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA; no menciona Approved no  
  Call Number Admin @ si @ SJE2023 Serial 3823  
Permanent link to this record
 

 
Author Shiqi Yang edit  isbn
openurl 
  Title Towards Source-Free Domain Adaption of Neural Networks in an Open World Type Book Whole
  Year 2023 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (down) Though they achieve great success, deep neural networks typically require a huge
amount of labeled data for training. However, collecting labeled data is often laborious and expensive. It would, therefore, be ideal if the knowledge obtained from label-rich datasets could be transferred to unlabeled data. However, deep networks are weak at generalizing to unseen domains, even when the differences are only subtle between the datasets. In real-world situations, a typical factor impairing the model generalization ability is the distribution shift between data from different domains, which is a long-standing problem usually termed as (unsupervised) domain adaptation.
A crucial requirement in the methodology of these domain adaptation methods is that they require access to source domain data during the adaptation process to the target domain. Accessibility to the source data of a trained source model is often impossible in real-world applications, for example, when deploying domain adaptation algorithms on mobile devices where the computational capacity is limited or in situations where data privacy rules limit access to the source domain data. Without access to the source domain data, existing methods suffer from inferior performance. Thus, in this thesis, we investigate domain adaptation without source data (termed as source-free domain adaptation) in multiple different scenarios that focus on image classification tasks.
We first study the source-free domain adaptation problem in a closed-set setting,
where the label space of different domains is identical. Only accessing the pretrained source model, we propose to address source-free domain adaptation from the perspective of unsupervised clustering. We achieve this based on nearest neighborhood clustering. In this way, we can transfer the challenging source-free domain adaptation task to a type of clustering problem. The final optimization objective is an upper bound containing only two simple terms, which can be explained as discriminability and diversity. We show that this allows us to relate several other methods in domain adaptation, unsupervised clustering and contrastive learning via the perspective of discriminability and diversity.
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher IMPRIMA Place of Publication Editor Joost  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-126409-3-9 Medium  
  Area Expedition Conference  
  Notes LAMP Approved no  
  Call Number Admin @ si @ Yan2023 Serial 3963  
Permanent link to this record
 

 
Author Rafael E. Rivadeneira; Henry Velesaca; Angel Sappa edit  url
doi  openurl
  Title Object Detection in Very Low-Resolution Thermal Images through a Guided-Based Super-Resolution Approach Type Conference Article
  Year 2023 Publication 17th International Conference on Signal-Image Technology & Internet-Based Systems Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (down) This work proposes a novel approach that integrates super-resolution techniques with off-the-shelf object detection methods to tackle the problem of handling very low-resolution thermal images. The suggested approach begins by enhancing the low-resolution (LR) thermal images through a guided super-resolution strategy, leveraging a high-resolution (HR) visible spectrum image. Subsequently, object detection is performed on the high-resolution thermal image. The experimental results demonstrate tremendous improvements in comparison with both scenarios: when object detection is performed on the LR thermal image alone, as well as when object detection is conducted on the up-sampled LR thermal image. Moreover, the proposed approach proves highly valuable in camouflaged scenarios where objects might remain undetected in visible spectrum images.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference SITIS  
  Notes MSIAU Approved no  
  Call Number Admin @ si @ RVS2023 Serial 4010  
Permanent link to this record
 

 
Author Pau Cano; Alvaro Caravaca; Debora Gil; Eva Musulen edit   pdf
url  openurl
  Title Diagnosis of Helicobacter pylori using AutoEncoders for the Detection of Anomalous Staining Patterns in Immunohistochemistry Images Type Miscellaneous
  Year 2023 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages 107241  
  Keywords  
  Abstract (down) This work addresses the detection of Helicobacter pylori a bacterium classified since 1994 as class 1 carcinogen to humans. By its highest specificity and sensitivity, the preferred diagnosis technique is the analysis of histological images with immunohistochemical staining, a process in which certain stained antibodies bind to antigens of the biological element of interest. This analysis is a time demanding task, which is currently done by an expert pathologist that visually inspects the digitized samples.
We propose to use autoencoders to learn latent patterns of healthy tissue and detect H. pylori as an anomaly in image staining. Unlike existing classification approaches, an autoencoder is able to learn patterns in an unsupervised manner (without the need of image annotations) with high performance. In particular, our model has an overall 91% of accuracy with 86\% sensitivity, 96% specificity and 0.97 AUC in the detection of H. pylori.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes IAM Approved no  
  Call Number Admin @ si @ CCG2023 Serial 3855  
Permanent link to this record
 

 
Author Diego Velazquez edit  isbn
openurl 
  Title Towards Robustness in Computer-based Image Understanding Type Book Whole
  Year 2023 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (down) This thesis embarks on an exploratory journey into robustness in deep learning,
with a keen focus on the intertwining facets of generalization, explainability, and
edge cases within the realm of computer vision. In deep learning, robustness
epitomizes a model’s resilience and flexibility, grounded on its capacity to generalize across diverse data distributions, explain its predictions transparently, and navigate the intricacies of edge cases effectively. The challenges associated with robust generalization are multifaceted, encompassing the model’s performance on unseen data and its defense against out-of-distribution data and adversarial attacks. Bridging this gap, the potential of Embedding Propagation (EP) for improving out-of-distribution generalization is explored. EP is depicted as a powerful tool facilitating manifold smoothing, which in turn fortifies the model’s robustness against adversarial onslaughts and bolsters performance in few-shot and self-/semi-supervised learning scenarios. In the labyrinth of deep learning models, the path to robustness often intersects with explainability. As model complexity increases, so does the urgency to decipher their decision-making
processes. Acknowledging this, the thesis introduces a robust framework for
evaluating and comparing various counterfactual explanation methods, echoing
the imperative of explanation quality over quantity and spotlighting the intricacies of diversifying explanations. Simultaneously, the deep learning landscape is fraught with edge cases – anomalies in the form of small objects or rare instances in object detection tasks that defy the norm. Confronting this, the
thesis presents an extension of the DETR (DEtection TRansformer) model to enhance small object detection. The devised DETR-FP, embedding the Feature Pyramid technique, demonstrating improvement in small objects detection accuracy, albeit facing challenges like high computational costs. With emergence of foundation models in mind, the thesis unveils EarthView, the largest scale remote sensing dataset to date, built for the self-supervised learning of a robust foundational model for remote sensing. Collectively, these studies contribute to the grand narrative of robustness in deep learning, weaving together the strands of generalization, explainability, and edge case performance. Through these methodological advancements and novel datasets, the thesis calls for continued exploration, innovation, and refinement to fortify the bastion of robust computer vision.
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher IMPRIMA Place of Publication Editor Jordi Gonzalez;Josep M. Gonfaus;Pau Rodriguez  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-81-126409-5-3 Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number Admin @ si @ Vel2023 Serial 3965  
Permanent link to this record
 

 
Author Guillermo Torres; Jan Rodríguez Dueñas; Sonia Baeza; Antoni Rosell; Carles Sanchez; Debora Gil edit   pdf
url  openurl
  Title Prediction of Malignancy in Lung Cancer using several strategies for the fusion of Multi-Channel Pyradiomics Images Type Conference Article
  Year 2023 Publication 7th Workshop on Digital Image Processing for Medical and Automotive Industry in the framework of SYNASC 2023 Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (down) This study shows the generation process and the subsequent study of the representation space obtained by extracting GLCM texture features from computer-aided tomography (CT) scans of pulmonary nodules (PN). For this, data from 92 patients from the Germans Trias i Pujol University Hospital were used. The workflow focuses on feature extraction using Pyradiomics and the VGG16 Convolutional Neural Network (CNN). The aim of the study is to assess whether the data obtained have a positive impact on the diagnosis of lung cancer (LC). To design a machine learning (ML) model training method that allows generalization, we train SVM and neural network (NN) models, evaluating diagnosis performance using metrics defined at slice and nodule level.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference DIPMAI  
  Notes IAM Approved no  
  Call Number Admin @ si @ TRB2023 Serial 3926  
Permanent link to this record
 

 
Author Cristhian A. Aguilera-Carrasco; Luis Felipe Gonzalez-Böhme; Francisco Valdes; Francisco Javier Quitral Zapata; Bogdan Raducanu edit  doi
openurl 
  Title A Hand-Drawn Language for Human–Robot Collaboration in Wood Stereotomy Type Journal Article
  Year 2023 Publication IEEE Access Abbreviated Journal ACCESS  
  Volume 11 Issue Pages 100975 - 100985  
  Keywords  
  Abstract (down) This study introduces a novel, hand-drawn language designed to foster human-robot collaboration in wood stereotomy, central to carpentry and joinery professions. Based on skilled carpenters’ line and symbol etchings on timber, this language signifies the location, geometry of woodworking joints, and timber placement within a framework. A proof-of-concept prototype has been developed, integrating object detectors, keypoint regression, and traditional computer vision techniques to interpret this language and enable an extensive repertoire of actions. Empirical data attests to the language’s efficacy, with the successful identification of a specific set of symbols on various wood species’ sawn surfaces, achieving a mean average precision (mAP) exceeding 90%. Concurrently, the system can accurately pinpoint critical positions that facilitate robotic comprehension of carpenter-indicated woodworking joint geometry. The positioning error, approximately 3 pixels, meets industry standards.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP Approved no  
  Call Number Admin @ si @ AGV2023 Serial 3969  
Permanent link to this record
 

 
Author Spencer Low; Oliver Nina; Angel Sappa; Erik Blasch; Nathan Inkawhich edit  url
doi  openurl
  Title Multi-Modal Aerial View Image Challenge: Translation From Synthetic Aperture Radar to Electro-Optical Domain Results-PBVS 2023 Type Conference Article
  Year 2023 Publication Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops Abbreviated Journal  
  Volume Issue Pages 515-523  
  Keywords  
  Abstract (down) This paper unveils the discoveries and outcomes of the inaugural iteration of the Multi-modal Aerial View Image Challenge (MAVIC) aimed at image translation. The primary objective of this competition is to stimulate research efforts towards the development of models capable of translating co-aligned images between multiple modalities. To accomplish the task of image translation, the competition utilizes images obtained from both synthetic aperture radar (SAR) and electro-optical (EO) sources. Specifically, the challenge centers on the translation from the SAR modality to the EO modality, an area of research that has garnered attention. The inaugural challenge demonstrates the feasibility of the task. The dataset utilized in this challenge is derived from the UNIfied COincident Optical and Radar for recognitioN (UNICORN) dataset. We introduce an new version of the UNICORN dataset that is focused on enabling the sensor translation task. Performance evaluation is conducted using a combination of measures to ensure high fidelity and high accuracy translations.  
  Address Vancouver; Canada; June 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPRW  
  Notes MSIAU Approved no  
  Call Number Admin @ si @ LNS2023a Serial 3913  
Permanent link to this record
 

 
Author Yawei Li; Yulun Zhang; Radu Timofte; Luc Van Gool; Zhijun Tu; Kunpeng Du; Hailing Wang; Hanting Chen; Wei Li; Xiaofei Wang; Jie Hu; Yunhe Wang; Xiangyu Kong; Jinlong Wu; Dafeng Zhang; Jianxing Zhang; Shuai Liu; Furui Bai; Chaoyu Feng; Hao Wang; Yuqian Zhang; Guangqi Shao; Xiaotao Wang; Lei Lei; Rongjian Xu; Zhilu Zhang; Yunjin Chen; Dongwei Ren; Wangmeng Zuo; Qi Wu; Mingyan Han; Shen Cheng; Haipeng Li; Ting Jiang; Chengzhi Jiang; Xinpeng Li; Jinting Luo; Wenjie Lin; Lei Yu; Haoqiang Fan; Shuaicheng Liu; Aditya Arora; Syed Waqas Zamir; Javier Vazquez; Konstantinos G. Derpanis; Michael S. Brown; Hao Li; Zhihao Zhao; Jinshan Pan; Jiangxin Dong; Jinhui Tang; Bo Yang; Jingxiang Chen; Chenghua Li; Xi Zhang; Zhao Zhang; Jiahuan Ren; Zhicheng Ji; Kang Miao; Suiyi Zhao; Huan Zheng; YanYan Wei; Kangliang Liu; Xiangcheng Du; Sijie Liu; Yingbin Zheng; Xingjiao Wu; Cheng Jin; Rajeev Irny; Sriharsha Koundinya; Vighnesh Kamath; Gaurav Khandelwal; Sunder Ali Khowaja; Jiseok Yoon; Ik Hyun Lee; Shijie Chen; Chengqiang Zhao; Huabin Yang; Zhongjian Zhang; Junjia Huang; Yanru Zhang edit  url
doi  openurl
  Title NTIRE 2023 challenge on image denoising: Methods and results Type Conference Article
  Year 2023 Publication Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops Abbreviated Journal  
  Volume Issue Pages 1904-1920  
  Keywords  
  Abstract (down) This paper reviews the NTIRE 2023 challenge on image denoising (σ = 50) with a focus on the proposed solutions and results. The aim is to obtain a network design capable to produce high-quality results with the best performance measured by PSNR for image denoising. Independent additive white Gaussian noise (AWGN) is assumed and the noise level is 50. The challenge had 225 registered participants, and 16 teams made valid submissions. They gauge the state-of-the-art for image denoising.  
  Address Vancouver; Canada; June 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPRW  
  Notes MACO; CIC Approved no  
  Call Number Admin @ si @ LZT2023 Serial 3910  
Permanent link to this record
 

 
Author Stepan Simsa; Michal Uricar; Milan Sulc; Yash Patel; Ahmed Hamdi; Matej Kocian; Matyas Skalicky; Jiri Matas; Antoine Doucet; Mickael Coustaty; Dimosthenis Karatzas edit  url
doi  openurl
  Title Overview of DocILE 2023: Document Information Localization and Extraction Type Conference Article
  Year 2023 Publication International Conference of the Cross-Language Evaluation Forum for European Languages Abbreviated Journal  
  Volume 14163 Issue Pages 276–293  
  Keywords Information Extraction; Computer Vision; Natural Language Processing; Optical Character Recognition; Document Understanding  
  Abstract (down) This paper provides an overview of the DocILE 2023 Competition, its tasks, participant submissions, the competition results and possible future research directions. This first edition of the competition focused on two Information Extraction tasks, Key Information Localization and Extraction (KILE) and Line Item Recognition (LIR). Both of these tasks require detection of pre-defined categories of information in business documents. The second task additionally requires correctly grouping the information into tuples, capturing the structure laid out in the document. The competition used the recently published DocILE dataset and benchmark that stays open to new submissions. The diversity of the participant solutions indicates the potential of the dataset as the submissions included pure Computer Vision, pure Natural Language Processing, as well as multi-modal solutions and utilized all of the parts of the dataset, including the annotated, synthetic and unlabeled subsets.  
  Address Thessaloniki; Greece; September 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CLEF  
  Notes DAG Approved no  
  Call Number Admin @ si @ SUS2023a Serial 3924  
Permanent link to this record
 

 
Author Patricia Suarez; Angel Sappa edit  openurl
  Title Toward a Thermal Image-Like Representation Type Conference Article
  Year 2023 Publication Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications Abbreviated Journal  
  Volume Issue Pages 133-140  
  Keywords  
  Abstract (down) This paper proposes a novel model to obtain thermal image-like representations to be used as an input in any thermal image compressive sensing approach (e.g., thermal image: filtering, enhancing, super-resolution). Thermal images offer interesting information about the objects in the scene, in addition to their temperature. Unfortunately, in most of the cases thermal cameras acquire low resolution/quality images. Hence, in order to improve these images, there are several state-of-the-art approaches that exploit complementary information from a low-cost channel (visible image) to increase the image quality of an expensive channel (infrared image). In these SOTA approaches visible images are fused at different levels without paying attention the images acquire information at different bands of the spectral. In this paper a novel approach is proposed to generate thermal image-like representations from a low cost visible images, by means of a contrastive cycled GAN network. Obtained representations (synthetic thermal image) can be later on used to improve the low quality thermal image of the same scene. Experimental results on different datasets are presented.  
  Address Lisboa; Portugal; February 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference VISIGRAPP  
  Notes MSIAU Approved no  
  Call Number Admin @ si @ SuS2023b Serial 3927  
Permanent link to this record
 

 
Author Iban Berganzo-Besga; Hector A. Orengo; Felipe Lumbreras; Aftab Alam; Rosie Campbell; Petrus J Gerrits; Jonas Gregorio de Souza; Afifa Khan; Maria Suarez Moreno; Jack Tomaney; Rebecca C Roberts; Cameron A Petrie edit  url
doi  openurl
  Title Curriculum learning-based strategy for low-density archaeological mound detection from historical maps in India and Pakistan Type Journal Article
  Year 2023 Publication Scientific Reports Abbreviated Journal ScR  
  Volume 13 Issue Pages 11257  
  Keywords  
  Abstract (down) This paper presents two algorithms for the large-scale automatic detection and instance segmentation of potential archaeological mounds on historical maps. Historical maps present a unique source of information for the reconstruction of ancient landscapes. The last 100 years have seen unprecedented landscape modifications with the introduction and large-scale implementation of mechanised agriculture, channel-based irrigation schemes, and urban expansion to name but a few. Historical maps offer a window onto disappearing landscapes where many historical and archaeological elements that no longer exist today are depicted. The algorithms focus on the detection and shape extraction of mound features with high probability of being archaeological settlements, mounds being one of the most commonly documented archaeological features to be found in the Survey of India historical map series, although not necessarily recognised as such at the time of surveying. Mound features with high archaeological potential are most commonly depicted through hachures or contour-equivalent form-lines, therefore, an algorithm has been designed to detect each of those features. Our proposed approach addresses two of the most common issues in archaeological automated survey, the low-density of archaeological features to be detected, and the small amount of training data available. It has been applied to all types of maps available of the historic 1″ to 1-mile series, thus increasing the complexity of the detection. Moreover, the inclusion of synthetic data, along with a Curriculum Learning strategy, has allowed the algorithm to better understand what the mound features look like. Likewise, a series of filters based on topographic setting, form, and size have been applied to improve the accuracy of the models. The resulting algorithms have a recall value of 52.61% and a precision of 82.31% for the hachure mounds, and a recall value of 70.80% and a precision of 70.29% for the form-line mounds, which allowed the detection of nearly 6000 mound features over an area of 470,500 km2, the largest such approach to have ever been applied. If we restrict our focus to the maps most similar to those used in the algorithm training, we reach recall values greater than 60% and precision values greater than 90%. This approach has shown the potential to implement an adaptive algorithm that allows, after a small amount of retraining with data detected from a new map, a better general mound feature detection in the same map.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MSIAU Approved no  
  Call Number Admin @ si @ BOL2023 Serial 3976  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: