Records | |||||
---|---|---|---|---|---|
Author | Gisel Bastidas-Guacho; Patricio Moreno; Boris X. Vintimilla; Angel Sappa | ||||
Title | Application on the Loop of Multimodal Image Fusion: Trends on Deep-Learning Based Approaches | Type | Conference Article | ||
Year | 2023 | Publication | 13th International Conference on Pattern Recognition Systems | Abbreviated Journal | |
Volume | 14234 | Issue | Pages | 25–36 | |
Keywords | |||||
Abstract | Multimodal image fusion allows the combination of information from different modalities, which is useful for tasks such as object detection, edge detection, and tracking, to name a few. Using the fused representation for applications results in better task performance. There are several image fusion approaches, which have been summarized in surveys. However, the existing surveys focus on image fusion approaches where the application on the loop of multimodal image fusion is not considered. On the contrary, this study summarizes deep learning-based multimodal image fusion for computer vision (e.g., object detection) and image processing applications (e.g., semantic segmentation), that is, approaches where the application module leverages the multimodal fusion process to enhance the final result. Firstly, we introduce image fusion and the existing general frameworks for image fusion tasks such as multifocus, multiexposure and multimodal. Then, we describe the multimodal image fusion approaches. Next, we review the state-of-the-art deep learning multimodal image fusion approaches for vision applications. Finally, we conclude our survey with the trends of task-driven multimodal image fusion. | ||||
Address | Guayaquil; Ecuador; July 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICPRS | ||
Notes | MSIAU | Approved | no | ||
Call Number | Admin @ si @ BMV2023 | Serial | 3932 | ||
Permanent link to this record | |||||
Author | Sonia Baeza; Debora Gil; Carles Sanchez; Guillermo Torres; Ignasi Garcia Olive; Ignasi Guasch; Samuel Garcia Reina; Felipe Andreo; Jose Luis Mate; Jose Luis Vercher; Antonio Rosell | ||||
Title | Biopsia virtual radiomica para el diagnóstico histológico de nódulos pulmonares – Resultados intermedios del proyecto Radiolung | Type | Conference Article | ||
Year | 2023 | Publication | SEPAR | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Poster | ||||
Address | Granada; Spain; June 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | SEPAR | ||
Notes | IAM | Approved | no | ||
Call Number | Admin @ si @ BGS2023 | Serial | 3951 | ||
Permanent link to this record | |||||
Author | Asma Bensalah; Antonio Parziale; Giuseppe De Gregorio; Angelo Marcelli; Alicia Fornes; Josep Llados | ||||
Title | I Can’t Believe It’s Not Better: In-air Movement for Alzheimer Handwriting Synthetic Generation | Type | Conference Article | ||
Year | 2023 | Publication | 21st International Graphonomics Conference | Abbreviated Journal | |
Volume | Issue | Pages | 136–148 | ||
Keywords | |||||
Abstract | During recent years, there has been a boom in the use of deep learning for handwriting analysis and recognition. One main application of handwriting analysis is early detection and diagnosis in the health field. Unfortunately, most real-world problems still suffer from a scarcity of data, which makes the use of deep learning-based models difficult. To alleviate this problem, some works resort to synthetic data generation. Lately, more works are directed towards guided synthetic data generation, which uses domain and data knowledge to generate realistic data that can be useful for training deep learning models. In this work, we combine domain knowledge about Alzheimer’s disease in handwriting and use it for a more guided data generation. Concretely, we have explored the use of in-air movements for synthetic data generation. | ||||
Address | Evora; Portugal; October 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | IGS | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ BPG2023 | Serial | 3838 | ||
Permanent link to this record | |||||
Author | Subhajit Maity; Sanket Biswas; Siladittya Manna; Ayan Banerjee; Josep Llados; Saumik Bhattacharya; Umapada Pal | ||||
Title | SelfDocSeg: A Self-Supervised vision-based Approach towards Document Segmentation | Type | Conference Article | ||
Year | 2023 | Publication | 17th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | 14187 | Issue | Pages | 342–360 | |
Keywords | |||||
Abstract | Document layout analysis is a well-known problem in the document research community and has been widely explored, yielding a multitude of solutions ranging from text mining and recognition to graph-based representation, visual feature extraction, etc. However, most of the existing works have ignored the crucial fact regarding the scarcity of labeled data. With growing internet connectivity in personal life, an enormous number of documents have become available in the public domain, making data annotation a tedious task. We address this challenge using self-supervision and, unlike the few existing self-supervised document segmentation approaches which use text mining and textual labels, we use a complete vision-based approach in pre-training without any ground-truth label or its derivative. Instead, we generate pseudo-layouts from the document images to pre-train an image encoder to learn the document object representation and localization in a self-supervised framework before fine-tuning it with an object detection model. We show that our pipeline sets a new benchmark in this context and performs on par with existing methods and their supervised counterparts, if not better. The code is made publicly available. | ||||
Address | Document Layout Analysis; Document | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ MBM2023 | Serial | 3990 | ||
Permanent link to this record | |||||
Author | Yi Xiao; Felipe Codevilla; Diego Porres; Antonio Lopez | ||||
Title | Scaling Vision-Based End-to-End Autonomous Driving with Multi-View Attention Learning | Type | Conference Article | ||
Year | 2023 | Publication | International Conference on Intelligent Robots and Systems | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | In end-to-end driving, human driving demonstrations are used to train perception-based driving models by imitation learning. This process is supervised on vehicle signals (e.g., steering angle, acceleration) but does not require extra costly supervision (human labeling of sensor data). As a representative of such vision-based end-to-end driving models, CILRS is commonly used as a baseline to compare with new driving models. So far, some recent models achieve better performance than CILRS by using expensive sensor suites and/or by using large amounts of human-labeled data for training. Given the difference in performance, one may think that it is not worth pursuing vision-based pure end-to-end driving. However, we argue that this approach still has great value and potential considering cost and maintenance. In this paper, we present CIL++, which improves on CILRS by both processing higher-resolution images using a human-inspired HFOV as an inductive bias and incorporating a proper attention mechanism. CIL++ achieves competitive performance compared to models which are more costly to develop. We propose to replace CILRS with CIL++ as a strong vision-based pure end-to-end driving baseline supervised by only vehicle signals and trained by conditional imitation learning. | ||||
Address | Detroit; USA; October 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | IROS | ||
Notes | ADAS | Approved | no | ||
Call Number | Admin @ si @ XCP2023 | Serial | 3930 | ||
Permanent link to this record | |||||
Author | Akshita Gupta; Sanath Narayan; Salman Khan; Fahad Shahbaz Khan; Ling Shao; Joost Van de Weijer | ||||
Title | Generative Multi-Label Zero-Shot Learning | Type | Journal Article | ||
Year | 2023 | Publication | IEEE Transactions on Pattern Analysis and Machine Intelligence | Abbreviated Journal | TPAMI |
Volume | 45 | Issue | 12 | Pages | 14611-14624 |
Keywords | Generalized zero-shot learning; Multi-label classification; Zero-shot object detection; Feature synthesis | ||||
Abstract | Multi-label zero-shot learning strives to classify images into multiple unseen categories for which no data is available during training. The test samples can additionally contain seen categories in the generalized variant. Existing approaches rely on learning either shared or label-specific attention from the seen classes. Nevertheless, computing reliable attention maps for unseen classes during inference in a multi-label setting is still a challenge. In contrast, state-of-the-art single-label generative adversarial network (GAN) based approaches learn to directly synthesize the class-specific visual features from the corresponding class attribute embeddings. However, synthesizing multi-label features from GANs is still unexplored in the context of zero-shot setting. When multiple objects occur jointly in a single image, a critical question is how to effectively fuse multi-class information. In this work, we introduce different fusion approaches at the attribute-level, feature-level and cross-level (across attribute and feature-levels) for synthesizing multi-label features from their corresponding multi-label class embeddings. To the best of our knowledge, our work is the first to tackle the problem of multi-label feature synthesis in the (generalized) zero-shot setting. Our cross-level fusion-based generative approach outperforms the state-of-the-art on three zero-shot benchmarks: NUS-WIDE, Open Images and MS COCO. Furthermore, we show the generalization capabilities of our fusion approach in the zero-shot detection task on MS COCO, achieving favorable performance against existing methods. | ||||
Address | December 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP; PID2021-128178OB-I00 | Approved | no | ||
Call Number | Admin @ si @ | Serial | 3853 | ||
Permanent link to this record | |||||
Author | Debora Gil; Guillermo Torres; Carles Sanchez | ||||
Title | Transforming radiomic features into radiological words | Type | Conference Article | ||
Year | 2023 | Publication | IEEE International Symposium on Biomedical Imaging | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Poster | ||||
Address | Cartagena de Indias; Colombia; April 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ISBI | ||
Notes | IAM | Approved | no | ||
Call Number | Admin @ si @ GTS2023 | Serial | 3952 | ||
Permanent link to this record | |||||
Author | Pau Cano; Debora Gil; Eva Musulen | ||||
Title | Towards automatic detection of helicobacter pylori in histological samples of gastric tissue | Type | Conference Article | ||
Year | 2023 | Publication | IEEE International Symposium on Biomedical Imaging | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | Cartagena de Indias; Colombia; April 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ISBI | ||
Notes | IAM | Approved | no | ||
Call Number | Admin @ si @ CGM2023 | Serial | 3953 | ||
Permanent link to this record | |||||
Author | Guillermo Torres; Debora Gil; Antonio Rosell; Sonia Baeza; Carles Sanchez | ||||
Title | A radiomic biopsy for virtual histology of pulmonary nodules | Type | Conference Article | ||
Year | 2023 | Publication | IEEE International Symposium on Biomedical Imaging | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Poster | ||||
Address | Cartagena de Indias; Colombia; April 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ISBI | ||
Notes | IAM | Approved | no | ||
Call Number | Admin @ si @ TGR2023b | Serial | 3954 | ||
Permanent link to this record | |||||
Author | Albert Tatjer; Bhalaji Nagarajan; Ricardo Marques; Petia Radeva | ||||
Title | CCLM: Class-Conditional Label Noise Modelling | Type | Conference Article | ||
Year | 2023 | Publication | 11th Iberian Conference on Pattern Recognition and Image Analysis | Abbreviated Journal | |
Volume | 14062 | Issue | Pages | 3-14 | |
Keywords | |||||
Abstract | The performance of deep neural networks highly depends on the quality and volume of the training data. However, cost-effective labelling processes such as crowdsourcing and web crawling often lead to data with noisy (i.e., wrong) labels. Making models robust to this label noise is thus of prime importance. A common approach is using loss distributions to model the label noise. However, the robustness of these methods highly depends on the accuracy of the division of the training set into clean and noisy samples. In this work, we dive into this research direction, highlighting the existing problem of treating this distribution globally, and propose a class-conditional approach to split the clean and noisy samples. We apply our approach to the popular DivideMix algorithm and show how the local treatment fares better than the global treatment of the loss distribution. We validate our hypothesis on two popular benchmark datasets and show substantial improvements over the baseline experiments. We further analyze the effectiveness of the proposal using two different metrics – Noise Division Accuracy and Classiness. | ||||
Address | Alicante; Spain; June 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | IbPRIA | ||
Notes | MILAB | Approved | no | ||
Call Number | Admin @ si @ TNM2023 | Serial | 3925 | ||
Permanent link to this record | |||||
Author | German Barquero; Sergio Escalera; Cristina Palmero | ||||
Title | BeLFusion: Latent Diffusion for Behavior-Driven Human Motion Prediction | Type | Conference Article | ||
Year | 2023 | Publication | IEEE/CVF International Conference on Computer Vision (ICCV) Workshops | Abbreviated Journal | |
Volume | Issue | Pages | 2317-2327 | ||
Keywords | |||||
Abstract | Stochastic human motion prediction (HMP) has generally been tackled with generative adversarial networks and variational autoencoders. Most prior works aim at predicting highly diverse movements in terms of the skeleton joints’ dispersion. This has led to methods predicting fast and motion-divergent movements, which are often unrealistic and incoherent with past motion. Such methods also neglect contexts that need to anticipate diverse low-range behaviors, or actions, with subtle joint displacements. To address these issues, we present BeLFusion, a model that, for the first time, leverages latent diffusion models in HMP to sample from a latent space where behavior is disentangled from pose and motion. As a result, diversity is encouraged from a behavioral perspective. Thanks to our behavior coupler’s ability to transfer sampled behavior to ongoing motion, BeLFusion’s predictions display a variety of behaviors that are significantly more realistic than the state of the art. To support it, we introduce two metrics, the Area of the Cumulative Motion Distribution, and the Average Pairwise Distance Error, which are correlated to our definition of realism according to a qualitative study with 126 participants. Finally, we prove BeLFusion’s generalization power in a new cross-dataset scenario for stochastic HMP. | ||||
Address | 2-6 October 2023. Paris (France) | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICCV | ||
Notes | HUPBA; not mentioned | Approved | no | ||
Call Number | Admin @ si @ BEP2023 | Serial | 3829 | ||
Permanent link to this record | |||||
Author | Swathikiran Sudhakaran; Sergio Escalera; Oswald Lanz | ||||
Title | Gate-Shift-Fuse for Video Action Recognition | Type | Journal Article | ||
Year | 2023 | Publication | IEEE Transactions on Pattern Analysis and Machine Intelligence | Abbreviated Journal | TPAMI |
Volume | 45 | Issue | 9 | Pages | 10913-10928 |
Keywords | Action Recognition; Video Classification; Spatial Gating; Channel Fusion | ||||
Abstract | Convolutional Neural Networks are the de facto models for image recognition. However, 3D CNNs, the straightforward extension of 2D CNNs for video recognition, have not achieved the same success on standard action recognition benchmarks. One of the main reasons for this reduced performance of 3D CNNs is the increased computational complexity, which requires large-scale annotated datasets to train them at scale. 3D kernel factorization approaches have been proposed to reduce the complexity of 3D CNNs. Existing kernel factorization approaches follow hand-designed and hard-wired techniques. In this paper we propose Gate-Shift-Fuse (GSF), a novel spatio-temporal feature extraction module which controls interactions in spatio-temporal decomposition and learns to adaptively route features through time and combine them in a data-dependent manner. GSF leverages grouped spatial gating to decompose the input tensor and channel weighting to fuse the decomposed tensors. GSF can be inserted into existing 2D CNNs to convert them into efficient and high-performing spatio-temporal feature extractors, with negligible parameter and compute overhead. We perform an extensive analysis of GSF using two popular 2D CNN families and achieve state-of-the-art or competitive performance on five standard action recognition benchmarks. | ||||
Address | 1 Sept. 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA; not mentioned | Approved | no | ||
Call Number | Admin @ si @ SEL2023 | Serial | 3814 | ||
Permanent link to this record | |||||
Author | Javier Selva; Anders S. Johansen; Sergio Escalera; Kamal Nasrollahi; Thomas B. Moeslund; Albert Clapes | ||||
Title | Video transformers: A survey | Type | Journal Article | ||
Year | 2023 | Publication | IEEE Transactions on Pattern Analysis and Machine Intelligence | Abbreviated Journal | TPAMI |
Volume | 45 | Issue | 11 | Pages | 12922-12943 |
Keywords | Artificial Intelligence; Computer Vision; Self-Attention; Transformers; Video Representations | ||||
Abstract | Transformer models have shown great success handling long-range interactions, making them a promising tool for modeling video. However, they lack inductive biases and scale quadratically with input length. These limitations are further exacerbated when dealing with the high dimensionality introduced by the temporal dimension. While there are surveys analyzing the advances of Transformers for vision, none focus on an in-depth analysis of video-specific designs. In this survey, we analyze the main contributions and trends of works leveraging Transformers to model video. Specifically, we delve into how videos are handled at the input level first. Then, we study the architectural changes made to deal with video more efficiently, reduce redundancy, re-introduce useful inductive biases, and capture long-term temporal dynamics. In addition, we provide an overview of different training regimes and explore effective self-supervised learning strategies for video. Finally, we conduct a performance comparison on the most common benchmark for Video Transformers (i.e., action classification), finding them to outperform 3D ConvNets even with less computational complexity. | ||||
Address | 1 Nov. 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA; not mentioned | Approved | no | ||
Call Number | Admin @ si @ SJE2023 | Serial | 3823 | ||
Permanent link to this record | |||||
Author | Simone Zini; Alex Gomez-Villa; Marco Buzzelli; Bartlomiej Twardowski; Andrew D. Bagdanov; Joost Van de Weijer | ||||
Title | Planckian Jitter: countering the color-crippling effects of color jitter on self-supervised training | Type | Conference Article | ||
Year | 2023 | Publication | 11th International Conference on Learning Representations | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Several recent works on self-supervised learning are trained by mapping different augmentations of the same image to the same feature representation. The data augmentations used are of crucial importance to the quality of learned feature representations. In this paper, we analyze how the color jitter traditionally used in data augmentation negatively impacts the quality of the color features in learned feature representations. To address this problem, we propose a more realistic, physics-based color data augmentation – which we call Planckian Jitter – that creates realistic variations in chromaticity and produces a model robust to illumination changes that can be commonly observed in real life, while maintaining the ability to discriminate image content based on color information. Experiments confirm that such a representation is complementary to the representations learned with the currently-used color jitter augmentation and that a simple concatenation leads to significant performance gains on a wide range of downstream datasets. In addition, we present a color sensitivity analysis that documents the impact of different training methods on model neurons and shows that the performance of the learned features is robust with respect to illuminant variations. | ||||
Address | 1-5 May 2023, Kigali, Rwanda | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICLR | ||
Notes | LAMP; 600.147; 611.008; 5300006 | Approved | no | ||
Call Number | Admin @ si @ ZGB2023 | Serial | 3820 | ||
Permanent link to this record | |||||
Author | Pau Cano; Alvaro Caravaca; Debora Gil; Eva Musulen | ||||
Title | Diagnosis of Helicobacter pylori using AutoEncoders for the Detection of Anomalous Staining Patterns in Immunohistochemistry Images | Type | Miscellaneous | ||
Year | 2023 | Publication | arXiv | Abbreviated Journal | |
Volume | Issue | Pages | 107241 | ||
Keywords | |||||
Abstract | This work addresses the detection of Helicobacter pylori, a bacterium classified since 1994 as a class 1 carcinogen to humans. Owing to its high specificity and sensitivity, the preferred diagnosis technique is the analysis of histological images with immunohistochemical staining, a process in which certain stained antibodies bind to antigens of the biological element of interest. This analysis is a time-demanding task, which is currently done by an expert pathologist who visually inspects the digitized samples. We propose to use autoencoders to learn latent patterns of healthy tissue and detect H. pylori as an anomaly in image staining. Unlike existing classification approaches, an autoencoder is able to learn patterns in an unsupervised manner (without the need for image annotations) with high performance. In particular, our model has an overall accuracy of 91%, with 86% sensitivity, 96% specificity and 0.97 AUC in the detection of H. pylori. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | IAM | Approved | no | ||
Call Number | Admin @ si @ CCG2023 | Serial | 3855 | ||
Permanent link to this record |