Publicacions CVC -- Query Results

	Publicacions CVC Home \| Show All \| Simple Search \| Advanced Search \| Add Record \| Import	Login Quick Search: Field: contains: ...
	16–30 of 161 records found matching your query (RSS \| history):

Search & Display Options

Select All Deselect All

<< 1 2 3 4 5 6 7 8 9 10 >> [11–11]

List View

Citations

Details

	Records
	Author	Patricia Suarez; Dario Carpio; Angel Sappa
	Title	Depth Map Estimation from a Single 2D Image			Type	Conference Article
	Year	2023	Publication	17th International Conference on Signal-Image Technology & Internet-Based Systems	Abbreviated Journal
	Volume		Issue		Pages	347-353
	Keywords
	Abstract	This paper presents an innovative architecture based on a Cycle Generative Adversarial Network (CycleGAN) for the synthesis of high-quality depth maps from monocular images. The proposed architecture leverages a diverse set of loss functions, including cycle consistency, contrastive, identity, and least square losses, to facilitate the generation of depth maps that exhibit realism and high fidelity. A notable feature of the approach is its ability to synthesize depth maps from grayscale images without the need for paired training data. Extensive comparisons with different state-of-the-art methods show the superiority of the proposed approach in both quantitative metrics and visual quality. This work addresses the challenge of depth map synthesis and offers significant advancements in the field.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	SITIS
	Notes	MSIAU			Approved	no
	Call Number	Admin @ si @ SCS2023b			Serial	4009
Permanent link to this record



	Author	Rafael E. Rivadeneira; Henry Velesaca; Angel Sappa
	Title	Object Detection in Very Low-Resolution Thermal Images through a Guided-Based Super-Resolution Approach			Type	Conference Article
	Year	2023	Publication	17th International Conference on Signal-Image Technology & Internet-Based Systems	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	This work proposes a novel approach that integrates super-resolution techniques with off-the-shelf object detection methods to tackle the problem of handling very low-resolution thermal images. The suggested approach begins by enhancing the low-resolution (LR) thermal images through a guided super-resolution strategy, leveraging a high-resolution (HR) visible spectrum image. Subsequently, object detection is performed on the high-resolution thermal image. The experimental results demonstrate tremendous improvements in comparison with both scenarios: when object detection is performed on the LR thermal image alone, as well as when object detection is conducted on the up-sampled LR thermal image. Moreover, the proposed approach proves highly valuable in camouflaged scenarios where objects might remain undetected in visible spectrum images.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	SITIS
	Notes	MSIAU			Approved	no
	Call Number	Admin @ si @ RVS2023			Serial	4010
Permanent link to this record



	Author	Patricia Suarez; Dario Carpio; Angel Sappa
	Title	Boosting Guided Super-Resolution Performance with Synthesized Images			Type	Conference Article
	Year	2023	Publication	17th International Conference on Signal-Image Technology & Internet-Based Systems	Abbreviated Journal
	Volume		Issue		Pages	189-195
	Keywords
	Abstract	Guided image processing techniques are widely used for extracting information from a guiding image to aid in the processing of the guided one. These images may be sourced from different modalities, such as 2D and 3D, or different spectral bands, like visible and infrared. In the case of guided cross-spectral super-resolution, features from the two modal images are extracted and efficiently merged to migrate guidance information from one image, usually high-resolution (HR), toward the guided one, usually low-resolution (LR). Different approaches have been recently proposed focusing on the development of architectures for feature extraction and merging in the cross-spectral domains, but none of them care about the different nature of the given images. This paper focuses on the specific problem of guided thermal image super-resolution, where an LR thermal image is enhanced by an HR visible spectrum image. To improve existing guided super-resolution techniques, a novel scheme is proposed that maps the original guiding information to a thermal image-like representation that is similar to the output. Experimental results evaluating five different approaches demonstrate that the best results are achieved when the guiding and guided images share the same domain.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	SITIS
	Notes	MSIAU			Approved	no
	Call Number	Admin @ si @ SCS2023c			Serial	4011
Permanent link to this record



	Author	Mickael Cormier; Andreas Specker; Julio C. S. Jacques; Lucas Florin; Jurgen Metzler; Thomas B. Moeslund; Kamal Nasrollahi; Sergio Escalera; Jurgen Beyerer
	Title	UPAR Challenge: Pedestrian Attribute Recognition and Attribute-based Person Retrieval – Dataset, Design, and Results			Type	Conference Article
	Year	2023	Publication	2023 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops	Abbreviated Journal
	Volume		Issue		Pages	166-175
	Keywords
	Abstract	In civilian video security monitoring, retrieving and tracking a person of interest often rely on witness testimony and their appearance description. Deployed systems rely on a large amount of annotated training data and are expected to show consistent performance in diverse areas and gen-eralize well between diverse settings w.r.t. different view-points, illumination, resolution, occlusions, and poses for indoor and outdoor scenes. However, for such generalization, the system would require a large amount of various an-notated data for training and evaluation. The WACV 2023 Pedestrian Attribute Recognition and Attributed-based Per-son Retrieval Challenge (UPAR-Challenge) aimed to spot-light the problem of domain gaps in a real-world surveil-lance context and highlight the challenges and limitations of existing methods. The UPAR dataset, composed of 40 important binary attributes over 12 attribute categories across four datasets, was extended with data captured from a low-flying UAV from the P-DESTRE dataset. To this aim, 0.6M additional annotations were manually labeled and vali-dated. Each track evaluated the robustness of the competing methods to domain shifts by training on limited data from a specific domain and evaluating using data from unseen do-mains. The challenge attracted 41 registered participants, but only one team managed to outperform the baseline on one track, emphasizing the task's difficulty. This work de-scribes the challenge design, the adopted dataset, obtained results, as well as future directions on the topic.
	Address	Waikoloa; Hawai; USA; January 2023
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WACVW
	Notes	HUPBA			Approved	no
	Call Number	Admin @ si @ CSJ2023			Serial	3902
Permanent link to this record



	Author	Dawid Rymarczyk; Joost van de Weijer; Bartosz Zielinski; Bartlomiej Twardowski
	Title	ICICLE: Interpretable Class Incremental Continual Learning. Dawid Rymarczyk			Type	Conference Article
	Year	2023	Publication	20th IEEE International Conference on Computer Vision	Abbreviated Journal
	Volume		Issue		Pages	1887-1898
	Keywords
	Abstract	Continual learning enables incremental learning of new tasks without forgetting those previously learned, resulting in positive knowledge transfer that can enhance performance on both new and old tasks. However, continual learning poses new challenges for interpretability, as the rationale behind model predictions may change over time, leading to interpretability concept drift. We address this problem by proposing Interpretable Class-InCremental LEarning (ICICLE), an exemplar-free approach that adopts a prototypical part-based approach. It consists of three crucial novelties: interpretability regularization that distills previously learned concepts while preserving user-friendly positive reasoning; proximity-based prototype initialization strategy dedicated to the fine-grained setting; and task-recency bias compensation devoted to prototypical parts. Our experimental results demonstrate that ICICLE reduces the interpretability concept drift and outperforms the existing exemplar-free methods of common class-incremental learning when applied to concept-based models.
	Address	Paris; France; October 2023
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICCV
	Notes	LAMP			Approved	no
	Call Number	Admin @ si @ RWZ2023			Serial	3947
Permanent link to this record



	Author	Jordy Van Landeghem; Ruben Tito; Lukasz Borchmann; Michal Pietruszka; Pawel Joziak; Rafal Powalski; Dawid Jurkiewicz; Mickael Coustaty; Bertrand Anckaert; Ernest Valveny; Matthew Blaschko; Sien Moens; Tomasz Stanislawek
	Title	Document Understanding Dataset and Evaluation (DUDE)			Type	Conference Article
	Year	2023	Publication	20th IEEE International Conference on Computer Vision	Abbreviated Journal
	Volume		Issue		Pages	19528-19540
	Keywords
	Abstract	We call on the Document AI (DocAI) community to re-evaluate current methodologies and embrace the challenge of creating more practically-oriented benchmarks. Document Understanding Dataset and Evaluation (DUDE) seeks to remediate the halted research progress in understanding visually-rich documents (VRDs). We present a new dataset with novelties related to types of questions, answers, and document layouts based on multi-industry, multi-domain, and multi-page VRDs of various origins and dates. Moreover, we are pushing the boundaries of current methods by creating multi-task and multi-domain evaluation setups that more accurately simulate real-world situations where powerful generalization and adaptation under low-resource settings are desired. DUDE aims to set a new standard as a more practical, long-standing benchmark for the community, and we hope that it will lead to future extensions and contributions that address real-world challenges. Finally, our work illustrates the importance of finding more efficient ways to model language, images, and layout in DocAI.
	Address	Paris; France; October 2023
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICCV
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ LTB2023			Serial	3948
Permanent link to this record



	Author	Yuyang Liu; Yang Cong; Dipam Goswami; Xialei Liu; Joost Van de Weijer
	Title	Augmented Box Replay: Overcoming Foreground Shift for Incremental Object Detection			Type	Conference Article
	Year	2023	Publication	20th IEEE International Conference on Computer Vision	Abbreviated Journal
	Volume		Issue		Pages	11367-11377
	Keywords
	Abstract	In incremental learning, replaying stored samples from previous tasks together with current task samples is one of the most efficient approaches to address catastrophic forgetting. However, unlike incremental classification, image replay has not been successfully applied to incremental object detection (IOD). In this paper, we identify the overlooked problem of foreground shift as the main reason for this. Foreground shift only occurs when replaying images of previous tasks and refers to the fact that their background might contain foreground objects of the current task. To overcome this problem, a novel and efficient Augmented Box Replay (ABR) method is developed that only stores and replays foreground objects and thereby circumvents the foreground shift problem. In addition, we propose an innovative Attentive RoI Distillation loss that uses spatial attention from region-of-interest (RoI) features to constrain current model to focus on the most important information from old model. ABR significantly reduces forgetting of previous classes while maintaining high plasticity in current classes. Moreover, it considerably reduces the storage requirements when compared to standard image replay. Comprehensive experiments on Pascal-VOC and COCO datasets support the state-of-the-art performance of our model.
	Address	Paris; France; October 2023
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICCV
	Notes	LAMP			Approved	no
	Call Number	Admin @ si @ LCG2023			Serial	3949
Permanent link to this record



	Author	Asma Bensalah; Antonio Parziale; Giuseppe De Gregorio; Angelo Marcelli; Alicia Fornes; Josep Llados
	Title	I Can’t Believe It’s Not Better: In-air Movement for Alzheimer Handwriting Synthetic Generation			Type	Conference Article
	Year	2023	Publication	21st International Graphonomics Conference	Abbreviated Journal
	Volume		Issue		Pages	136–148
	Keywords
	Abstract	During recent years, there here has been a boom in terms of deep learning use for handwriting analysis and recognition. One main application for handwriting analysis is early detection and diagnosis in the health field. Unfortunately, most real case problems still suffer a scarcity of data, which makes difficult the use of deep learning-based models. To alleviate this problem, some works resort to synthetic data generation. Lately, more works are directed towards guided data synthetic generation, a generation that uses the domain and data knowledge to generate realistic data that can be useful to train deep learning models. In this work, we combine the domain knowledge about the Alzheimer’s disease for handwriting and use it for a more guided data generation. Concretely, we have explored the use of in-air movements for synthetic data generation.
	Address	Evora; Portugal; October 2023
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	IGS
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ BPG2023			Serial	3838
Permanent link to this record



	Author	Patricia Suarez; Dario Carpio; Angel Sappa
	Title	A Deep Learning Based Approach for Synthesizing Realistic Depth Maps			Type	Conference Article
	Year	2023	Publication	22nd International Conference on Image Analysis and Processing	Abbreviated Journal
	Volume	14234	Issue		Pages	369–380
	Keywords
	Abstract	This paper presents a novel cycle generative adversarial network (CycleGAN) architecture for synthesizing high-quality depth maps from a given monocular image. The proposed architecture uses multiple loss functions, including cycle consistency, contrastive, identity, and least square losses, to enable the generation of realistic and high-fidelity depth maps. The proposed approach addresses this challenge by synthesizing depth maps from RGB images without requiring paired training data. Comparisons with several state-of-the-art approaches are provided showing the proposed approach overcome other approaches both in terms of quantitative metrics and visual quality.
	Address	Udine; Italia; Setember 2023
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICIAP
	Notes	MSIAU			Approved	no
	Call Number	Admin @ si @ SCS2023a			Serial	3968
Permanent link to this record



	Author	Albin Soutif; Antonio Carta; Joost Van de Weijer
	Title	Improving Online Continual Learning Performance and Stability with Temporal Ensembles			Type	Conference Article
	Year	2023	Publication	2nd Conference on Lifelong Learning Agents	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Neural networks are very effective when trained on large datasets for a large number of iterations. However, when they are trained on non-stationary streams of data and in an online fashion, their performance is reduced (1) by the online setup, which limits the availability of data, (2) due to catastrophic forgetting because of the non-stationary nature of the data. Furthermore, several recent works (Caccia et al., 2022; Lange et al., 2023) arXiv:2205.13452 showed that replay methods used in continual learning suffer from the stability gap, encountered when evaluating the model continually (rather than only on task boundaries). In this article, we study the effect of model ensembling as a way to improve performance and stability in online continual learning. We notice that naively ensembling models coming from a variety of training tasks increases the performance in online continual learning considerably. Starting from this observation, and drawing inspirations from semi-supervised learning ensembling methods, we use a lightweight temporal ensemble that computes the exponential moving average of the weights (EMA) at test time, and show that it can drastically increase the performance and stability when used in combination with several methods from the literature.
	Address	Montreal; Canada; August 2023
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	COLLAS
	Notes	LAMP			Approved	no
	Call Number	Admin @ si @ SCW2023			Serial	3922
Permanent link to this record



	Author	Senmao Li; Joost Van de Weijer; Yaxing Wang; Fahad Shahbaz Khan; Meiqin Liu; Jian Yang
	Title	3D-Aware Multi-Class Image-to-Image Translation with NeRFs			Type	Conference Article
	Year	2023	Publication	36th IEEE Conference on Computer Vision and Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages	12652-12662
	Keywords
	Abstract	Recent advances in 3D-aware generative models (3D-aware GANs) combined with Neural Radiance Fields (NeRF) have achieved impressive results. However no prior works investigate 3D-aware GANs for 3D consistent multiclass image-to-image (3D-aware 121) translation. Naively using 2D-121 translation methods suffers from unrealistic shape/identity change. To perform 3D-aware multiclass 121 translation, we decouple this learning process into a multiclass 3D-aware GAN step and a 3D-aware 121 translation step. In the first step, we propose two novel techniques: a new conditional architecture and an effective training strategy. In the second step, based on the well-trained multiclass 3D-aware GAN architecture, that preserves view-consistency, we construct a 3D-aware 121 translation system. To further reduce the view-consistency problems, we propose several new techniques, including a U-net-like adaptor network design, a hierarchical representation constrain and a relative regularization loss. In exten-sive experiments on two datasets, quantitative and qualitative results demonstrate that we successfully perform 3D-aware 121 translation with multi-view consistency. Code is available in 3DI2I.
	Address	Vancouver; Canada; June 2023
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPR
	Notes	LAMP			Approved	no
	Call Number	Admin @ si @ LWW2023b			Serial	3920
Permanent link to this record



	Author	Hugo Bertiche; Niloy J Mitra; Kuldeep Kulkarni; Chun Hao Paul Huang; Tuanfeng Y Wang; Meysam Madadi; Sergio Escalera; Duygu Ceylan
	Title	Blowing in the Wind: CycleNet for Human Cinemagraphs from Still Images			Type	Conference Article
	Year	2023	Publication	36th IEEE Conference on Computer Vision and Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages	459-468
	Keywords
	Abstract	Cinemagraphs are short looping videos created by adding subtle motions to a static image. This kind of media is popular and engaging. However, automatic generation of cinemagraphs is an underexplored area and current solutions require tedious low-level manual authoring by artists. In this paper, we present an automatic method that allows generating human cinemagraphs from single RGB images. We investigate the problem in the context of dressed humans under the wind. At the core of our method is a novel cyclic neural network that produces looping cinemagraphs for the target loop duration. To circumvent the problem of collecting real data, we demonstrate that it is possible, by working in the image normal space, to learn garment motion dynamics on synthetic data and generalize to real data. We evaluate our method on both synthetic and real data and demonstrate that it is possible to create compelling and plausible cinemagraphs from single RGB images.
	Address	Vancouver; Canada; June 2023
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPR
	Notes	HUPBA			Approved	no
	Call Number	Admin @ si @ BMK2023			Serial	3921
Permanent link to this record



	Author	Dipam Goswami; Yuyang Liu ; Bartlomiej Twardowski; Joost Van de Weijer
	Title	FeCAM: Exploiting the Heterogeneity of Class Distributions in Exemplar-Free Continual Learning			Type	Conference Article
	Year	2023	Publication	37th Annual Conference on Neural Information Processing Systems	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Poster
	Address	New Orleans; USA; December 2023
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	NEURIPS
	Notes	LAMP			Approved	no
	Call Number	Admin @ si @ GLT2023			Serial	3934
Permanent link to this record



	Author	Kai Wang; Fei Yang; Shiqi Yang; Muhammad Atif Butt; Joost Van de Weijer
	Title	Dynamic Prompt Learning: Addressing Cross-Attention Leakage for Text-Based Image Editing			Type	Conference Article
	Year	2023	Publication	37th Annual Conference on Neural Information Processing Systems	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Poster
	Address	New Orleans; USA; December 2023
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	NEURIPS
	Notes	LAMP			Approved	no
	Call Number	Admin @ si @ WYY2023			Serial	3935
Permanent link to this record



	Author	ChuanMing Fang; Kai Wang; Joost Van de Weijer
	Title	IterInv: Iterative Inversion for Pixel-Level T2I Models			Type	Conference Article
	Year	2023	Publication	37th Annual Conference on Neural Information Processing Systems	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Large-scale text-to-image diffusion models have been a ground-breaking development in generating convincing images following an input text prompt. The goal of image editing research is to give users control over the generated images by modifying the text prompt. Current image editing techniques are relying on DDIM inversion as a common practice based on the Latent Diffusion Models (LDM). However, the large pretrained T2I models working on the latent space as LDM suffer from losing details due to the first compression stage with an autoencoder mechanism. Instead, another mainstream T2I pipeline working on the pixel level, such as Imagen and DeepFloyd-IF, avoids this problem. They are commonly composed of several stages, normally with a text-to-image stage followed by several super-resolution stages. In this case, the DDIM inversion is unable to find the initial noise to generate the original image given that the super-resolution diffusion models are not compatible with the DDIM technique. According to our experimental findings, iteratively concatenating the noisy image as the condition is the root of this problem. Based on this observation, we develop an iterative inversion (IterInv) technique for this stream of T2I models and verify IterInv with the open-source DeepFloyd-IF model. By combining our method IterInv with a popular image editing method, we prove the application prospects of IterInv. The code will be released at \url{this https URL}.
	Address	New Orleans; USA; December 2023
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	NEURIPS
	Notes	LAMP			Approved	no
	Call Number	Admin @ si @ FWW2023			Serial	3936
Permanent link to this record