Publicacions CVC -- Query Results

	Records
	Author	Yaxing Wang; Abel Gonzalez-Garcia; Chenshen Wu; Luis Herranz; Fahad Shahbaz Khan; Shangling Jui; Jian Yang; Joost Van de Weijer
	Title	MineGAN++: Mining Generative Models for Efficient Knowledge Transfer to Limited Data Domains			Type	Journal Article
	Year	2024	Publication	International Journal of Computer Vision	Abbreviated Journal	IJCV
	Volume	132	Issue		Pages	490–514
	Keywords
	Abstract	Given the often enormous effort required to train GANs, both in computation and in dataset collection, the re-use of pretrained GANs greatly increases the potential impact of generative models. Therefore, we propose a novel knowledge transfer method for generative models based on mining the knowledge that is most beneficial to a specific target domain, either from a single or multiple pretrained GANs. This is done using a miner network that identifies which part of the generative distribution of each pretrained GAN outputs samples closest to the target domain. Mining effectively steers GAN sampling towards suitable regions of the latent space, which facilitates the posterior finetuning and avoids pathologies of other methods, such as mode collapse and lack of flexibility. Furthermore, to prevent overfitting on small target domains, we introduce sparse subnetwork selection, which restricts the set of trainable neurons to those that are relevant for the target dataset. We perform comprehensive experiments on several challenging datasets using various GAN architectures (BigGAN, Progressive GAN, and StyleGAN) and show that the proposed method, called MineGAN, effectively transfers knowledge to domains with few target images, outperforming existing methods. In addition, MineGAN can successfully transfer knowledge from multiple pretrained GANs.
	Notes	LAMP; MACO			Approved	no
	Call Number	Admin @ si @ WGW2024			Serial	3888

	Author	Vacit Oguz Yazici; Longlong Yu; Arnau Ramisa; Luis Herranz; Joost Van de Weijer
	Title	Main product detection with graph networks for fashion			Type	Journal Article
	Year	2024	Publication	Multimedia Tools and Applications	Abbreviated Journal	MTAP
	Volume	83	Issue		Pages	3215–3231
	Keywords
	Abstract	Computer vision has established a foothold in the online fashion retail industry. Main product detection is a crucial step of vision-based fashion product feed parsing pipelines, focused on identifying the bounding boxes that contain the product being sold in the gallery of images of the product page. The current state-of-the-art approach does not leverage the relations between regions in the image, and treats images of the same product independently, therefore not fully exploiting visual and product contextual information. In this paper, we propose a model that incorporates Graph Convolutional Networks (GCN) that jointly represent all detected bounding boxes in the gallery as nodes. We show that the proposed method is better than the state of the art; in particular, when the title input is missing at inference time and in cross-dataset evaluation, our method outperforms previous approaches by a large margin.
	Notes	LAMP; MACO; 600.147; 600.167; 600.164; 600.161; 600.141; 601.309			Approved	no
	Call Number	Admin @ si @ YYR2024			Serial	4017

	Author	Tao Wu; Kai Wang; Chuanming Tang; Jianlin Zhang
	Title	Diffusion-based network for unsupervised landmark detection			Type	Journal Article
	Year	2024	Publication	Knowledge-Based Systems	Abbreviated Journal
	Volume	292	Issue		Pages	111627
	Keywords
	Abstract	Landmark detection is a fundamental task aiming at identifying specific landmarks that serve as representations of distinct object features within an image. However, current landmark detection algorithms often adopt complex architectures and are trained in a supervised manner using large datasets to achieve satisfactory performance. When faced with limited data, these algorithms tend to experience a notable decline in accuracy. To address these drawbacks, we propose a novel diffusion-based network (DBN) for unsupervised landmark detection, which leverages the generation ability of diffusion models to detect the landmark locations. In particular, we introduce a dual-branch encoder (DualE) for extracting visual features and predicting landmarks. Additionally, we lighten the decoder structure for faster inference, referred to as LightD. By this means, we avoid relying on extensive data comparison and the necessity of designing complex architectures as in previous methods. Experiments on the CelebA, AFLW, 300W and Deepfashion benchmarks show that DBN achieves state-of-the-art performance compared with existing methods. Furthermore, DBN remains robust even when faced with limited data.
	Notes	LAMP			Approved	no
	Call Number	Admin @ si @ WWT2024			Serial	4024

	Author	Cristhian A. Aguilera-Carrasco; Luis Felipe Gonzalez-Böhme; Francisco Valdes; Francisco Javier Quitral Zapata; Bogdan Raducanu
	Title	A Hand-Drawn Language for Human–Robot Collaboration in Wood Stereotomy			Type	Journal Article
	Year	2023	Publication	IEEE Access	Abbreviated Journal	ACCESS
	Volume	11	Issue		Pages	100975–100985
	Keywords
	Abstract	This study introduces a novel, hand-drawn language designed to foster human-robot collaboration in wood stereotomy, central to carpentry and joinery professions. Based on skilled carpenters’ line and symbol etchings on timber, this language signifies the location, geometry of woodworking joints, and timber placement within a framework. A proof-of-concept prototype has been developed, integrating object detectors, keypoint regression, and traditional computer vision techniques to interpret this language and enable an extensive repertoire of actions. Empirical data attests to the language’s efficacy, with the successful identification of a specific set of symbols on various wood species’ sawn surfaces, achieving a mean average precision (mAP) exceeding 90%. Concurrently, the system can accurately pinpoint critical positions that facilitate robotic comprehension of carpenter-indicated woodworking joint geometry. The positioning error, approximately 3 pixels, meets industry standards.
	Notes	LAMP			Approved	no
	Call Number	Admin @ si @ AGV2023			Serial	3969

	Author	Akshita Gupta; Sanath Narayan; Salman Khan; Fahad Shahbaz Khan; Ling Shao; Joost Van de Weijer
	Title	Generative Multi-Label Zero-Shot Learning			Type	Journal Article
	Year	2023	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
	Volume	45	Issue	12	Pages	14611–14624
	Keywords	Generalized zero-shot learning; Multi-label classification; Zero-shot object detection; Feature synthesis
	Abstract	Multi-label zero-shot learning strives to classify images into multiple unseen categories for which no data is available during training. The test samples can additionally contain seen categories in the generalized variant. Existing approaches rely on learning either shared or label-specific attention from the seen classes. Nevertheless, computing reliable attention maps for unseen classes during inference in a multi-label setting is still a challenge. In contrast, state-of-the-art single-label generative adversarial network (GAN) based approaches learn to directly synthesize the class-specific visual features from the corresponding class attribute embeddings. However, synthesizing multi-label features from GANs is still unexplored in the context of zero-shot setting. When multiple objects occur jointly in a single image, a critical question is how to effectively fuse multi-class information. In this work, we introduce different fusion approaches at the attribute-level, feature-level and cross-level (across attribute and feature-levels) for synthesizing multi-label features from their corresponding multi-label class embeddings. To the best of our knowledge, our work is the first to tackle the problem of multi-label feature synthesis in the (generalized) zero-shot setting. Our cross-level fusion-based generative approach outperforms the state-of-the-art on three zero-shot benchmarks: NUS-WIDE, Open Images and MS COCO. Furthermore, we show the generalization capabilities of our fusion approach in the zero-shot detection task on MS COCO, achieving favorable performance against existing methods.
	Address	December 2023
	Notes	LAMP; PID2021-128178OB-I00			Approved	no
	Call Number	Admin @ si @			Serial	3853