Publicacions CVC -- Query Results

Records
Author	Danna Xue; Fei Yang; Pei Wang; Luis Herranz; Jinqiu Sun; Yu Zhu; Yanning Zhang
Title	SlimSeg: Slimmable Semantic Segmentation with Boundary Supervision			Type	Conference Article
Year	2022	Publication	30th ACM International Conference on Multimedia	Abbreviated Journal
Volume		Issue		Pages	6539-6548
Keywords
Abstract	Accurate semantic segmentation models typically require significant computational resources, inhibiting their use in practical applications. Recent works rely on well-crafted lightweight models to achieve fast inference. However, these models cannot flexibly adapt to varying accuracy and efficiency requirements. In this paper, we propose a simple but effective slimmable semantic segmentation (SlimSeg) method, which can be executed at different capacities during inference depending on the desired accuracy-efficiency tradeoff. More specifically, we employ parametrized channel slimming by stepwise downward knowledge distillation during training. Motivated by the observation that the differences between segmentation results of each submodel are mainly near the semantic borders, we introduce an additional boundary guided semantic segmentation loss to further improve the performance of each submodel. We show that our proposed SlimSeg with various mainstream networks can produce flexible models that provide dynamic adjustment of computational cost and better performance than independent models. Extensive experiments on semantic segmentation benchmarks, Cityscapes and CamVid, demonstrate the generalization ability of our framework.
Address	Lisboa, Portugal, October 2022
Corporate Author				Thesis
Publisher	Association for Computing Machinery	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-1-4503-9203-7	Medium
Area		Expedition		Conference	MM
Notes	MACO; 600.161; 601.400			Approved	no
Call Number	Admin @ si @ XYW2022			Serial	3758



Author	Saiping Zhang; Luis Herranz; Marta Mrak; Marc Gorriz Blanch; Shuai Wan; Fuzheng Yang
Title	DCNGAN: A Deformable Convolution-Based GAN with QP Adaptation for Perceptual Quality Enhancement of Compressed Video			Type	Conference Article
Year	2022	Publication	47th International Conference on Acoustics, Speech, and Signal Processing	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	In this paper, we propose a deformable convolution-based generative adversarial network (DCNGAN) for perceptual quality enhancement of compressed videos. DCNGAN is also adaptive to the quantization parameters (QPs). Compared with optical flows, deformable convolutions are more effective and efficient to align frames. Deformable convolutions can operate on multiple frames, thus leveraging more temporal information, which is beneficial for enhancing the perceptual quality of compressed videos. Instead of aligning frames in a pairwise manner, the deformable convolution can process multiple frames simultaneously, which leads to lower computational complexity. Experimental results demonstrate that the proposed DCNGAN outperforms other state-of-the-art compressed video quality enhancement algorithms.
Address	Virtual; May 2022
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICASSP
Notes	MACO; 600.161; 601.379			Approved	no
Call Number	Admin @ si @ ZHM2022a			Serial	3765



Author	Chengyi Zou; Shuai Wan; Marta Mrak; Marc Gorriz Blanch; Luis Herranz; Tiannan Ji
Title	Towards Lightweight Neural Network-based Chroma Intra Prediction for Video Coding			Type	Conference Article
Year	2022	Publication	29th IEEE International Conference on Image Processing	Abbreviated Journal
Volume		Issue		Pages
Keywords	Video coding; Quantization (signal); Computational modeling; Neural networks; Predictive models; Video compression; Syntactics
Abstract	In video compression the luma channel can be useful for predicting chroma channels (Cb, Cr), as has been demonstrated with the Cross-Component Linear Model (CCLM) used in Versatile Video Coding (VVC) standard. More recently, it has been shown that neural networks can even better capture the relationship among different channels. In this paper, a new attention-based neural network is proposed for cross-component intra prediction. With the goal to simplify neural network design, the new framework consists of four branches: boundary branch and luma branch for extracting features from reference samples, attention branch for fusing the first two branches, and prediction branch for computing the predicted chroma samples. The proposed scheme is integrated into VVC test model together with one additional binary block-level syntax flag which indicates whether a given block makes use of the proposed method. Experimental results demonstrate 0.31%/2.36%/2.00% BD-rate reductions on Y/Cb/Cr components, respectively, on top of the VVC Test Model (VTM) 7.0 which uses CCLM.
Address	Bordeaux; France; October 2022
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICIP
Notes	MACO			Approved	no
Call Number	Admin @ si @ ZWM2022			Serial	3790



Author	Javier Vazquez; Graham D. Finlayson; Luis Herranz
Title	Improving the perception of low-light enhanced images			Type	Journal Article
Year	2024	Publication	Optics Express	Abbreviated Journal
Volume	32	Issue	4	Pages	5174-5190
Keywords
Abstract	Improving images captured under low-light conditions has become an important topic in computational color imaging, as it has a wide range of applications. Most current methods are either based on handcrafted features or on end-to-end training of deep neural networks that mostly focus on minimizing some distortion metric —such as PSNR or SSIM— on a set of training images. However, the minimization of distortion metrics does not mean that the results are optimal in terms of perception (i.e. perceptual quality). As an example, the perception-distortion trade-off states that, close to the optimal results, improving distortion results in worsening perception. This means that current low-light image enhancement methods —that focus on distortion minimization— cannot be optimal in the sense of obtaining a good image in terms of perception errors. In this paper, we propose a post-processing approach in which, given the original low-light image and the result of a specific method, we are able to obtain a result that resembles as much as possible that of the original method, but, at the same time, giving an improvement in the perception of the final image. More in detail, our method follows the hypothesis that in order to minimally modify the perception of an input image, any modification should be a combination of a local change in the shading across a scene and a global change in illumination color. We demonstrate the ability of our method quantitatively using perceptual blind image metrics such as BRISQUE, NIQE, or UNIQUE, and through user preference tests.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MACO			Approved	no
Call Number	Admin @ si @ VFH2024			Serial	4018



Author	Pedro Martins; Paulo Carvalho; Carlo Gatta
Title	On the completeness of feature-driven maximally stable extremal regions			Type	Journal Article
Year	2016	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
Volume	74	Issue		Pages	9-16
Keywords	Local features; Completeness; Maximally Stable Extremal Regions
Abstract	By definition, local image features provide a compact representation of the image in which most of the image information is preserved. This capability offered by local features has been overlooked, despite being relevant in many application scenarios. In this paper, we analyze and discuss the performance of feature-driven Maximally Stable Extremal Regions (MSER) in terms of the coverage of informative image parts (completeness). This type of features results from an MSER extraction on saliency maps in which features related to objects boundaries or even symmetry axes are highlighted. These maps are intended to be suitable domains for MSER detection, allowing this detector to provide a better coverage of informative image parts. Our experimental results, which were based on a large-scale evaluation, show that feature-driven MSER have relatively high completeness values and provide more complete sets than a traditional MSER detection even when sets of similar cardinality are considered.
Address
Corporate Author				Thesis
Publisher	Elsevier B.V.	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	0167-8655	ISBN		Medium
Area		Expedition		Conference
Notes	LAMP; MILAB			Approved	no
Call Number	Admin @ si @ MCG2016			Serial	2748



Author	Akshita Gupta; Sanath Narayan; Salman Khan; Fahad Shahbaz Khan; Ling Shao; Joost Van de Weijer
Title	Generative Multi-Label Zero-Shot Learning			Type	Journal Article
Year	2023	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
Volume	45	Issue	12	Pages	14611-14624
Keywords	Generalized zero-shot learning; Multi-label classification; Zero-shot object detection; Feature synthesis
Abstract	Multi-label zero-shot learning strives to classify images into multiple unseen categories for which no data is available during training. The test samples can additionally contain seen categories in the generalized variant. Existing approaches rely on learning either shared or label-specific attention from the seen classes. Nevertheless, computing reliable attention maps for unseen classes during inference in a multi-label setting is still a challenge. In contrast, state-of-the-art single-label generative adversarial network (GAN) based approaches learn to directly synthesize the class-specific visual features from the corresponding class attribute embeddings. However, synthesizing multi-label features from GANs is still unexplored in the context of zero-shot setting. When multiple objects occur jointly in a single image, a critical question is how to effectively fuse multi-class information. In this work, we introduce different fusion approaches at the attribute-level, feature-level and cross-level (across attribute and feature-levels) for synthesizing multi-label features from their corresponding multi-label class embeddings. To the best of our knowledge, our work is the first to tackle the problem of multi-label feature synthesis in the (generalized) zero-shot setting. Our cross-level fusion-based generative approach outperforms the state-of-the-art on three zero-shot benchmarks: NUS-WIDE, Open Images and MS COCO. Furthermore, we show the generalization capabilities of our fusion approach in the zero-shot detection task on MS COCO, achieving favorable performance against existing methods.
Address	December 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	LAMP; PID2021-128178OB-I00			Approved	no
Call Number	Admin @ si @			Serial	3853



Author	Carola Figueroa Flores; Abel Gonzalez-Garcia; Joost Van de Weijer; Bogdan Raducanu
Title	Saliency for fine-grained object recognition in domains with scarce training data			Type	Journal Article
Year	2019	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	94	Issue		Pages	62-73
Keywords
Abstract	This paper investigates the role of saliency to improve the classification accuracy of a Convolutional Neural Network (CNN) for the case when scarce training data is available. Our approach consists in adding a saliency branch to an existing CNN architecture which is used to modulate the standard bottom-up visual features from the original image input, acting as an attentional mechanism that guides the feature extraction process. The main aim of the proposed approach is to enable the effective training of a fine-grained recognition model with limited training samples and to improve the performance on the task, thereby alleviating the need to annotate a large dataset. The vast majority of saliency methods are evaluated on their ability to generate saliency maps, and not on their functionality in a complete vision pipeline. Our proposed pipeline allows to evaluate saliency methods for the high-level task of object recognition. We perform extensive experiments on various fine-grained datasets (Flowers, Birds, Cars, and Dogs) under different conditions and show that saliency can considerably improve the network’s performance, especially for the case of scarce training data. Furthermore, our experiments show that saliency methods that obtain improved saliency maps (as measured by traditional saliency benchmarks) also translate to saliency methods that yield improved performance gains when applied in an object recognition pipeline.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	LAMP; OR; 600.109; 600.141; 600.120			Approved	no
Call Number	Admin @ si @ FGW2019			Serial	3264



Author	Javad Zolfaghari Bengar; Joost Van de Weijer; Bartlomiej Twardowski; Bogdan Raducanu
Title	Reducing Label Effort: Self-Supervised Meets Active Learning			Type	Conference Article
Year	2021	Publication	International Conference on Computer Vision Workshops	Abbreviated Journal
Volume		Issue		Pages	1631-1639
Keywords
Abstract	Active learning is a paradigm aimed at reducing the annotation effort by training the model on actively selected informative and/or representative samples. Another paradigm to reduce the annotation effort is self-training that learns from a large amount of unlabeled data in an unsupervised way and fine-tunes on few labeled samples. Recent developments in self-training have achieved very impressive results rivaling supervised learning on some datasets. The current work focuses on whether the two paradigms can benefit from each other. We studied object recognition datasets including CIFAR10, CIFAR100 and Tiny ImageNet with several labeling budgets for the evaluations. Our experiments reveal that self-training is remarkably more efficient than active learning at reducing the labeling effort, that for a low labeling budget, active learning offers no benefit to self-training, and finally that the combination of active learning and self-training is fruitful when the labeling budget is high. The performance gap between active learning trained either with self-training or from scratch diminishes as we approach to the point where almost half of the dataset is labeled.
Address	October 2021
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICCVW
Notes	LAMP; OR			Approved	no
Call Number	Admin @ si @ ZVT2021			Serial	3672



Author	Javad Zolfaghari Bengar; Bogdan Raducanu; Joost Van de Weijer
Title	When Deep Learners Change Their Mind: Learning Dynamics for Active Learning			Type	Conference Article
Year	2021	Publication	19th International Conference on Computer Analysis of Images and Patterns	Abbreviated Journal
Volume	13052	Issue	1	Pages	403-413
Keywords
Abstract	Active learning aims to select samples to be annotated that yield the largest performance improvement for the learning algorithm. Many methods approach this problem by measuring the informativeness of samples and do this based on the certainty of the network predictions for samples. However, it is well-known that neural networks are overly confident about their prediction and are therefore an untrustworthy source to assess sample informativeness. In this paper, we propose a new informativeness-based active learning method. Our measure is derived from the learning dynamics of a neural network. More precisely we track the label assignment of the unlabeled data pool during the training of the algorithm. We capture the learning dynamics with a metric called label-dispersion, which is low when the network consistently assigns the same label to the sample during the training of the network and high when the assigned label changes frequently. We show that label-dispersion is a promising predictor of the uncertainty of the network, and show on two benchmark datasets that an active learning algorithm based on label-dispersion obtains excellent results.
Address	September 2021
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	CAIP
Notes	LAMP; OR			Approved	no
Call Number	Admin @ si @ ZRV2021			Serial	3673



Author	Shiqi Yang; Yaxing Wang; Kai Wang; Shangling Jui; Joost Van de Weijer
Title	One Ring to Bring Them All: Towards Open-Set Recognition under Domain Shift			Type	Miscellaneous
Year	2022	Publication	Arxiv	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	In this paper, we investigate model adaptation under domain and category shift, where the final goal is to achieve source-free universal domain adaptation (SF-UNDA), which addresses the situation where there exist both domain and category shifts between source and target domains. Under the SF-UNDA setting, the model cannot access source data anymore during target adaptation, which aims to address data privacy concerns. We propose a novel training scheme to learn a (n+1)-way classifier to predict the n source classes and the unknown class, where samples of only known source categories are available for training. Furthermore, for target adaptation, we simply adopt a weighted entropy minimization to adapt the source pretrained model to the unlabeled target domain without source data. In experiments, we show: After source training, the resulting source model can get excellent performance for [...]; After target adaptation, our method surpasses current UNDA approaches which demand source data during adaptation. The versatility to several different tasks strongly proves the efficacy and generalization ability of our method. When augmented with a closed-set domain adaptation approach during target adaptation, our source-free method further outperforms the current state-of-the-art UNDA method by 2.5%, 7.2% and 13% on Office-31, Office-Home and VisDA respectively.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	LAMP; no proj			Approved	no
Call Number	Admin @ si @ YWW2022c			Serial	3818



Author	Marco Cotogni; Fei Yang; Claudio Cusano; Andrew Bagdanov; Joost Van de Weijer
Title	Gated Class-Attention with Cascaded Feature Drift Compensation for Exemplar-free Continual Learning of Vision Transformers			Type	Miscellaneous
Year	2022	Publication	Arxiv	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	We propose a new method for exemplar-free class incremental training of ViTs. The main challenge of exemplar-free continual learning is maintaining plasticity of the learner without causing catastrophic forgetting of previously learned tasks. This is often achieved via exemplar replay which can help recalibrate previous task classifiers to the feature drift which occurs when learning new tasks. Exemplar replay, however, comes at the cost of retaining samples from previous tasks which for many applications may not be possible. To address the problem of continual ViT training, we first propose gated class-attention to minimize the drift in the final ViT transformer block. This mask-based gating is applied to class-attention mechanism of the last transformer block and strongly regulates the weights crucial for previous tasks. Importantly, gated class-attention does not require the task-ID during inference, which distinguishes it from other parameter isolation methods. Secondly, we propose a new method of feature drift compensation that accommodates feature drift in the backbone when learning new tasks. The combination of gated class-attention and cascaded feature drift compensation allows for plasticity towards new tasks while limiting forgetting of previous ones. Extensive experiments performed on CIFAR-100, Tiny-ImageNet and ImageNet100 demonstrate that our exemplar-free method obtains competitive results when compared to rehearsal based ViT methods.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	LAMP; no proj			Approved	no
Call Number	Admin @ si @ CYC2022			Serial	3827



Author	Lu Yu; Lichao Zhang; Joost Van de Weijer; Fahad Shahbaz Khan; Yongmei Cheng; C. Alejandro Parraga
Title	Beyond Eleven Color Names for Image Understanding			Type	Journal Article
Year	2018	Publication	Machine Vision and Applications	Abbreviated Journal	MVAP
Volume	29	Issue	2	Pages	361-373
Keywords	Color name; Discriminative descriptors; Image classification; Re-identification; Tracking
Abstract	Color description is one of the fundamental problems of image understanding. One of the popular ways to represent colors is by means of color names. Most existing work on color names focuses on only the eleven basic color terms of the English language. This could be limiting the discriminative power of these representations, and representations based on more color names are expected to perform better. However, there exists no clear strategy to choose additional color names. We collect a dataset of 28 additional color names. To ensure that the resulting color representation has high discriminative power we propose a method to order the additional color names according to their complementary nature with the basic color names. This allows us to compute color name representations with high discriminative power of arbitrary length. In the experiments we show that these new color name descriptors outperform the existing color name descriptor on the task of visual tracking, person re-identification and image classification.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	LAMP; NEUROBIT; 600.068; 600.109; 600.120			Approved	no
Call Number	Admin @ si @ YYW2018			Serial	3087



Author	Carlo Gatta; Francesco Ciompi
Title	Stacked Sequential Scale-Space Taylor Context			Type	Journal Article
Year	2014	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
Volume	36	Issue	8	Pages	1694-1700
Keywords
Abstract	We analyze sequential image labeling methods that sample the posterior label field in order to gather contextual information. We propose an effective method that extracts local Taylor coefficients from the posterior at different scales. Results show that our proposal outperforms state-of-the-art methods on MSRC-21, CAMVID, eTRIMS8 and KAIST2 data sets.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	0162-8828	ISBN		Medium
Area		Expedition		Conference
Notes	LAMP; MILAB; 601.160; 600.079			Approved	no
Call Number	Admin @ si @ GaC2014			Serial	2466



Author	Carlo Gatta; Adriana Romero; Joost Van de Weijer
Title	Unrolling loopy top-down semantic feedback in convolutional deep networks			Type	Conference Article
Year	2014	Publication	Workshop on Deep Vision: Deep Learning for Computer Vision	Abbreviated Journal
Volume		Issue		Pages	498-505
Keywords
Abstract	In this paper, we propose a novel way to perform top-down semantic feedback in convolutional deep networks for efficient and accurate image parsing. We also show how to add global appearance/semantic features, which have been shown to improve image parsing performance in state-of-the-art methods and were not present in previous convolutional approaches. The proposed method is characterised by efficient training and sufficiently fast testing. We use the well-known SIFTflow dataset to numerically show the advantages provided by our contributions, and to compare with state-of-the-art convolutional image parsing approaches.
Address	Columbus; Ohio; June 2014
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	CVPRW
Notes	LAMP; MILAB; 601.160; 600.079			Approved	no
Call Number	Admin @ si @ GRW2014			Serial	2490



Author	Eduardo Aguilar; Bogdan Raducanu; Petia Radeva; Joost Van de Weijer
Title	Continual Evidential Deep Learning for Out-of-Distribution Detection			Type	Conference Article
Year	2023	Publication	IEEE/CVF International Conference on Computer Vision (ICCV) Workshops - Visual Continual Learning Workshop	Abbreviated Journal
Volume		Issue		Pages	3444-3454
Keywords
Abstract	Uncertainty-based deep learning models have attracted a great deal of interest for their ability to provide accurate and reliable predictions. Evidential deep learning stands out achieving remarkable performance in detecting out-of-distribution (OOD) data with a single deterministic neural network. Motivated by this fact, in this paper we propose the integration of an evidential deep learning method into a continual learning framework in order to perform simultaneously incremental object classification and OOD detection. Moreover, we analyze the ability of vacuity and dissonance to differentiate between in-distribution data belonging to old classes and OOD data. The proposed method, called CEDL, is evaluated on CIFAR-100 considering two settings consisting of 5 and 10 tasks, respectively. From the obtained results, we could appreciate that the proposed method, in addition to provide comparable results in object classification with respect to the baseline, largely outperforms OOD detection compared to several posthoc methods on three evaluation metrics: AUROC, AUPR and FPR95.
Address	Paris; France; October 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICCVW
Notes	LAMP; MILAB			Approved	no
Call Number	Admin @ si @ ARR2023			Serial	3841