Publicacions CVC -- Query Results

	Publicacions CVC Home \| Show All \| Simple Search \| Advanced Search \| Add Record \| Import	Login Quick Search: Field: contains: ...
	31–45 of 140 records found matching your query (RSS):

Search & Display Options

Select All Deselect All

<< 1 2 3 4 5 6 7 8 9 10 >>

List View

Citations

Details

	Records
	Author	Yaxing Wang; Joost Van de Weijer; Lu Yu; Shangling Jui
	Title	Distilling GANs with Style-Mixed Triplets for X2I Translation with Limited Data			Type	Conference Article
	Year	2022	Publication	10th International Conference on Learning Representations	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Conditional image synthesis is an integral part of many X2I translation systems, including image-to-image, text-to-image and audio-to-image translation systems. Training these large systems generally requires huge amounts of training data. Therefore, we investigate knowledge distillation to transfer knowledge from a high-quality unconditioned generative model (e.g., StyleGAN) to a conditioned synthetic image generation modules in a variety of systems. To initialize the conditional and reference branch (from a unconditional GAN) we exploit the style mixing characteristics of high-quality GANs to generate an infinite supply of style-mixed triplets to perform the knowledge distillation. Extensive experimental results in a number of image generation tasks (i.e., image-to-image, semantic segmentation-to-image, text-to-image and audio-to-image) demonstrate qualitatively and quantitatively that our method successfully transfers knowledge to the synthetic image generation modules, resulting in more realistic images than previous methods as confirmed by a significant drop in the FID.
	Address	Virtual
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICLR
	Notes	LAMP; 600.147			Approved	no
	Call Number	Admin @ si @ WWY2022			Serial	3791
Permanent link to this record



	Author	Aitor Alvarez-Gila; Joost Van de Weijer; Yaxing Wang; Estibaliz Garrote
	Title	MVMO: A Multi-Object Dataset for Wide Baseline Multi-View Semantic Segmentation			Type	Conference Article
	Year	2022	Publication	29th IEEE International Conference on Image Processing	Abbreviated Journal
	Volume		Issue		Pages
	Keywords	multi-view; cross-view; semantic segmentation; synthetic dataset
	Abstract	We present MVMO (Multi-View, Multi-Object dataset): a synthetic dataset of 116,000 scenes containing randomly placed objects of 10 distinct classes and captured from 25 camera locations in the upper hemisphere. MVMO comprises photorealistic, path-traced image renders, together with semantic segmentation ground truth for every view. Unlike existing multi-view datasets, MVMO features wide baselines between cameras and high density of objects, which lead to large disparities, heavy occlusions and view-dependent object appearance. Single view semantic segmentation is hindered by self and inter-object occlusions that could benefit from additional viewpoints. Therefore, we expect that MVMO will propel research in multi-view semantic segmentation and cross-view semantic transfer. We also provide baselines that show that new research is needed in such fields to exploit the complementary information of multi-view setups 1 .
	Address	Bordeaux; France; October2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICIP
	Notes	LAMP			Approved	no
	Call Number	Admin @ si @ AWW2022			Serial	3781
Permanent link to this record



	Author	Ahmed M. A. Salih; Ilaria Boscolo Galazzo; Federica Cruciani; Lorenza Brusini; Petia Radeva
	Title	Investigating Explainable Artificial Intelligence for MRI-based Classification of Dementia: a New Stability Criterion for Explainable Methods			Type	Conference Article
	Year	2022	Publication	29th IEEE International Conference on Image Processing	Abbreviated Journal
	Volume		Issue		Pages
	Keywords	Image processing; Stability criteria; Machine learning; Robustness; Alzheimer's disease; Monitoring
	Abstract	Individuals diagnosed with Mild Cognitive Impairment (MCI) have shown an increased risk of developing Alzheimer’s Disease (AD). As such, early identification of dementia represents a key prognostic element, though hampered by complex disease patterns. Increasing efforts have focused on Machine Learning (ML) to build accurate classification models relying on a multitude of clinical/imaging variables. However, ML itself does not provide sensible explanations related to the model mechanism and feature contribution. Explainable Artificial Intelligence (XAI) represents the enabling technology in this framework, allowing to understand ML outcomes and derive human-understandable explanations. In this study, we aimed at exploring ML combined with MRI-based features and XAI to solve this classification problem and interpret the outcome. In particular, we propose a new method to assess the robustness of feature rankings provided by XAI methods, especially when multicollinearity exists. Our findings indicate that our method was able to disentangle the list of the informative features underlying dementia, with important implications for aiding personalized monitoring plans.
	Address	Bordeaux; France; October 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICIP
	Notes	MILAB			Approved	no
	Call Number	Admin @ si @ SBC2022			Serial	3789
Permanent link to this record



	Author	Chengyi Zou; Shuai Wan; Marta Mrak; Marc Gorriz Blanch; Luis Herranz; Tiannan Ji
	Title	Towards Lightweight Neural Network-based Chroma Intra Prediction for Video Coding			Type	Conference Article
	Year	2022	Publication	29th IEEE International Conference on Image Processing	Abbreviated Journal
	Volume		Issue		Pages
	Keywords	Video coding; Quantization (signal); Computational modeling; Neural networks; Predictive models; Video compression; Syntactics
	Abstract	In video compression the luma channel can be useful for predicting chroma channels (Cb, Cr), as has been demonstrated with the Cross-Component Linear Model (CCLM) used in Versatile Video Coding (VVC) standard. More recently, it has been shown that neural networks can even better capture the relationship among different channels. In this paper, a new attention-based neural network is proposed for cross-component intra prediction. With the goal to simplify neural network design, the new framework consists of four branches: boundary branch and luma branch for extracting features from reference samples, attention branch for fusing the first two branches, and prediction branch for computing the predicted chroma samples. The proposed scheme is integrated into VVC test model together with one additional binary block-level syntax flag which indicates whether a given block makes use of the proposed method. Experimental results demonstrate 0.31%/2.36%/2.00% BD-rate reductions on Y/Cb/Cr components, respectively, on top of the VVC Test Model (VTM) 7.0 which uses CCLM.
	Address	Bordeaux; France; October 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICIP
	Notes	MACO			Approved	no
	Call Number	Admin @ si @ ZWM2022			Serial	3790
Permanent link to this record



	Author	Marc Oliu; Sarah Adel Bargal; Stan Sclaroff; Xavier Baro; Sergio Escalera
	Title	Multi-varied Cumulative Alignment for Domain Adaptation			Type	Conference Article
	Year	2022	Publication	6th International Conference on Image Analysis and Processing	Abbreviated Journal
	Volume	13232	Issue		Pages	324–334
	Keywords	Domain Adaptation; Computer vision; Neural networks
	Abstract	Domain Adaptation methods can be classified into two basic families of approaches: non-parametric and parametric. Non-parametric approaches depend on statistical indicators such as feature covariances to minimize the domain shift. Non-parametric approaches tend to be fast to compute and require no additional parameters, but they are unable to leverage probability density functions with complex internal structures. Parametric approaches, on the other hand, use models of the probability distributions as surrogates in minimizing the domain shift, but they require additional trainable parameters to model these distributions. In this work, we propose a new statistical approach to minimizing the domain shift based on stochastically projecting and evaluating the cumulative density function in both domains. As with non-parametric approaches, there are no additional trainable parameters. As with parametric approaches, the internal structure of both domains’ probability distributions is considered, thus leveraging a higher amount of information when reducing the domain shift. Evaluation on standard datasets used for Domain Adaptation shows better performance of the proposed model compared to non-parametric approaches while being competitive with parametric ones. (Code available at: https://github.com/moliusimon/mca).
	Address	Indonesia; October 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICIAP
	Notes	HuPBA; no menciona			Approved	no
	Call Number	Admin @ si @ OAS2022			Serial	3777
Permanent link to this record



	Author	Giuseppe De Gregorio; Sanket Biswas; Mohamed Ali Souibgui; Asma Bensalah; Josep Llados; Alicia Fornes; Angelo Marcelli
	Title	A Few Shot Multi-representation Approach for N-Gram Spotting in Historical Manuscripts			Type	Conference Article
	Year	2022	Publication	Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition (ICFHR2022)	Abbreviated Journal
	Volume	13639	Issue		Pages	3-12
	Keywords	N-gram spotting; Few-shot learning; Multimodal understanding; Historical handwritten collections
	Abstract	Despite recent advances in automatic text recognition, the performance remains moderate when it comes to historical manuscripts. This is mainly because of the scarcity of available labelled data to train the data-hungry Handwritten Text Recognition (HTR) models. The Keyword Spotting System (KWS) provides a valid alternative to HTR due to the reduction in error rate, but it is usually limited to a closed reference vocabulary. In this paper, we propose a few-shot learning paradigm for spotting sequences of a few characters (N-gram) that requires a small amount of labelled training data. We exhibit that recognition of important n-grams could reduce the system’s dependency on vocabulary. In this case, an out-of-vocabulary (OOV) word in an input handwritten line image could be a sequence of n-grams that belong to the lexicon. An extensive experimental evaluation of our proposed multi-representation approach was carried out on a subset of Bentham’s historical manuscript collections to obtain some really promising results in this direction.
	Address	December 04 – 07, 2022; Hyderabad, India
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICFHR
	Notes	DAG; 600.121; 600.162; 602.230; 600.140			Approved	no
	Call Number	Admin @ si @ GBS2022			Serial	3733
Permanent link to this record



	Author	Arnau Baro; Pau Riba; Alicia Fornes
	Title	Musigraph: Optical Music Recognition Through Object Detection and Graph Neural Network			Type	Conference Article
	Year	2022	Publication	Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition (ICFHR2022)	Abbreviated Journal
	Volume	13639	Issue		Pages	171-184
	Keywords	Object detection; Optical music recognition; Graph neural network
	Abstract	During the last decades, the performance of optical music recognition has been increasingly improving. However, and despite the 2-dimensional nature of music notation (e.g. notes have rhythm and pitch), most works treat musical scores as a sequence of symbols in one dimension, which make their recognition still a challenge. Thus, in this work we explore the use of graph neural networks for musical score recognition. First, because graphs are suited for n-dimensional representations, and second, because the combination of graphs with deep learning has shown a great performance in similar applications. Our methodology consists of: First, we will detect each isolated/atomic symbols (those that can not be decomposed in more graphical primitives) and the primitives that form a musical symbol. Then, we will build the graph taking as root node the notehead and as leaves those primitives or symbols that modify the note’s rhythm (stem, beam, flag) or pitch (flat, sharp, natural). Finally, the graph is translated into a human-readable character sequence for a final transcription and evaluation. Our method has been tested on more than five thousand measures, showing promising results.
	Address	December 04 – 07, 2022; Hyderabad, India
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICFHR
	Notes	DAG; 600.162; 600.140; 602.230			Approved	no
	Call Number	Admin @ si @ BRF2022b			Serial	3740
Permanent link to this record



	Author	Utkarsh Porwal; Alicia Fornes; Faisal Shafait (eds)
	Title	Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition. 18th International Conference, ICFHR 2022			Type	Book Whole
	Year	2022	Publication	Frontiers in Handwriting Recognition.	Abbreviated Journal
	Volume	13639	Issue		Pages
	Keywords
	Abstract
	Address	ICFHR 2022, Hyderabad, India, December 4–7, 2022
	Corporate Author				Thesis
	Publisher	Springer	Place of Publication		Editor	Utkarsh Porwal; Alicia Fornes; Faisal Shafait
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-3-031-21648-0	Medium
	Area		Expedition		Conference	ICFHR
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ PFS2022			Serial	3809
Permanent link to this record



	Author	Saiping Zhang; Luis Herranz; Marta Mrak; Marc Gorriz Blanch; Shuai Wan; Fuzheng Yang
	Title	DCNGAN: A Deformable Convolution-Based GAN with QP Adaptation for Perceptual Quality Enhancement of Compressed Video			Type	Conference Article
	Year	2022	Publication	47th International Conference on Acoustics, Speech, and Signal Processing	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	In this paper, we propose a deformable convolution-based generative adversarial network (DCNGAN) for perceptual quality enhancement of compressed videos. DCNGAN is also adaptive to the quantization parameters (QPs). Compared with optical flows, deformable convolutions are more effective and efficient to align frames. Deformable convolutions can operate on multiple frames, thus leveraging more temporal information, which is beneficial for enhancing the perceptual quality of compressed videos. Instead of aligning frames in a pairwise manner, the deformable convolution can process multiple frames simultaneously, which leads to lower computational complexity. Experimental results demonstrate that the proposed DCNGAN outperforms other state-of-the-art compressed video quality enhancement algorithms.
	Address	Virtual; May 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICASSP
	Notes	MACO; 600.161; 601.379			Approved	no
	Call Number	Admin @ si @ ZHM2022a			Serial	3765
Permanent link to this record



	Author	Guillem Martinez; Maya Aghaei; Martin Dijkstra; Bhalaji Nagarajan; Femke Jaarsma; Jaap van de Loosdrecht; Petia Radeva; Klaas Dijkstra
	Title	Hyper-Spectral Imaging for Overlapping Plastic Flakes Segmentation			Type	Conference Article
	Year	2022	Publication	47th International Conference on Acoustics, Speech, and Signal Processing	Abbreviated Journal
	Volume		Issue		Pages
	Keywords	Hyper-spectral imaging; plastic sorting; multi-label segmentation; bitfield encoding
	Abstract	In this paper, we propose a deformable convolution-based generative adversarial network (DCNGAN) for perceptual quality enhancement of compressed videos. DCNGAN is also adaptive to the quantization parameters (QPs). Compared with optical flows, deformable convolutions are more effective and efficient to align frames. Deformable convolutions can operate on multiple frames, thus leveraging more temporal information, which is beneficial for enhancing the perceptual quality of compressed videos. Instead of aligning frames in a pairwise manner, the deformable convolution can process multiple frames simultaneously, which leads to lower computational complexity. Experimental results demonstrate that the proposed DCNGAN outperforms other state-of-the-art compressed video quality enhancement algorithms.
	Address	Singapore; May 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICASSP
	Notes	MILAB; no proj			Approved	no
	Call Number	Admin @ si @ MAD2022			Serial	3767
Permanent link to this record



	Author	Nil Ballus; Bhalaji Nagarajan; Petia Radeva
	Title	Opt-SSL: An Enhanced Self-Supervised Framework for Food Recognition			Type	Conference Article
	Year	2022	Publication	10th Iberian Conference on Pattern Recognition and Image Analysis	Abbreviated Journal
	Volume	13256	Issue		Pages
	Keywords	Self-supervised; Contrastive learning; Food recognition
	Abstract	Self-supervised Learning has been showing upbeat performance in several computer vision tasks. The popular contrastive methods make use of a Siamese architecture with different loss functions. In this work, we go deeper into two very recent state of the art frameworks, namely, SimSiam and Barlow Twins. Inspired by them, we propose a new self-supervised learning method we call Opt-SSL that combines both image and feature contrasting. We validate the proposed method on the food recognition task, showing that our proposed framework enables the self-learning networks to learn better visual representations.
	Address	Aveiro; Portugal; May 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	IbPRIA
	Notes	MILAB; no menciona			Approved	no
	Call Number	Admin @ si @ BNR2022			Serial	3782
Permanent link to this record



	Author	Giacomo Magnifico; Beata Megyesi; Mohamed Ali Souibgui; Jialuo Chen; Alicia Fornes
	Title	Lost in Transcription of Graphic Signs in Ciphers			Type	Conference Article
	Year	2022	Publication	International Conference on Historical Cryptology (HistoCrypt 2022)	Abbreviated Journal
	Volume		Issue		Pages	153-158
	Keywords	transcription of ciphers; hand-written text recognition of symbols; graphic signs
	Abstract	Hand-written Text Recognition techniques with the aim to automatically identify and transcribe hand-written text have been applied to historical sources including ciphers. In this paper, we compare the performance of two machine learning architectures, an unsupervised method based on clustering and a deep learning method with few-shot learning. Both models are tested on seen and unseen data from historical ciphers with different symbol sets consisting of various types of graphic signs. We compare the models and highlight their differences in performance, with their advantages and shortcomings.
	Address	Amsterdam, Netherlands, June 20-22, 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	HystoCrypt
	Notes	DAG; 600.121; 600.162; 602.230; 600.140			Approved	no
	Call Number	Admin @ si @ MBS2022			Serial	3731
Permanent link to this record



	Author	Emanuele Vivoli; Ali Furkan Biten; Andres Mafla; Dimosthenis Karatzas; Lluis Gomez
	Title	MUST-VQA: MUltilingual Scene-text VQA			Type	Conference Article
	Year	2022	Publication	Proceedings European Conference on Computer Vision Workshops	Abbreviated Journal
	Volume	13804	Issue		Pages	345–358
	Keywords	Visual question answering; Scene text; Translation robustness; Multilingual models; Zero-shot transfer; Power of language models
	Abstract	In this paper, we present a framework for Multilingual Scene Text Visual Question Answering that deals with new languages in a zero-shot fashion. Specifically, we consider the task of Scene Text Visual Question Answering (STVQA) in which the question can be asked in different languages and it is not necessarily aligned to the scene text language. Thus, we first introduce a natural step towards a more generalized version of STVQA: MUST-VQA. Accounting for this, we discuss two evaluation scenarios in the constrained setting, namely IID and zero-shot and we demonstrate that the models can perform on a par on a zero-shot setting. We further provide extensive experimentation and show the effectiveness of adapting multilingual language models into STVQA tasks.
	Address	Tel-Aviv; Israel; October 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ECCVW
	Notes	DAG; 302.105; 600.155; 611.002			Approved	no
	Call Number	Admin @ si @ VBM2022			Serial	3770
Permanent link to this record



	Author	Sergi Garcia Bordils; Andres Mafla; Ali Furkan Biten; Oren Nuriel; Aviad Aberdam; Shai Mazor; Ron Litman; Dimosthenis Karatzas
	Title	Out-of-Vocabulary Challenge Report			Type	Conference Article
	Year	2022	Publication	Proceedings European Conference on Computer Vision Workshops	Abbreviated Journal
	Volume	13804	Issue		Pages	359–375
	Keywords
	Abstract	This paper presents final results of the Out-Of-Vocabulary 2022 (OOV) challenge. The OOV contest introduces an important aspect that is not commonly studied by Optical Character Recognition (OCR) models, namely, the recognition of unseen scene text instances at training time. The competition compiles a collection of public scene text datasets comprising of 326,385 images with 4,864,405 scene text instances, thus covering a wide range of data distributions. A new and independent validation and test set is formed with scene text instances that are out of vocabulary at training time. The competition was structured in two tasks, end-to-end and cropped scene text recognition respectively. A thorough analysis of results from baselines and different participants is presented. Interestingly, current state-of-the-art models show a significant performance gap under the newly studied setting. We conclude that the OOV dataset proposed in this challenge will be an essential area to be explored in order to develop scene text models that achieve more robust and generalized predictions.
	Address	Tel-Aviv; Israel; October 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ECCVW
	Notes	DAG; 600.155; 302.105; 611.002			Approved	no
	Call Number	Admin @ si @ GMB2022			Serial	3771
Permanent link to this record



	Author	Andrea Gemelli; Sanket Biswas; Enrico Civitelli; Josep Llados; Simone Marinai
	Title	Doc2Graph: A Task Agnostic Document Understanding Framework Based on Graph Neural Networks			Type	Conference Article
	Year	2022	Publication	17th European Conference on Computer Vision Workshops	Abbreviated Journal
	Volume	13804	Issue		Pages	329–344
	Keywords
	Abstract	Geometric Deep Learning has recently attracted significant interest in a wide range of machine learning fields, including document analysis. The application of Graph Neural Networks (GNNs) has become crucial in various document-related tasks since they can unravel important structural patterns, fundamental in key information extraction processes. Previous works in the literature propose task-driven models and do not take into account the full power of graphs. We propose Doc2Graph, a task-agnostic document understanding framework based on a GNN model, to solve different tasks given different types of documents. We evaluated our approach on two challenging datasets for key information extraction in form understanding, invoice layout analysis and table detection.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-3-031-25068-2	Medium
	Area		Expedition		Conference	ECCV-TiE
	Notes	DAG; 600.162; 600.140; 110.312			Approved	no
	Call Number	Admin @ si @ GBC2022			Serial	3795
Permanent link to this record