Publicacions CVC -- Query Results

	121–135 of 3403 records found matching your query:

	Records
	Author	Klara Janouskova; Jiri Matas; Lluis Gomez; Dimosthenis Karatzas
	Title	Text Recognition – Real World Data and Where to Find Them			Type	Conference Article
	Year	2020	Publication	25th International Conference on Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages	4489-4496
	Keywords
	Abstract	We present a method for exploiting weakly annotated images to improve text extraction pipelines. The approach uses an arbitrary end-to-end text recognition system to obtain text region proposals and their, possibly erroneous, transcriptions. The method includes matching of imprecise transcriptions to weak annotations and an edit distance guided neighbourhood search. It produces nearly error-free, localised instances of scene text, which we treat as “pseudo ground truth” (PGT). The method is applied to two weakly-annotated datasets. Training with the extracted PGT consistently improves the accuracy of a state of the art recognition model, by 3.7% on average, across different benchmark datasets (image domains) and 24.5% on one of the weakly annotated datasets. Acknowledgements: The authors were supported by Czech Technical University student grant SGS20/171/OHK3/3T/13, the MEYS VVV project CZ.02.1.01/0.0/0.0/16_019/0000765 Research Center for Informatics, the Spanish Research project TIN2017-89779-P and the CERCA Programme / Generalitat de Catalunya.
	Address	Virtual; January 2021
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICPR
	Notes	DAG; 600.121; 600.129			Approved	no
	Call Number	Admin @ si @ JMG2020			Serial	3557



	Author	Armin Mehri; Parichehr Behjati Ardakani; Angel Sappa
	Title	MPRNet: Multi-Path Residual Network for Lightweight Image Super Resolution			Type	Conference Article
	Year	2021	Publication	IEEE Winter Conference on Applications of Computer Vision	Abbreviated Journal
	Volume		Issue		Pages	2703-2712
	Keywords
	Abstract	Lightweight super resolution networks are extremely important for real-world applications. In recent years, several SR deep learning approaches with outstanding results have been introduced at the expense of memory and computational cost. To overcome this problem, a novel lightweight super resolution network is proposed, which improves the SOTA performance in lightweight SR and performs roughly similarly to computationally expensive networks. The Multi-Path Residual Network is designed with a set of Residual concatenation Blocks stacked with Adaptive Residual Blocks: (i) to adaptively extract informative features and learn more expressive spatial context information; (ii) to better leverage multi-level representations before the up-sampling stage; and (iii) to allow an efficient information and gradient flow within the network. The proposed architecture also contains a new attention mechanism, the Two-Fold Attention Module, to maximize the representation ability of the model. Extensive experiments show the superiority of our model over other SOTA SR approaches.
	Address	Virtual; January 2021
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WACV
	Notes	MSIAU; 600.130; 600.122			Approved	no
	Call Number	Admin @ si @ MAS2021b			Serial	3582



	Author	Armin Mehri; Parichehr Behjati Ardakani; Angel Sappa
	Title	LiNet: A Lightweight Network for Image Super Resolution			Type	Conference Article
	Year	2021	Publication	25th International Conference on Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages	7196-7202
	Keywords
	Abstract	This paper proposes a new lightweight network, LiNet, that enhances technical efficiency in lightweight super resolution and performs approximately as well as networks that are very large and costly in terms of number of network parameters and operations. The proposed architecture allows the network to learn more abstract properties by avoiding low-level information via multiple links. LiNet introduces a Compact Dense Module, which contains a set of inner and outer blocks, to efficiently extract meaningful information, to better leverage multi-level representations before the upsampling stage, and to allow an efficient information and gradient flow within the network. Experiments on benchmark datasets show that the proposed LiNet achieves favorable performance against lightweight state-of-the-art methods.
	Address	Virtual; January 2021
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICPR
	Notes	MSIAU; 600.130; 600.122			Approved	no
	Call Number	Admin @ si @ MAS2021a			Serial	3583



	Author	Ajian Liu; Zichang Tan; Jun Wan; Sergio Escalera; Guodong Guo; Stan Z. Li
	Title	CASIA-SURF CeFA: A Benchmark for Multi-modal Cross-Ethnicity Face Anti-Spoofing			Type	Conference Article
	Year	2021	Publication	IEEE Winter Conference on Applications of Computer Vision	Abbreviated Journal
	Volume		Issue		Pages	1178-1186
	Keywords
	Abstract	Ethnic bias has been shown in previous works to affect the performance of face recognition, while it remains unexplored in face anti-spoofing. Therefore, in order to study ethnic bias in face anti-spoofing, we introduce the largest CASIA-SURF Cross-ethnicity Face Anti-spoofing (CeFA) dataset, covering 3 ethnicities, 3 modalities, 1,607 subjects, and 2D plus 3D attack types. Five protocols are introduced to measure its effect under varied evaluation conditions, such as cross-ethnicity, unknown spoofs, or both. To our knowledge, CASIA-SURF CeFA is the first dataset among currently released datasets to include explicit ethnic labels. We then propose a novel multi-modal fusion method as a strong baseline to alleviate the ethnic bias, which employs a partially shared fusion strategy to learn complementary information from multiple modalities. Extensive experiments have been conducted on the proposed dataset to verify its significance and generalization capability for other existing datasets, i.e., the CASIA-SURF, OULU-NPU and SiW datasets. The dataset is available at https://sites.google.com/qq.com/face-anti-spoofing/welcome/challengecvpr2020?authuser=0.
	Address	Virtual; January 2021
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WACV
	Notes	HUPBA; no proj			Approved	no
	Call Number	Admin @ si @ LTW2021			Serial	3661



	Author	Bhalaji Nagarajan; Ricardo Marques; Marcos Mejia; Petia Radeva
	Title	Class-conditional Importance Weighting for Deep Learning with Noisy Labels			Type	Conference Article
	Year	2022	Publication	17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications	Abbreviated Journal
	Volume	5	Issue		Pages	679-686
	Keywords	Noisy Labeling; Loss Correction; Class-conditional Importance Weighting; Learning with Noisy Labels
	Abstract	Large-scale accurate labels are very important for training Deep Neural Networks and ensuring high performance. However, creating a clean dataset is very expensive, since it usually relies on human interaction. For this reason, the labelling process is often made cheaper at the cost of noisy labels. Learning with Noisy Labels is an active and very challenging area of research. Recent advances in self-supervised learning and robust loss functions have helped advance noisy label research. In this paper, we propose a loss correction method that relies on dynamic weights computed based on the model training. We extend the existing Contrast to Divide algorithm coupled with DivideMix using a new class-conditional weighting scheme. We validate the method using the standard noise experiments and achieve encouraging results.
	Address	Virtual; February 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	VISAPP
	Notes	MILAB; not mentioned			Approved	no
	Call Number	Admin @ si @ NMM2022			Serial	3798



	Author	Carola Figueroa Flores; Bogdan Raducanu; David Berga; Joost Van de Weijer
	Title	Hallucinating Saliency Maps for Fine-Grained Image Classification for Limited Data Domains			Type	Conference Article
	Year	2021	Publication	16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications	Abbreviated Journal
	Volume	4	Issue		Pages	163-171
	Keywords
	Abstract	Most saliency methods are evaluated on their ability to generate saliency maps, and not on their functionality in a complete vision pipeline, such as image classification. In the current paper, we propose an approach which does not require explicit saliency maps to improve image classification; instead, they are learned implicitly during the training of an end-to-end image classification task. We show that our approach obtains results similar to the case in which the saliency maps are provided explicitly. Combining RGB data with saliency maps represents a significant advantage for object recognition, especially when training data is limited. We validate our method on several datasets for fine-grained classification tasks (Flowers, Birds and Cars). In addition, we show that our saliency estimation method, which is trained without any saliency ground-truth data, obtains competitive results on a real image saliency benchmark (Toronto), and outperforms deep saliency models on synthetic images (SID4VAM). arXiv:2007.12562
	Address	Virtual; February 2021
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	VISAPP
	Notes	LAMP			Approved	no
	Call Number	Admin @ si @ FRB2021c			Serial	3540



	Author	Arturo Fuentes; F. Javier Sanchez; Thomas Voncina; Jorge Bernal
	Title	LAMV: Learning to Predict Where Spectators Look in Live Music Performances			Type	Conference Article
	Year	2021	Publication	16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications	Abbreviated Journal
	Volume	5	Issue		Pages	500-507
	Keywords
	Abstract	The advent of artificial intelligence has changed how different daily work tasks are performed. The analysis of cultural content has been greatly boosted by the development of computer-assisted methods that allow easy and transparent data access. In our case, we deal with the automation of the production of live shows, such as music concerts, aiming to develop a system that can indicate to the producer which camera to show based on what each of them is showing. In this context, we consider it essential to understand where spectators look and what they are interested in, so that the computational method can learn from this information. The work that we present here shows the results of a first preliminary study in which we compare areas of interest defined by human beings with those indicated by an automatic system. Our system is based on the extraction of motion textures from dynamic Spatio-Temporal Volumes (STV) and the subsequent analysis of the patterns by means of texture analysis techniques. We validate our approach on several video sequences that have been labeled by 16 different experts. Our method is able to match the relevant areas identified by the experts, achieving recall scores higher than 80% when a distance of 80 pixels between method and ground truth is considered. Current performance shows promise when detecting abnormal peaks and movement trends.
	Address	Virtual; February 2021
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	VISIGRAPP
	Notes	MV; ISE; 600.119			Approved	no
	Call Number	Admin @ si @ FSV2021			Serial	3570



	Author	Diego Porres
	Title	Discriminator Synthesis: On reusing the other half of Generative Adversarial Networks			Type	Conference Article
	Year	2021	Publication	Machine Learning for Creativity and Design, Neurips Workshop	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Generative Adversarial Networks have long since revolutionized the world of computer vision and, tied to it, the world of art. Arduous efforts have gone into fully utilizing and stabilizing training so that outputs of the Generator network have the highest possible fidelity, but little has gone into using the Discriminator after training is complete. In this work, we propose to use the latter and show a way to use the features it has learned from the training dataset to both alter an image and generate one from scratch. We name this method Discriminator Dreaming, and the full code can be found at this https URL.
	Address	Virtual; December 2021
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	NEURIPSW
	Notes	ADAS; 601.365			Approved	no
	Call Number	Admin @ si @ Por2021			Serial	3597



	Author	Albert Rial-Farras; Meysam Madadi; Sergio Escalera
	Title	UV-based reconstruction of 3D garments from a single RGB image			Type	Conference Article
	Year	2021	Publication	16th IEEE International Conference on Automatic Face and Gesture Recognition	Abbreviated Journal
	Volume		Issue		Pages	1-8
	Keywords
	Abstract	Garments are highly detailed and dynamic objects made up of particles that interact with each other and with other objects, making the task of 2D to 3D garment reconstruction extremely challenging. Therefore, having a lightweight 3D representation capable of modelling fine details is of great importance. This work presents a deep learning framework based on Generative Adversarial Networks (GANs) to reconstruct 3D garment models from a single RGB image. It has the peculiarity of using UV maps to represent 3D data, a lightweight representation capable of dealing with high-resolution details and wrinkles. With this model and kind of 3D representation, we achieve state-of-the-art results on the CLOTH3D++ dataset, generating good quality and realistic garment reconstructions regardless of the garment topology and shape, human pose, occlusions and lighting.
	Address	Virtual; December 2021
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	FG
	Notes	HUPBA; no proj			Approved	no
	Call Number	Admin @ si @ RME2021			Serial	3639



	Author	Hugo Bertiche; Meysam Madadi; Sergio Escalera
	Title	Deep Parametric Surfaces for 3D Outfit Reconstruction from Single View Image			Type	Conference Article
	Year	2021	Publication	16th IEEE International Conference on Automatic Face and Gesture Recognition	Abbreviated Journal
	Volume		Issue		Pages	1-8
	Keywords
	Abstract	We present a methodology to retrieve analytical surfaces parametrized as a neural network. Previous works on 3D reconstruction yield point clouds, voxelized objects or meshes. Instead, our approach yields 2-manifolds in Euclidean space through deep learning. To this end, we implement a novel formulation for fully connected layers as parametrized manifolds that allows continuous predictions with differential geometry. Based on this property, we propose a novel smoothness loss. Results on the CLOTH3D++ dataset show the possibility of inferring different topologies and the benefits of the smoothness term based on differential geometry.
	Address	Virtual; December 2021
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	FG
	Notes	HUPBA; no proj			Approved	no
	Call Number	Admin @ si @ BME2021			Serial	3640



	Author	Mohamed Ali Souibgui; Y. Kessentini; Alicia Fornes
	Title	A conditional GAN based approach for distorted camera captured documents recovery			Type	Conference Article
	Year	2020	Publication	4th Mediterranean Conference on Pattern Recognition and Artificial Intelligence	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address	Virtual; December 2020
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	MedPRAI
	Notes	DAG; 600.121			Approved	no
	Call Number	Admin @ si @ SKF2020			Serial	3450



	Author	Riccardo Del Chiaro; Bartlomiej Twardowski; Andrew Bagdanov; Joost Van de Weijer
	Title	Recurrent attention to transient tasks for continual image captioning			Type	Conference Article
	Year	2020	Publication	34th Conference on Neural Information Processing Systems	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Research on continual learning has led to a variety of approaches to mitigating catastrophic forgetting in feed-forward classification networks. Until now surprisingly little attention has been focused on continual learning of recurrent models applied to problems like image captioning. In this paper we take a systematic look at continual learning of LSTM-based models for image captioning. We propose an attention-based approach that explicitly accommodates the transient nature of vocabularies in continual image captioning tasks -- i.e. that task vocabularies are not disjoint. We call our method Recurrent Attention to Transient Tasks (RATT), and also show how to adapt continual learning approaches based on weight regularization and knowledge distillation to recurrent continual learning problems. We apply our approaches to the incremental image captioning problem on two new continual learning benchmarks we define using the MS-COCO and Flickr30 datasets. Our results demonstrate that RATT is able to sequentially learn five captioning tasks while incurring no forgetting of previously learned ones.
	Address	Virtual; December 2020
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	NEURIPS
	Notes	LAMP; 600.120			Approved	no
	Call Number	Admin @ si @ CTB2020			Serial	3484



	Author	Yaxing Wang; Lu Yu; Joost Van de Weijer
	Title	DeepI2I: Enabling Deep Hierarchical Image-to-Image Translation by Transferring from GANs			Type	Conference Article
	Year	2020	Publication	34th Conference on Neural Information Processing Systems	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Image-to-image translation has recently achieved remarkable results. But despite current success, it suffers from inferior performance when translations between classes require large shape changes. We attribute this to the high-resolution bottlenecks which are used by current state-of-the-art image-to-image methods. Therefore, in this work, we propose a novel deep hierarchical Image-to-Image Translation method, called DeepI2I. We learn a model by leveraging hierarchical features: (a) structural information contained in the shallow layers and (b) semantic information extracted from the deep layers. To enable the training of deep I2I models on small datasets, we propose a novel transfer learning method that transfers knowledge from pre-trained GANs. Specifically, we leverage the discriminator of a pre-trained GAN (i.e. BigGAN or StyleGAN) to initialize both the encoder and the discriminator, and the pre-trained generator to initialize the generator of our model. Applying knowledge transfer leads to an alignment problem between the encoder and generator. We introduce an adaptor network to address this. On many-class image-to-image translation on three datasets (Animal faces, Birds, and Foods) we decrease mFID by at least 35% when compared to the state-of-the-art. Furthermore, we qualitatively and quantitatively demonstrate that transfer learning significantly improves the performance of I2I systems, especially for small datasets. Finally, we are the first to perform I2I translations for domains with over 100 classes.
	Address	Virtual; December 2020
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	NEURIPS
	Notes	LAMP; 600.120			Approved	no
	Call Number	Admin @ si @ WYW2020			Serial	3485



	Author	Hugo Bertiche; Meysam Madadi; Sergio Escalera
	Title	PBNS: Physically Based Neural Simulation for Unsupervised Garment Pose Space Deformation			Type	Conference Article
	Year	2021	Publication	14th ACM Siggraph Conference and exhibition on Computer Graphics and Interactive Techniques in Asia	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	We present a methodology to automatically obtain Pose Space Deformation (PSD) basis for rigged garments through deep learning. Classical approaches rely on Physically Based Simulations (PBS) to animate clothes. These are general solutions that, given a sufficiently fine-grained discretization of space and time, can achieve highly realistic results. However, they are computationally expensive and any scene modification prompts the need for re-simulation. Linear Blend Skinning (LBS) with PSD offers a lightweight alternative to PBS, though it needs huge volumes of data to learn proper PSD. We propose using deep learning, formulated as an implicit PBS, to learn realistic cloth Pose Space Deformations without supervision in a constrained scenario: dressed humans. Furthermore, we show it is possible to train these models in an amount of time comparable to a PBS of a few sequences. To the best of our knowledge, we are the first to propose a neural simulator for cloth. While deep-based approaches in the domain are becoming a trend, these are data-hungry models. Moreover, authors often propose complex formulations to better learn wrinkles from PBS data. Supervised learning leads to physically inconsistent predictions that require collision solving to be used. Also, dependency on PBS data limits the scalability of these solutions, while their formulation hinders their applicability and compatibility. By proposing an unsupervised methodology to learn PSD for LBS models (3D animation standard), we overcome both of these drawbacks. Results obtained show cloth-consistency in the animated garments and meaningful pose-dependent folds and wrinkles. Our solution is extremely efficient, handles multiple layers of cloth, allows unsupervised outfit resizing and can be easily applied to any custom 3D avatar.
	Address	Virtual; December 2020
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	SIGGRAPH
	Notes	HUPBA; no proj			Approved	no
	Call Number	Admin @ si @ BME2021b			Serial	3641



	Author	Raul Gomez; Jaume Gibert; Lluis Gomez; Dimosthenis Karatzas
	Title	Location Sensitive Image Retrieval and Tagging			Type	Conference Article
	Year	2020	Publication	16th European Conference on Computer Vision	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	People from different parts of the globe describe objects and concepts in distinct manners. Visual appearance can thus vary across different geographic locations, which makes location relevant contextual information when analysing visual data. In this work, we address the task of image retrieval related to a given tag conditioned on a certain location on Earth. We present LocSens, a model that learns to rank triplets of images, tags and coordinates by plausibility, and two training strategies to balance the location influence in the final ranking. LocSens learns to fuse textual and location information of multimodal queries to retrieve related images at different levels of location granularity, and successfully utilizes location information to improve image tagging.
	Address	Virtual; August 2020
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ECCV
	Notes	DAG; 600.121; 600.129			Approved	no
	Call Number	Admin @ si @ GGG2020b			Serial	3420