toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author Albert Suso; Pau Riba; Oriol Ramos Terrades; Josep Llados edit  url
openurl 
  Title A Self-supervised Inverse Graphics Approach for Sketch Parametrization Type Conference Article
  Year 2021 Publication 16th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume 12916 Issue Pages 28-42  
  Keywords  
  Abstract The study of neural generative models of handwritten text and human sketches is a hot topic in the computer vision field. The landmark SketchRNN provided a breakthrough by sequentially generating sketches as a sequence of waypoints, and more recent articles have managed to generate fully vector sketches by coding the strokes as Bézier curves. However, the previous attempts with this approach need them all a ground truth consisting in the sequence of points that make up each stroke, which seriously limits the datasets the model is able to train in. In this work, we present a self-supervised end-to-end inverse graphics approach that learns to embed each image to its best fit of Bézier curves. The self-supervised nature of the training process allows us to train the model in a wider range of datasets, but also to perform better after-training predictions by applying an overfitting process on the input binary image. We report qualitative an quantitative evaluations on the MNIST and the Quick, Draw! datasets.  
  Address (up) Lausanne; Suissa; September 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ SRR2021 Serial 3675  
Permanent link to this record
 

 
Author Sanket Biswas; Pau Riba; Josep Llados; Umapada Pal edit   pdf
url  doi
openurl 
  Title Graph-Based Deep Generative Modelling for Document Layout Generation Type Conference Article
  Year 2021 Publication 16th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume 12917 Issue Pages 525-537  
  Keywords  
  Abstract One of the major prerequisites for any deep learning approach is the availability of large-scale training data. When dealing with scanned document images in real world scenarios, the principal information of its content is stored in the layout itself. In this work, we have proposed an automated deep generative model using Graph Neural Networks (GNNs) to generate synthetic data with highly variable and plausible document layouts that can be used to train document interpretation systems, in this case, specially in digital mailroom applications. It is also the first graph-based approach for document layout generation task experimented on administrative document images, in this case, invoices.  
  Address (up) Lausanne; Suissa; September 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.121; 600.140; 110.312 Approved no  
  Call Number Admin @ si @ BRL2021 Serial 3676  
Permanent link to this record
 

 
Author Josep Llados edit  openurl
  Title The 5G of Document Intelligence Type Conference Article
  Year 2021 Publication 3rd Workshop on Future of Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address (up) Lausanne; Suissa; September 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference FDAR  
  Notes DAG Approved no  
  Call Number Admin @ si @ Serial 3677  
Permanent link to this record
 

 
Author Carola Figueroa Flores edit  isbn
openurl 
  Title Visual Saliency for Object Recognition, and Object Recognition for Visual Saliency Type Book Whole
  Year 2021 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords computer vision; visual saliency; fine-grained object recognition; convolutional neural networks; images classification  
  Abstract For humans, the recognition of objects is an almost instantaneous, precise and
extremely adaptable process. Furthermore, we have the innate capability to learn
new object classes from only few examples. The human brain lowers the complexity
of the incoming data by filtering out part of the information and only processing
those things that capture our attention. This, mixed with our biological predisposition to respond to certain shapes or colors, allows us to recognize in a simple
glance the most important or salient regions from an image. This mechanism can
be observed by analyzing on which parts of images subjects place attention; where
they fix their eyes when an image is shown to them. The most accurate way to
record this behavior is to track eye movements while displaying images.
Computational saliency estimation aims to identify to what extent regions or
objects stand out with respect to their surroundings to human observers. Saliency
maps can be used in a wide range of applications including object detection, image
and video compression, and visual tracking. The majority of research in the field has
focused on automatically estimating saliency maps given an input image. Instead, in
this thesis, we set out to incorporate saliency maps in an object recognition pipeline:
we want to investigate whether saliency maps can improve object recognition
results.
In this thesis, we identify several problems related to visual saliency estimation.
First, to what extent the estimation of saliency can be exploited to improve the
training of an object recognition model when scarce training data is available. To
solve this problem, we design an image classification network that incorporates
saliency information as input. This network processes the saliency map through a
dedicated network branch and uses the resulting characteristics to modulate the
standard bottom-up visual characteristics of the original image input. We will refer to this technique as saliency-modulated image classification (SMIC). In extensive
experiments on standard benchmark datasets for fine-grained object recognition,
we show that our proposed architecture can significantly improve performance,
especially on dataset with scarce training data.
Next, we address the main drawback of the above pipeline: SMIC requires an
explicit saliency algorithm that must be trained on a saliency dataset. To solve this,
we implement a hallucination mechanism that allows us to incorporate the saliency
estimation branch in an end-to-end trained neural network architecture that only
needs the RGB image as an input. A side-effect of this architecture is the estimation
of saliency maps. In experiments, we show that this architecture can obtain similar
results on object recognition as SMIC but without the requirement of ground truth
saliency maps to train the system.
Finally, we evaluated the accuracy of the saliency maps that occur as a sideeffect of object recognition. For this purpose, we use a set of benchmark datasets
for saliency evaluation based on eye-tracking experiments. Surprisingly, the estimated saliency maps are very similar to the maps that are computed from human
eye-tracking experiments. Our results show that these saliency maps can obtain
competitive results on benchmark saliency maps. On one synthetic saliency dataset
this method even obtains the state-of-the-art without the need of ever having seen
an actual saliency image for training.
 
  Address (up) March 2021  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Joost Van de Weijer;Bogdan Raducanu  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-122714-4-7 Medium  
  Area Expedition Conference  
  Notes LAMP; 600.120 Approved no  
  Call Number Admin @ si @ Fig2021 Serial 3600  
Permanent link to this record
 

 
Author Javad Zolfaghari Bengar; Joost Van de Weijer; Bartlomiej Twardowski; Bogdan Raducanu edit  url
doi  openurl
  Title Reducing Label Effort: Self- Supervised Meets Active Learning Type Conference Article
  Year 2021 Publication International Conference on Computer Vision Workshops Abbreviated Journal  
  Volume Issue Pages 1631-1639  
  Keywords  
  Abstract Active learning is a paradigm aimed at reducing the annotation effort by training the model on actively selected informative and/or representative samples. Another paradigm to reduce the annotation effort is self-training that learns from a large amount of unlabeled data in an unsupervised way and fine-tunes on few labeled samples. Recent developments in self-training have achieved very impressive results rivaling supervised learning on some datasets. The current work focuses on whether the two paradigms can benefit from each other. We studied object recognition datasets including CIFAR10, CIFAR100 and Tiny ImageNet with several labeling budgets for the evaluations. Our experiments reveal that self-training is remarkably more efficient than active learning at reducing the labeling effort, that for a low labeling budget, active learning offers no benefit to self-training, and finally that the combination of active learning and self-training is fruitful when the labeling budget is high. The performance gap between active learning trained either with self-training or from scratch diminishes as we approach to the point where almost half of the dataset is labeled.  
  Address (up) October 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICCVW  
  Notes LAMP; Approved no  
  Call Number Admin @ si @ ZVT2021 Serial 3672  
Permanent link to this record
 

 
Author Shiqi Yang; Yaxing Wang; Joost Van de Weijer; Luis Herranz; Shangling Jui edit   pdf
url  openurl
  Title Exploiting the Intrinsic Neighborhood Structure for Source-free Domain Adaptation Type Conference Article
  Year 2021 Publication Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021) Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Domain adaptation (DA) aims to alleviate the domain shift between source domain and target domain. Most DA methods require access to the source data, but often that is not possible (e.g. due to data privacy or intellectual property). In this paper, we address the challenging source-free domain adaptation (SFDA) problem, where the source pretrained model is adapted to the target domain in the absence of source data. Our method is based on the observation that target data, which might no longer align with the source domain classifier, still forms clear clusters. We capture this intrinsic structure by defining local affinity of the target data, and encourage label consistency among data with high local affinity. We observe that higher affinity should be assigned to reciprocal neighbors, and propose a self regularization loss to decrease the negative impact of noisy neighbors. Furthermore, to aggregate information with more context, we consider expanded neighborhoods with small affinity values. In the experimental results we verify that the inherent structure of the target features is an important source of information for domain adaptation. We demonstrate that this local structure can be efficiently captured by considering the local neighbors, the reciprocal neighbors, and the expanded neighborhood. Finally, we achieve state-of-the-art performance on several 2D image and 3D point cloud recognition datasets. Code is available in https://github.com/Albert0147/SFDA_neighbors.  
  Address (up) Online; December 7-10, 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference NIPS  
  Notes LAMP; 600.147; 600.141 Approved no  
  Call Number Admin @ si @ Serial 3691  
Permanent link to this record
 

 
Author Hassan Ahmed Sial edit  isbn
openurl 
  Title Estimating Light Effects from a Single Image: Deep Architectures and Ground-Truth Generation Type Book Whole
  Year 2021 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract In this thesis, we explore how to estimate the effects of the light interacting with the scene objects from a single image. To achieve this goal, we focus on recovering intrinsic components like reflectance, shading, or light properties such as color and position using deep architectures. The success of these approaches relies on training on large and diversified image datasets. Therefore, we present several contributions on this such as: (a) a data-augmentation technique; (b) a ground-truth for an existing multi-illuminant dataset; (c) a family of synthetic datasets, SID for Surreal Intrinsic Datasets, with diversified backgrounds and coherent light conditions; and (d) a practical pipeline to create hybrid ground-truths to overcome the complexity of acquiring realistic light conditions in a massive way. In parallel with the creation of datasets, we trained different flexible encoder-decoder deep architectures incorporating physical constraints from the image formation models.

In the last part of the thesis, we apply all the previous experience to two different problems. Firstly, we create a large hybrid Doc3DShade dataset with real shading and synthetic reflectance under complex illumination conditions, that is used to train a two-stage architecture that improves the character recognition task in complex lighting conditions of unwrapped documents. Secondly, we tackle the problem of single image scene relighting by extending both, the SID dataset to present stronger shading and shadows effects, and the deep architectures to use intrinsic components to estimate new relit images.
 
  Address (up) September 2021  
  Corporate Author Thesis Ph.D. thesis  
  Publisher IMPRIMA Place of Publication Editor Maria Vanrell;Ramon Baldrich  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-122714-8-5 Medium  
  Area Expedition Conference  
  Notes CIC; Approved no  
  Call Number Admin @ si @ Sia2021 Serial 3607  
Permanent link to this record
 

 
Author Javad Zolfaghari Bengar; Bogdan Raducanu; Joost Van de Weijer edit  url
openurl 
  Title When Deep Learners Change Their Mind: Learning Dynamics for Active Learning Type Conference Article
  Year 2021 Publication 19th International Conference on Computer Analysis of Images and Patterns Abbreviated Journal  
  Volume 13052 Issue 1 Pages 403-413  
  Keywords  
  Abstract Active learning aims to select samples to be annotated that yield the largest performance improvement for the learning algorithm. Many methods approach this problem by measuring the informativeness of samples and do this based on the certainty of the network predictions for samples. However, it is well-known that neural networks are overly confident about their prediction and are therefore an untrustworthy source to assess sample informativeness. In this paper, we propose a new informativeness-based active learning method. Our measure is derived from the learning dynamics of a neural network. More precisely we track the label assignment of the unlabeled data pool during the training of the algorithm. We capture the learning dynamics with a metric called label-dispersion, which is low when the network consistently assigns the same label to the sample during the training of the network and high when the assigned label changes frequently. We show that label-dispersion is a promising predictor of the uncertainty of the network, and show on two benchmark datasets that an active learning algorithm based on label-dispersion obtains excellent results.  
  Address (up) September 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CAIP  
  Notes LAMP; Approved no  
  Call Number Admin @ si @ ZRV2021 Serial 3673  
Permanent link to this record
 

 
Author Hugo Bertiche; Meysam Madadi; Sergio Escalera edit   pdf
openurl 
  Title PBNS: Physically Based Neural Simulation for Unsupervised Garment Pose Space Deformation Type Conference Article
  Year 2021 Publication 14th ACM Siggraph Conference and exhibition on Computer Graphics and Interactive Techniques in Asia Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract We present a methodology to automatically obtain Pose Space Deformation (PSD) basis for rigged garments through deep learning. Classical approaches rely on Physically Based Simulations (PBS) to animate clothes. These are general solutions that, given a sufficiently fine-grained discretization of space and time, can achieve highly realistic results. However, they are computationally expensive and any scene modification prompts the need of re-simulation. Linear Blend Skinning (LBS) with PSD offers a lightweight alternative to PBS, though, it needs huge volumes of data to learn proper PSD. We propose using deep learning, formulated as an implicit PBS, to unsupervisedly learn realistic cloth Pose Space Deformations in a constrained scenario: dressed humans. Furthermore, we show it is possible to train these models in an amount of time comparable to a PBS of a few sequences. To the best of our knowledge, we are the first to propose a neural simulator for cloth.
While deep-based approaches in the domain are becoming a trend, these are data-hungry models. Moreover, authors often propose complex formulations to better learn wrinkles from PBS data. Supervised learning leads to physically inconsistent predictions that require collision solving to be used. Also, dependency on PBS data limits the scalability of these solutions, while their formulation hinders its applicability and compatibility. By proposing an unsupervised methodology to learn PSD for LBS models (3D animation standard), we overcome both of these drawbacks. Results obtained show cloth-consistency in the animated garments and meaningful pose-dependant folds and wrinkles. Our solution is extremely efficient, handles multiple layers of cloth, allows unsupervised outfit resizing and can be easily applied to any custom 3D avatar.
 
  Address (up) Virtual; December 2020  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference SIGGRAPH  
  Notes HUPBA; no proj Approved no  
  Call Number Admin @ si @ BME2021b Serial 3641  
Permanent link to this record
 

 
Author Diego Porres edit   pdf
url  openurl
  Title Discriminator Synthesis: On reusing the other half of Generative Adversarial Networks Type Conference Article
  Year 2021 Publication Machine Learning for Creativity and Design, Neurips Workshop Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Generative Adversarial Networks have long since revolutionized the world of computer vision and, tied to it, the world of art. Arduous efforts have gone into fully utilizing and stabilizing training so that outputs of the Generator network have the highest possible fidelity, but little has gone into using the Discriminator after training is complete. In this work, we propose to use the latter and show a way to use the features it has learned from the training dataset to both alter an image and generate one from scratch. We name this method Discriminator Dreaming, and the full code can be found at this https URL.  
  Address (up) Virtual; December 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference NEURIPSW  
  Notes ADAS; 601.365 Approved no  
  Call Number Admin @ si @ Por2021 Serial 3597  
Permanent link to this record
 

 
Author Albert Rial-Farras; Meysam Madadi; Sergio Escalera edit   pdf
url  doi
openurl 
  Title UV-based reconstruction of 3D garments from a single RGB image Type Conference Article
  Year 2021 Publication 16th IEEE International Conference on Automatic Face and Gesture Recognition Abbreviated Journal  
  Volume Issue Pages 1-8  
  Keywords  
  Abstract Garments are highly detailed and dynamic objects made up of particles that interact with each other and with other objects, making the task of 2D to 3D garment reconstruction extremely challenging. Therefore, having a lightweight 3D representation capable of modelling fine details is of great importance. This work presents a deep learning framework based on Generative Adversarial Networks (GANs) to reconstruct 3D garment models from a single RGB image. It has the peculiarity of using UV maps to represent 3D data, a lightweight representation capable of dealing with high-resolution details and wrinkles. With this model and kind of 3D representation, we achieve state-of-the-art results on the CLOTH3D++ dataset, generating good quality and realistic garment reconstructions regardless of the garment topology and shape, human pose, occlusions and lightning.  
  Address (up) Virtual; December 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference FG  
  Notes HUPBA; no proj Approved no  
  Call Number Admin @ si @ RME2021 Serial 3639  
Permanent link to this record
 

 
Author Hugo Bertiche; Meysam Madadi; Sergio Escalera edit   pdf
url  doi
openurl 
  Title Deep Parametric Surfaces for 3D Outfit Reconstruction from Single View Image Type Conference Article
  Year 2021 Publication 16th IEEE International Conference on Automatic Face and Gesture Recognition Abbreviated Journal  
  Volume Issue Pages 1-8  
  Keywords  
  Abstract We present a methodology to retrieve analytical surfaces parametrized as a neural network. Previous works on 3D reconstruction yield point clouds, voxelized objects or meshes. Instead, our approach yields 2-manifolds in the euclidean space through deep learning. To this end, we implement a novel formulation for fully connected layers as parametrized manifolds that allows continuous predictions with differential geometry. Based on this property we propose a novel smoothness loss. Results on CLOTH3D++ dataset show the possibility to infer different topologies and the benefits of the smoothness term based on differential geometry.  
  Address (up) Virtual; December 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference FG  
  Notes HUPBA; no proj Approved no  
  Call Number Admin @ si @ BME2021 Serial 3640  
Permanent link to this record
 

 
Author Carola Figueroa Flores; Bogdan Raducanu; David Berga; Joost Van de Weijer edit   pdf
openurl 
  Title Hallucinating Saliency Maps for Fine-Grained Image Classification for Limited Data Domains Type Conference Article
  Year 2021 Publication 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications Abbreviated Journal  
  Volume 4 Issue Pages 163-171  
  Keywords  
  Abstract arXiv:2007.12562
Most of the saliency methods are evaluated on their ability to generate saliency maps, and not on their functionality in a complete vision pipeline, like for instance, image classification. In the current paper, we propose an approach which does not require explicit saliency maps to improve image classification, but they are learned implicitely, during the training of an end-to-end image classification task. We show that our approach obtains similar results as the case when the saliency maps are provided explicitely. Combining RGB data with saliency maps represents a significant advantage for object recognition, especially for the case when training data is limited. We validate our method on several datasets for fine-grained classification tasks (Flowers, Birds and Cars). In addition, we show that our saliency estimation method, which is trained without any saliency groundtruth data, obtains competitive results on real image saliency benchmark (Toronto), and outperforms deep saliency models with synthetic images (SID4VAM).
 
  Address (up) Virtual; February 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference VISAPP  
  Notes LAMP Approved no  
  Call Number Admin @ si @ FRB2021c Serial 3540  
Permanent link to this record
 

 
Author Arturo Fuentes; F. Javier Sanchez; Thomas Voncina; Jorge Bernal edit  doi
openurl 
  Title LAMV: Learning to Predict Where Spectators Look in Live Music Performances Type Conference Article
  Year 2021 Publication 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications Abbreviated Journal  
  Volume 5 Issue Pages 500-507  
  Keywords  
  Abstract The advent of artificial intelligence has supposed an evolution on how different daily work tasks are performed. The analysis of cultural content has seen a huge boost by the development of computer-assisted methods that allows easy and transparent data access. In our case, we deal with the automation of the production of live shows, like music concerts, aiming to develop a system that can indicate the producer which camera to show based on what each of them is showing. In this context, we consider that is essential to understand where spectators look and what they are interested in so the computational method can learn from this information. The work that we present here shows the results of a first preliminary study in which we compare areas of interest defined by human beings and those indicated by an automatic system. Our system is based on the extraction of motion textures from dynamic Spatio-Temporal Volumes (STV) and then analyzing the patterns by means of texture analysis techniques. We validate our approach over several video sequences that have been labeled by 16 different experts. Our method is able to match those relevant areas identified by the experts, achieving recall scores higher than 80% when a distance of 80 pixels between method and ground truth is considered. Current performance shows promise when detecting abnormal peaks and movement trends.  
  Address (up) Virtual; February 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference VISIGRAPP  
  Notes MV; ISE; 600.119; Approved no  
  Call Number Admin @ si @ FSV2021 Serial 3570  
Permanent link to this record
 

 
Author Parichehr Behjati Ardakani; Pau Rodriguez; Armin Mehri; Isabelle Hupont; Carles Fernandez; Jordi Gonzalez edit   pdf
doi  openurl
  Title OverNet: Lightweight Multi-Scale Super-Resolution with Overscaling Network Type Conference Article
  Year 2021 Publication IEEE Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages 2693-2702  
  Keywords  
  Abstract Super-resolution (SR) has achieved great success due to the development of deep convolutional neural networks (CNNs). However, as the depth and width of the networks increase, CNN-based SR methods have been faced with the challenge of computational complexity in practice. More- over, most SR methods train a dedicated model for each target resolution, losing generality and increasing memory requirements. To address these limitations we introduce OverNet, a deep but lightweight convolutional network to solve SISR at arbitrary scale factors with a single model. We make the following contributions: first, we introduce a lightweight feature extractor that enforces efficient reuse of information through a novel recursive structure of skip and dense connections. Second, to maximize the performance of the feature extractor, we propose a model agnostic reconstruction module that generates accurate high-resolution images from overscaled feature maps obtained from any SR architecture. Third, we introduce a multi-scale loss function to achieve generalization across scales. Experiments show that our proposal outperforms previous state-of-the-art approaches in standard benchmarks, while maintaining relatively low computation and memory requirements.  
  Address (up) Virtual; January 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes ISE; 600.119; 600.098 Approved no  
  Call Number Admin @ si @ BRM2021 Serial 3512  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: