Home | << 1 2 3 4 5 6 7 8 9 10 >> [11–20] |
![]() |
Records | |||||
---|---|---|---|---|---|
Author | Santiago Segui; Michal Drozdzal; Guillem Pascual; Petia Radeva; Carolina Malagelada; Fernando Azpiroz; Jordi Vitria | ||||
Title | Generic Feature Learning for Wireless Capsule Endoscopy Analysis | Type | Journal Article | ||
Year | 2016 | Publication | Computers in Biology and Medicine | Abbreviated Journal | CBM |
Volume | 79 | Issue ![]() |
Pages | 163-172 | |
Keywords | Wireless capsule endoscopy; Deep learning; Feature learning; Motility analysis | ||||
Abstract | The interpretation and analysis of wireless capsule endoscopy (WCE) recordings is a complex task which requires sophisticated computer aided decision (CAD) systems to help physicians with video screening and, finally, with the diagnosis. Most CAD systems used in capsule endoscopy share a common system design, but use very different image and video representations. As a result, each time a new clinical application of WCE appears, a new CAD system has to be designed from the scratch. This makes the design of new CAD systems very time consuming. Therefore, in this paper we introduce a system for small intestine motility characterization, based on Deep Convolutional Neural Networks, which circumvents the laborious step of designing specific features for individual motility events. Experimental results show the superiority of the learned features over alternative classifiers constructed using state-of-the-art handcrafted features. In particular, it reaches a mean classification accuracy of 96% for six intestinal motility events, outperforming the other classifiers by a large margin (a 14% relative performance increase). | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | OR; MILAB;MV; | Approved | no | ||
Call Number | Admin @ si @ SDP2016 | Serial | 2836 | ||
Permanent link to this record | |||||
Author | Mikkel Thogersen; Sergio Escalera; Jordi Gonzalez; Thomas B. Moeslund | ||||
Title | Segmentation of RGB-D Indoor scenes by Stacking Random Forests and Conditional Random Fields | Type | Journal Article | ||
Year | 2016 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 80 | Issue ![]() |
Pages | 208–215 | |
Keywords | |||||
Abstract | This paper proposes a technique for RGB-D scene segmentation using Multi-class
Multi-scale Stacked Sequential Learning (MMSSL) paradigm. Following recent trends in state-of-the-art, a base classifier uses an initial SLIC segmentation to obtain superpixels which provide a diminution of data while retaining object boundaries. A series of color and depth features are extracted from the superpixels, and are used in a Conditional Random Field (CRF) to predict superpixel labels. Furthermore, a Random Forest (RF) classifier using random offset features is also used as an input to the CRF, acting as an initial prediction. As a stacked classifier, another Random Forest is used acting on a spatial multi-scale decomposition of the CRF confidence map to correct the erroneous labels assigned by the previous classifier. The model is tested on the popular NYU-v2 dataset. The approach shows that simple multi-modal features with the power of the MMSSL paradigm can achieve better performance than state of the art results on the same dataset. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA; ISE;MILAB; 600.098; 600.119 | Approved | no | ||
Call Number | Admin @ si @ TEG2016 | Serial | 2843 | ||
Permanent link to this record | |||||
Author | Sergio Escalera; Jordi Gonzalez; Xavier Baro; Jamie Shotton | ||||
Title | Guest Editor Introduction to the Special Issue on Multimodal Human Pose Recovery and Behavior Analysis | Type | Journal Article | ||
Year | 2016 | Publication | IEEE Transactions on Pattern Analysis and Machine Intelligence | Abbreviated Journal | TPAMI |
Volume | 28 | Issue ![]() |
Pages | 1489 - 1491 | |
Keywords | |||||
Abstract | The sixteen papers in this special section focus on human pose recovery and behavior analysis (HuPBA). This is one of the most challenging topics in computer vision, pattern analysis, and machine learning. It is of critical importance for application areas that include gaming, computer interaction, human robot interaction, security, commerce, assistive technologies and rehabilitation, sports, sign language recognition, and driver assistance technology, to mention just a few. In essence, HuPBA requires dealing with the articulated nature of the human body, changes in appearance due to clothing, and the inherent problems of clutter scenes, such as background artifacts, occlusions, and illumination changes. These papers represent the most recent research in this field, including new methods considering still images, image sequences, depth data, stereo vision, 3D vision, audio, and IMUs, among others. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA; ISE;MV; | Approved | no | ||
Call Number | Admin @ si @ | Serial | 2851 | ||
Permanent link to this record | |||||
Author | Pau Riba; Josep Llados; Alicia Fornes; Anjan Dutta | ||||
Title | Large-scale graph indexing using binary embeddings of node contexts for information spotting in document image databases | Type | Journal Article | ||
Year | 2017 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 87 | Issue ![]() |
Pages | 203-211 | |
Keywords | |||||
Abstract | Graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their representational power in front of classical appearance-based representations. However, retrieving a query graph from a large dataset of graphs implies a high computational complexity. The most important property for a large-scale retrieval is the search time complexity to be sub-linear in the number of database examples. With this aim, in this paper we propose a graph indexation formalism applied to visual retrieval. A binary embedding is defined as hashing keys for graph nodes. Given a database of labeled graphs, graph nodes are complemented with vectors of attributes representing their local context. Then, each attribute vector is converted to a binary code applying a binary-valued hash function. Therefore, graph retrieval is formulated in terms of finding target graphs in the database whose nodes have a small Hamming distance from the query nodes, easily computed with bitwise logical operators. As an application example, we validate the performance of the proposed methods in different real scenarios such as handwritten word spotting in images of historical documents or symbol spotting in architectural floor plans. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.097; 602.006; 603.053; 600.121 | Approved | no | ||
Call Number | RLF2017b | Serial | 2873 | ||
Permanent link to this record | |||||
Author | Lluis Gomez; Dimosthenis Karatzas | ||||
Title | TextProposals: a Text‐specific Selective Search Algorithm for Word Spotting in the Wild | Type | Journal Article | ||
Year | 2017 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 70 | Issue ![]() |
Pages | 60-74 | |
Keywords | |||||
Abstract | Motivated by the success of powerful while expensive techniques to recognize words in a holistic way (Goel et al., 2013; Almazán et al., 2014; Jaderberg et al., 2016) object proposals techniques emerge as an alternative to the traditional text detectors. In this paper we introduce a novel object proposals method that is specifically designed for text. We rely on a similarity based region grouping algorithm that generates a hierarchy of word hypotheses. Over the nodes of this hierarchy it is possible to apply a holistic word recognition method in an efficient way.
Our experiments demonstrate that the presented method is superior in its ability of producing good quality word proposals when compared with class-independent algorithms. We show impressive recall rates with a few thousand proposals in different standard benchmarks, including focused or incidental text datasets, and multi-language scenarios. Moreover, the combination of our object proposals with existing whole-word recognizers (Almazán et al., 2014; Jaderberg et al., 2016) shows competitive performance in end-to-end word spotting, and, in some benchmarks, outperforms previously published results. Concretely, in the challenging ICDAR2015 Incidental Text dataset, we overcome in more than 10% F-score the best-performing method in the last ICDAR Robust Reading Competition (Karatzas, 2015). Source code of the complete end-to-end system is available at https://github.com/lluisgomez/TextProposals. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.084; 601.197; 600.121; 600.129 | Approved | no | ||
Call Number | Admin @ si @ GoK2017 | Serial | 2886 | ||
Permanent link to this record | |||||
Author | Lluis Gomez; Anguelos Nicolaou; Dimosthenis Karatzas | ||||
Title | Improving patch‐based scene text script identification with ensembles of conjoined networks | Type | Journal Article | ||
Year | 2017 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 67 | Issue ![]() |
Pages | 85-96 | |
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.084; 600.121; 600.129 | Approved | no | ||
Call Number | Admin @ si @ GNK2017 | Serial | 2887 | ||
Permanent link to this record | |||||
Author | Miguel Oliveira; Victor Santos; Angel Sappa; P. Dias; A. Moreira | ||||
Title | Incremental texture mapping for autonomous driving | Type | Journal Article | ||
Year | 2016 | Publication | Robotics and Autonomous Systems | Abbreviated Journal | RAS |
Volume | 84 | Issue ![]() |
Pages | 113-128 | |
Keywords | Scene reconstruction; Autonomous driving; Texture mapping | ||||
Abstract | Autonomous vehicles have a large number of on-board sensors, not only for providing coverage all around the vehicle, but also to ensure multi-modality in the observation of the scene. Because of this, it is not trivial to come up with a single, unique representation that feeds from the data given by all these sensors. We propose an algorithm which is capable of mapping texture collected from vision based sensors onto a geometric description of the scenario constructed from data provided by 3D sensors. The algorithm uses a constrained Delaunay triangulation to produce a mesh which is updated using a specially devised sequence of operations. These enforce a partial configuration of the mesh that avoids bad quality textures and ensures that there are no gaps in the texture. Results show that this algorithm is capable of producing fine quality textures. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS; 600.086 | Approved | no | ||
Call Number | Admin @ si @ OSS2016b | Serial | 2912 | ||
Permanent link to this record | |||||
Author | Pau Rodriguez; Guillem Cucurull; Jordi Gonzalez; Josep M. Gonfaus; Kamal Nasrollahi; Thomas B. Moeslund; Xavier Roca | ||||
Title | Deep Pain: Exploiting Long Short-Term Memory Networks for Facial Expression Classification | Type | Journal Article | ||
Year | 2017 | Publication | IEEE Transactions on cybernetics | Abbreviated Journal | Cyber |
Volume | Issue ![]() |
Pages | 1-11 | ||
Keywords | |||||
Abstract | Pain is an unpleasant feeling that has been shown to be an important factor for the recovery of patients. Since this is costly in human resources and difficult to do objectively, there is the need for automatic systems to measure it. In this paper, contrary to current state-of-the-art techniques in pain assessment, which are based on facial features only, we suggest that the performance can be enhanced by feeding the raw frames to deep learning models, outperforming the latest state-of-the-art results while also directly facing the problem of imbalanced data. As a baseline, our approach first uses convolutional neural networks (CNNs) to learn facial features from VGG_Faces, which are then linked to a long short-term memory to exploit the temporal relation between video frames. We further compare the performances of using the so popular schema based on the canonically normalized appearance versus taking into account the whole image. As a result, we outperform current state-of-the-art area under the curve performance in the UNBC-McMaster Shoulder Pain Expression Archive Database. In addition, to evaluate the generalization properties of our proposed methodology on facial motion recognition, we also report competitive results in the Cohn Kanade+ facial expression database. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE; 600.119; 600.098 | Approved | no | ||
Call Number | Admin @ si @ RCG2017a | Serial | 2926 | ||
Permanent link to this record | |||||
Author | Anjan Dutta; Josep Llados; Horst Bunke; Umapada Pal | ||||
Title | Product graph-based higher order contextual similarities for inexact subgraph matching | Type | Journal Article | ||
Year | 2018 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 76 | Issue ![]() |
Pages | 596-611 | |
Keywords | |||||
Abstract | Many algorithms formulate graph matching as an optimization of an objective function of pairwise quantification of nodes and edges of two graphs to be matched. Pairwise measurements usually consider local attributes but disregard contextual information involved in graph structures. We address this issue by proposing contextual similarities between pairs of nodes. This is done by considering the tensor product graph (TPG) of two graphs to be matched, where each node is an ordered pair of nodes of the operand graphs. Contextual similarities between a pair of nodes are computed by accumulating weighted walks (normalized pairwise similarities) terminating at the corresponding paired node in TPG. Once the contextual similarities are obtained, we formulate subgraph matching as a node and edge selection problem in TPG. We use contextual similarities to construct an objective function and optimize it with a linear programming approach. Since random walk formulation through TPG takes into account higher order information, it is not a surprise that we obtain more reliable similarities and better discrimination among the nodes and edges. Experimental results shown on synthetic as well as real benchmarks illustrate that higher order contextual similarities increase discriminating power and allow one to find approximate solutions to the subgraph matching problem. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 602.167; 600.097; 600.121 | Approved | no | ||
Call Number | Admin @ si @ DLB2018 | Serial | 3083 | ||
Permanent link to this record | |||||
Author | David Vazquez; Jorge Bernal; F. Javier Sanchez; Gloria Fernandez Esparrach; Antonio Lopez; Adriana Romero; Michal Drozdzal; Aaron Courville | ||||
Title | A Benchmark for Endoluminal Scene Segmentation of Colonoscopy Images | Type | Journal Article | ||
Year | 2017 | Publication | Journal of Healthcare Engineering | Abbreviated Journal | JHCE |
Volume | Issue ![]() |
Pages | 2040-2295 | ||
Keywords | Colonoscopy images; Deep Learning; Semantic Segmentation | ||||
Abstract | Colorectal cancer (CRC) is the third cause of cancer death world-wide. Currently, the standard approach to reduce CRC-related mortality is to perform regular screening in search for polyps and colonoscopy is the screening tool of choice. The main limitations of this screening procedure are polyp miss- rate and inability to perform visual assessment of polyp malignancy. These drawbacks can be reduced by designing Decision Support Systems (DSS) aim- ing to help clinicians in the different stages of the procedure by providing endoluminal scene segmentation. Thus, in this paper, we introduce an extended benchmark of colonoscopy image segmentation, with the hope of establishing a new strong benchmark for colonoscopy image analysis research. The proposed dataset consists of 4 relevant classes to inspect the endolumninal scene, tar- geting different clinical needs. Together with the dataset and taking advantage of advances in semantic segmentation literature, we provide new baselines by training standard fully convolutional networks (FCN). We perform a compar- ative study to show that FCN significantly outperform, without any further post-processing, prior results in endoluminal scene segmentation, especially with respect to polyp segmentation and localization. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS; MV; 600.075; 600.085; 600.076; 601.281; 600.118 | Approved | no | ||
Call Number | VBS2017b | Serial | 2940 | ||
Permanent link to this record | |||||
Author | Marta Diez-Ferrer; Debora Gil; Elena Carreño; Susana Padrones; Samantha Aso; Vanesa Vicens; Noelia Cubero de Frutos; Rosa Lopez Lisbona; Carles Sanchez; Agnes Borras; Antoni Rosell | ||||
Title | Positive Airway Pressure-Enhanced CT to Improve Virtual Bronchoscopic Navigation | Type | Journal Article | ||
Year | 2017 | Publication | European Respiratory Journal | Abbreviated Journal | ERJ |
Volume | Issue ![]() |
Pages | |||
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | IAM | Approved | no | ||
Call Number | Admin @ si @ DGC2017b | Serial | 3632 | ||
Permanent link to this record | |||||
Author | Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades | ||||
Title | Sparse representation over learned dictionary for symbol recognition | Type | Journal Article | ||
Year | 2016 | Publication | Signal Processing | Abbreviated Journal | SP |
Volume | 125 | Issue ![]() |
Pages | 36-47 | |
Keywords | Symbol Recognition; Sparse Representation; Learned Dictionary; Shape Context; Interest Points | ||||
Abstract | In this paper we propose an original sparse vector model for symbol retrieval task. More specically, we apply the K-SVD algorithm for learning a visual dictionary based on symbol descriptors locally computed around interest points. Results on benchmark datasets show that the obtained sparse representation is competitive related to state-of-the-art methods. Moreover, our sparse representation is invariant to rotation and scale transforms and also robust to degraded images and distorted symbols. Thereby, the learned visual dictionary is able to represent instances of unseen classes of symbols. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.061; 600.077 | Approved | no | ||
Call Number | Admin @ si @ DTR2016 | Serial | 2946 | ||
Permanent link to this record | |||||
Author | Joan Serrat; Felipe Lumbreras; Francisco Blanco; Manuel Valiente; Montserrat Lopez-Mesas | ||||
Title | myStone: A system for automatic kidney stone classification | Type | Journal Article | ||
Year | 2017 | Publication | Expert Systems with Applications | Abbreviated Journal | ESA |
Volume | 89 | Issue ![]() |
Pages | 41-51 | |
Keywords | Kidney stone; Optical device; Computer vision; Image classification | ||||
Abstract | Kidney stone formation is a common disease and the incidence rate is constantly increasing worldwide. It has been shown that the classification of kidney stones can lead to an important reduction of the recurrence rate. The classification of kidney stones by human experts on the basis of certain visual color and texture features is one of the most employed techniques. However, the knowledge of how to analyze kidney stones is not widespread, and the experts learn only after being trained on a large number of samples of the different classes. In this paper we describe a new device specifically designed for capturing images of expelled kidney stones, and a method to learn and apply the experts knowledge with regard to their classification. We show that with off the shelf components, a carefully selected set of features and a state of the art classifier it is possible to automate this difficult task to a good degree. We report results on a collection of 454 kidney stones, achieving an overall accuracy of 63% for a set of eight classes covering almost all of the kidney stones taxonomy. Moreover, for more than 80% of samples the real class is the first or the second most probable class according to the system, being then the patient recommendations for the two top classes similar. This is the first attempt towards the automatic visual classification of kidney stones, and based on the current results we foresee better accuracies with the increase of the dataset size. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS; MSIAU; 603.046; 600.122; 600.118 | Approved | no | ||
Call Number | Admin @ si @ SLB2017 | Serial | 3026 | ||
Permanent link to this record | |||||
Author | Pau Rodriguez; Guillem Cucurull; Josep M. Gonfaus; Xavier Roca; Jordi Gonzalez | ||||
Title | Age and gender recognition in the wild with deep attention | Type | Journal Article | ||
Year | 2017 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 72 | Issue ![]() |
Pages | 563-571 | |
Keywords | Age recognition; Gender recognition; Deep neural networks; Attention mechanisms | ||||
Abstract | Face analysis in images in the wild still pose a challenge for automatic age and gender recognition tasks, mainly due to their high variability in resolution, deformation, and occlusion. Although the performance has highly increased thanks to Convolutional Neural Networks (CNNs), it is still far from optimal when compared to other image recognition tasks, mainly because of the high sensitiveness of CNNs to facial variations. In this paper, inspired by biology and the recent success of attention mechanisms on visual question answering and fine-grained recognition, we propose a novel feedforward attention mechanism that is able to discover the most informative and reliable parts of a given face for improving age and gender classification. In particular, given a downsampled facial image, the proposed model is trained based on a novel end-to-end learning framework to extract the most discriminative patches from the original high-resolution image. Experimental validation on the standard Adience, Images of Groups, and MORPH II benchmarks show that including attention mechanisms enhances the performance of CNNs in terms of robustness and accuracy. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE; 600.098; 602.133; 600.119 | Approved | no | ||
Call Number | Admin @ si @ RCG2017b | Serial | 2962 | ||
Permanent link to this record | |||||
Author | Parichehr Behjati Ardakani; Pau Rodriguez; Carles Fernandez; Armin Mehri; Xavier Roca; Seiichi Ozawa; Jordi Gonzalez | ||||
Title | Frequency-based Enhancement Network for Efficient Super-Resolution | Type | Journal Article | ||
Year | 2022 | Publication | IEEE Access | Abbreviated Journal | ACCESS |
Volume | 10 | Issue ![]() |
Pages | 57383-57397 | |
Keywords | Deep learning; Frequency-based methods; Lightweight architectures; Single image super-resolution | ||||
Abstract | Recently, deep convolutional neural networks (CNNs) have provided outstanding performance in single image super-resolution (SISR). Despite their remarkable performance, the lack of high-frequency information in the recovered images remains a core problem. Moreover, as the networks increase in depth and width, deep CNN-based SR methods are faced with the challenge of computational complexity in practice. A promising and under-explored solution is to adapt the amount of compute based on the different frequency bands of the input. To this end, we present a novel Frequency-based Enhancement Block (FEB) which explicitly enhances the information of high frequencies while forwarding low-frequencies to the output. In particular, this block efficiently decomposes features into low- and high-frequency and assigns more computation to high-frequency ones. Thus, it can help the network generate more discriminative representations by explicitly recovering finer details. Our FEB design is simple and generic and can be used as a direct replacement of commonly used SR blocks with no need to change network architectures. We experimentally show that when replacing SR blocks with FEB we consistently improve the reconstruction error, while reducing the number of parameters in the model. Moreover, we propose a lightweight SR model — Frequency-based Enhancement Network (FENet) — based on FEB that matches the performance of larger models. Extensive experiments demonstrate that our proposal performs favorably against the state-of-the-art SR algorithms in terms of visual quality, memory footprint, and inference time. The code is available at https://github.com/pbehjatii/FENet | ||||
Address | 18 May 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | IEEE | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE | Approved | no | ||
Call Number | Admin @ si @ BRF2022a | Serial | 3747 | ||
Permanent link to this record |