Home | [111–120] << 121 122 123 124 125 126 127 128 129 130 >> [131–140] |
![]() |
Records | |||||
---|---|---|---|---|---|
Author | Carolina Malagelada; F.De Lorio; Santiago Segui; S. Mendez; Michal Drozdzal; Jordi Vitria; Petia Radeva; J.Santos; Anna Accarino; Juan R. Malagelada; Fernando Azpiroz | ||||
Title ![]() |
Functional gut disorders or disordered gut function? Small bowel dysmotility evidenced by an original technique | Type | Journal Article | ||
Year | 2012 | Publication | Neurogastroenterology & Motility | Abbreviated Journal | NEUMOT |
Volume | 24 | Issue | 3 | Pages | 223-230 |
Keywords | capsule endoscopy;computer vision analysis;machine learning technique;small bowel motility | ||||
Abstract | JCR Impact Factor 2010: 3.349
Background This study aimed to determine the proportion of cases with abnormal intestinal motility among patients with functional bowel disorders. To this end, we applied an original method, previously developed in our laboratory, for analysis of endoluminal images obtained by capsule endoscopy. This novel technology is based on computer vision and machine learning techniques. Methods The endoscopic capsule (Pillcam SB1; Given Imaging, Yokneam, Israel) was administered to 80 patients with functional bowel disorders and 70 healthy subjects. Endoluminal image analysis was performed with a computer vision program developed for the evaluation of contractile events (luminal occlusions and radial wrinkles), non-contractile patterns (open tunnel and smooth wall patterns), type of content (secretions, chyme) and motion of wall and contents. Normality range and discrimination of abnormal cases were established by a machine learning technique. Specifically, an iterative classifier (one-class support vector machine) was applied in a random population of 50 healthy subjects as a training set and the remaining subjects (20 healthy subjects and 80 patients) as a test set. Key Results The classifier identified as abnormal 29% of patients with functional diseases of the bowel (23 of 80), and as normal 97% of healthy subjects (68 of 70) (P < 0.05 by chi-squared test). Patients identified as abnormal clustered in two groups, which exhibited either a hyper- or a hypodynamic motility pattern. The motor behavior was unrelated to clinical features. Conclusions & Inferences With appropriate methodology, abnormal intestinal motility can be demonstrated in a significant proportion of patients with functional bowel disorders, implying a pathologic disturbance of gut physiology. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Wiley Online Library | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; OR; MV | Approved | no | ||
Call Number | Admin @ si @ MLS2012 | Serial | 1830 | ||
Permanent link to this record | |||||
Author | Dena Bazazian | ||||
Title ![]() |
Fully Convolutional Networks for Text Understanding in Scene Images | Type | Book Whole | ||
Year | 2018 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Text understanding in scene images has gained plenty of attention in the computer vision community and it is an important task in many applications as text carries semantically rich information about scene content and context. For instance, reading text in a scene can be applied to autonomous driving, scene understanding or assisting visually impaired people. The general aim of scene text understanding is to localize and recognize text in scene images. Text regions are first localized in the original image by a trained detector model and afterwards fed into a recognition module. The tasks of localization and recognition are highly correlated since an inaccurate localization can affect the recognition task.
The main purpose of this thesis is to devise efficient methods for scene text understanding. We investigate how the latest results on deep learning can advance text understanding pipelines. Recently, Fully Convolutional Networks (FCNs) and derived methods have achieved a significant performance on semantic segmentation and pixel level classification tasks. Therefore, we took benefit of the strengths of FCN approaches in order to detect text in natural scenes. In this thesis we have focused on two challenging tasks of scene text understanding which are Text Detection and Word Spotting. For the task of text detection, we have proposed an efficient text proposal technique in scene images. We have considered the Text Proposals method as the baseline which is an approach to reduce the search space of possible text regions in an image. In order to improve the Text Proposals method we combined it with Fully Convolutional Networks to efficiently reduce the number of proposals while maintaining the same level of accuracy and thus gaining a significant speed up. Our experiments demonstrate that this text proposal approach yields significantly higher recall rates than the line based text localization techniques, while also producing better-quality localization. We have also applied this technique on compressed images such as videos from wearable egocentric cameras. For the task of word spotting, we have introduced a novel mid-level word representation method. We have proposed a technique to create and exploit an intermediate representation of images based on text attributes which roughly correspond to character probability maps. Our representation extends the concept of Pyramidal Histogram Of Characters (PHOC) by exploiting Fully Convolutional Networks to derive a pixel-wise mapping of the character distribution within candidate word regions. We call this representation the Soft-PHOC. Furthermore, we show how to use Soft-PHOC descriptors for word spotting tasks through an efficient text line proposal algorithm. To evaluate the detected text, we propose a novel line based evaluation along with the classic bounding box based approach. We test our method on incidental scene text images which comprises real-life scenarios such as urban scenes. The importance of incidental scene text images is due to the complexity of backgrounds, perspective, variety of script and language, short text and little linguistic context. All of these factors together makes the incidental scene text images challenging. |
||||
Address | November 2018 | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Dimosthenis Karatzas;Andrew Bagdanov | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-948531-1-1 | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG; 600.121 | Approved | no | ||
Call Number | Admin @ si @ Baz2018 | Serial | 3220 | ||
Permanent link to this record | |||||
Author | Utkarsh Porwal; Alicia Fornes; Faisal Shafait (eds) | ||||
Title ![]() |
Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition. 18th International Conference, ICFHR 2022 | Type | Book Whole | ||
Year | 2022 | Publication | Frontiers in Handwriting Recognition. | Abbreviated Journal | |
Volume | 13639 | Issue | Pages | ||
Keywords | |||||
Abstract | |||||
Address | ICFHR 2022, Hyderabad, India, December 4–7, 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Springer | Place of Publication | Editor | Utkarsh Porwal; Alicia Fornes; Faisal Shafait | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-3-031-21648-0 | Medium | ||
Area | Expedition | Conference | ICFHR | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ PFS2022 | Serial | 3809 | ||
Permanent link to this record | |||||
Author | A. Martinez; Jordi Vitria | ||||
Title ![]() |
From Visual Scanning to Object Recognition | Type | Conference Article | ||
Year | 1997 | Publication | (SNRFAI’97) 7th Spanish National Symposium on Pattern Recognition and Image Analysis. | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | Barcelona | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | OR;MV | Approved | no | ||
Call Number | BCNPCL @ bcnpcl @ MaV1997c | Serial | 58 | ||
Permanent link to this record | |||||
Author | A. Martinez; Jordi Vitria | ||||
Title ![]() |
From visual scanning to object recognition | Type | Report | ||
Year | 1996 | Publication | CVC Technical Report #17 | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | CVC (UAB) | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | OR;MV | Approved | no | ||
Call Number | BCNPCL @ bcnpcl @ MaV1996b | Serial | 528 | ||
Permanent link to this record | |||||
Author | Antonio Lopez; Jiaolong Xu; Jose Luis Gomez; David Vazquez; German Ros | ||||
Title ![]() |
From Virtual to Real World Visual Perception using Domain Adaptation -- The DPM as Example | Type | Book Chapter | ||
Year | 2017 | Publication | Domain Adaptation in Computer Vision Applications | Abbreviated Journal | |
Volume | Issue | 13 | Pages | 243-258 | |
Keywords | Domain Adaptation | ||||
Abstract | Supervised learning tends to produce more accurate classifiers than unsupervised learning in general. This implies that training data is preferred with annotations. When addressing visual perception challenges, such as localizing certain object classes within an image, the learning of the involved classifiers turns out to be a practical bottleneck. The reason is that, at least, we have to frame object examples with bounding boxes in thousands of images. A priori, the more complex the model is regarding its number of parameters, the more annotated examples are required. This annotation task is performed by human oracles, which ends up in inaccuracies and errors in the annotations (aka ground truth) since the task is inherently very cumbersome and sometimes ambiguous. As an alternative we have pioneered the use of virtual worlds for collecting such annotations automatically and with high precision. However, since the models learned with virtual data must operate in the real world, we still need to perform domain adaptation (DA). In this chapter we revisit the DA of a deformable part-based model (DPM) as an exemplifying case of virtual- to-real-world DA. As a use case, we address the challenge of vehicle detection for driver assistance, using different publicly available virtual-world data. While doing so, we investigate questions such as: how does the domain gap behave due to virtual-vs-real data with respect to dominant object appearance per domain, as well as the role of photo-realism in the virtual world. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer | Place of Publication | Editor | Gabriela Csurka | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS; 600.085; 601.223; 600.076; 600.118 | Approved | no | ||
Call Number | ADAS @ adas @ LXG2017 | Serial | 2872 | ||
Permanent link to this record | |||||
Author | Svebor Karaman; Giuseppe Lisanti; Andrew Bagdanov; Alberto del Bimbo | ||||
Title ![]() |
From re-identification to identity inference: Labeling consistency by local similarity constraints | Type | Book Chapter | ||
Year | 2014 | Publication | Person Re-Identification | Abbreviated Journal | |
Volume | 2 | Issue | Pages | 287-307 | |
Keywords | re-identification; Identity inference; Conditional random fields; Video surveillance | ||||
Abstract | In this chapter, we introduce the problem of identity inference as a generalization of person re-identification. It is most appropriate to distinguish identity inference from re-identification in situations where a large number of observations must be identified without knowing a priori that groups of test images represent the same individual. The standard single- and multishot person re-identification common in the literature are special cases of our formulation. We present an approach to solving identity inference by modeling it as a labeling problem in a Conditional Random Field (CRF). The CRF model ensures that the final labeling gives similar labels to detections that are similar in feature space. Experimental results are given on the ETHZ, i-LIDS and CAVIAR datasets. Our approach yields state-of-the-art performance for multishot re-identification, and our results on the more general identity inference problem demonstrate that we are able to infer the identity of very many examples even with very few labeled images in the gallery. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer London | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 2191-6586 | ISBN | 978-1-4471-6295-7 | Medium | |
Area | Expedition | Conference | |||
Notes | LAMP; 600.079 | Approved | no | ||
Call Number | Admin @ si @KLB2014b | Serial | 2521 | ||
Permanent link to this record | |||||
Author | Antonio Hernandez | ||||
Title ![]() |
From pixels to gestures: learning visual representations for human analysis in color and depth data sequences | Type | Book Whole | ||
Year | 2015 | Publication | PhD Thesis, Universitat de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | The visual analysis of humans from images is an important topic of interest due to its relevance to many computer vision applications like pedestrian detection, monitoring and surveillance, human-computer interaction, e-health or content-based image retrieval, among others.
In this dissertation we are interested in learning different visual representations of the human body that are helpful for the visual analysis of humans in images and video sequences. To that end, we analyze both RGB and depth image modalities and address the problem from three different research lines, at different levels of abstraction; from pixels to gestures: human segmentation, human pose estimation and gesture recognition. First, we show how binary segmentation (object vs. background) of the human body in image sequences is helpful to remove all the background clutter present in the scene. The presented method, based on Graph cuts optimization, enforces spatio-temporal consistency of the produced segmentation masks among consecutive frames. Secondly, we present a framework for multi-label segmentation for obtaining much more detailed segmentation masks: instead of just obtaining a binary representation separating the human body from the background, finer segmentation masks can be obtained separating the different body parts. At a higher level of abstraction, we aim for a simpler yet descriptive representation of the human body. Human pose estimation methods usually rely on skeletal models of the human body, formed by segments (or rectangles) that represent the body limbs, appropriately connected following the kinematic constraints of the human body. In practice, such skeletal models must fulfill some constraints in order to allow for efficient inference, while actually limiting the expressiveness of the model. In order to cope with this, we introduce a top-down approach for predicting the position of the body parts in the model, using a mid-level part representation based on Poselets. Finally, we propose a framework for gesture recognition based on the bag of visual words framework. We leverage the benefits of RGB and depth image modalities by combining modality-specific visual vocabularies in a late fusion fashion. A new rotation-variant depth descriptor is presented, yielding better results than other state-of-the-art descriptors. Moreover, spatio-temporal pyramids are used to encode rough spatial and temporal structure. In addition, we present a probabilistic reformulation of Dynamic Time Warping for gesture segmentation in video sequences. A Gaussian-based probabilistic model of a gesture is learnt, implicitly encoding possible deformations in both spatial and time domains. |
||||
Address | January 2015 | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Sergio Escalera;Stan Sclaroff | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-940902-0-2 | Medium | ||
Area | Expedition | Conference | |||
Notes | HuPBA;MILAB | Approved | no | ||
Call Number | Admin @ si @ Her2015 | Serial | 2576 | ||
Permanent link to this record | |||||
Author | Arnau Baro; Pau Riba; Jorge Calvo-Zaragoza; Alicia Fornes | ||||
Title ![]() |
From Optical Music Recognition to Handwritten Music Recognition: a Baseline | Type | Journal Article | ||
Year | 2019 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 123 | Issue | Pages | 1-8 | |
Keywords | |||||
Abstract | Optical Music Recognition (OMR) is the branch of document image analysis that aims to convert images of musical scores into a computer-readable format. Despite decades of research, the recognition of handwritten music scores, concretely the Western notation, is still an open problem, and the few existing works only focus on a specific stage of OMR. In this work, we propose a full Handwritten Music Recognition (HMR) system based on Convolutional Recurrent Neural Networks, data augmentation and transfer learning, that can serve as a baseline for the research community. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.097; 601.302; 601.330; 600.140; 600.121 | Approved | no | ||
Call Number | Admin @ si @ BRC2019 | Serial | 3275 | ||
Permanent link to this record | |||||
Author | Florin Popescu; Stephane Ayache; Sergio Escalera; Xavier Baro; Cecile Capponi; Patrick Panciatici; Isabelle Guyon | ||||
Title ![]() |
From geospatial observations of ocean currents to causal predictors of spatio-economic activity using computer vision and machine learning | Type | Conference Article | ||
Year | 2016 | Publication | European Geosciences Union General Assembly | Abbreviated Journal | |
Volume | 18 | Issue | Pages | ||
Keywords | |||||
Abstract | The big data transformation currently revolutionizing science and industry forges novel possibilities in multimodal analysis scarcely imaginable only a decade ago. One of the important economic and industrial problems that stand to benefit from the recent expansion of data availability and computational prowess is the prediction of electricity demand and renewable energy generation. Both are correlates of human activity: spatiotemporal energy consumption patterns in society are a factor of both demand (weather dependent) and supply, which determine cost – a relation expected to strengthen along with increasing renewable energy dependence. One of the main drivers of European weather patterns is the activity of the Atlantic Ocean and in particular its dominant Northern Hemisphere current: the Gulf Stream. We choose this particular current as a test case in part due to larger amount of relevant data and scientific literature available for refinement of analysis techniques.
This data richness is due not only to its economic importance but also to its size being clearly visible in radar and infrared satellite imagery, which makes it easier to detect using Computer Vision (CV). The power of CV techniques makes basic analysis thus developed scalable to other smaller and less known, but still influential, currents, which are not just curves on a map, but complex, evolving, moving branching trees in 3D projected onto a 2D image. We investigate means of extracting, from several image modalities (including recently available Copernicus radar and earlier Infrared satellites), a parameterized presentation of the state of the Gulf Stream and its environment that is useful as feature space representation in a machine learning context, in this case with the EC’s H2020-sponsored ‘See.4C’ project, in the context of which data scientists may find novel predictors of spatiotemporal energy flow. Although automated extractors of Gulf Stream position exist, they differ in methodology and result. We shall attempt to extract more complex feature representation including branching points, eddies and parameterized changes in transport and velocity. Other related predictive features will be similarly developed, such as inference of deep water flux long the current path and wider spatial scale features such as Hough transform, surface turbulence indicators and temperature gradient indexes along with multi-time scale analysis of ocean height and temperature dynamics. The geospatial imaging and ML community may therefore benefit from a baseline of open-source techniques useful and expandable to other related prediction and/or scientific analysis tasks. |
||||
Address | Vienna; Austria; April 2016 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | EGU | ||
Notes | HuPBA;MV; | Approved | no | ||
Call Number | Admin @ si @ PAE2016 | Serial | 2772 | ||
Permanent link to this record | |||||
Author | Adria Ruiz; Joost Van de Weijer; Xavier Binefa | ||||
Title ![]() |
From emotions to action units with hidden and semi-hidden-task learning | Type | Conference Article | ||
Year | 2015 | Publication | 16th IEEE International Conference on Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 3703-3711 | ||
Keywords | |||||
Abstract | Limited annotated training data is a challenging problem in Action Unit recognition. In this paper, we investigate how the use of large databases labelled according to the 6 universal facial expressions can increase the generalization ability of Action Unit classifiers. For this purpose, we propose a novel learning framework: Hidden-Task Learning. HTL aims to learn a set of Hidden-Tasks (Action Units)for which samples are not available but, in contrast, training data is easier to obtain from a set of related VisibleTasks (Facial Expressions). To that end, HTL is able to exploit prior knowledge about the relation between Hidden and Visible-Tasks. In our case, we base this prior knowledge on empirical psychological studies providing statistical correlations between Action Units and universal facial expressions. Additionally, we extend HTL to Semi-Hidden Task Learning (SHTL) assuming that Action Unit training samples are also provided. Performing exhaustive experiments over four different datasets, we show that HTL and SHTL improve the generalization ability of AU classifiers by training them with additional facial expression data. Additionally, we show that SHTL achieves competitive performance compared with state-of-the-art Transductive Learning approaches which face the problem of limited training data by using unlabelled test samples during training. | ||||
Address | Santiago de Chile; Chile; December 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICCV | ||
Notes | LAMP; 600.068; 600.079 | Approved | no | ||
Call Number | Admin @ si @ RWB2015 | Serial | 2671 | ||
Permanent link to this record | |||||
Author | Albert Clapes; Ozan Bilici; Dariia Temirova; Egils Avots; Gholamreza Anbarjafari; Sergio Escalera | ||||
Title ![]() |
From apparent to real age: gender, age, ethnic, makeup, and expression bias analysis in real age estimation | Type | Conference Article | ||
Year | 2018 | Publication | IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops | Abbreviated Journal | |
Volume | Issue | Pages | 2373-2382 | ||
Keywords | |||||
Abstract | |||||
Address | Salt Lake City; USA; June 2018 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | ||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ | Serial | 3116 | ||
Permanent link to this record | |||||
Author | Egils Avots; Meysam Madadi; Sergio Escalera; Jordi Gonzalez; Xavier Baro; Paul Pallin; Gholamreza Anbarjafari | ||||
Title ![]() |
From 2D to 3D geodesic-based garment matching | Type | Journal Article | ||
Year | 2019 | Publication | Multimedia Tools and Applications | Abbreviated Journal | MTAP |
Volume | 78 | Issue | 18 | Pages | 25829–25853 |
Keywords | Shape matching; Geodesic distance; Texture mapping; RGBD image processing; Gaussian mixture model | ||||
Abstract | A new approach for 2D to 3D garment retexturing is proposed based on Gaussian mixture models and thin plate splines (TPS). An automatically segmented garment of an individual is matched to a new source garment and rendered, resulting in augmented images in which the target garment has been retextured using the texture of the source garment. We divide the problem into garment boundary matching based on Gaussian mixture models and then interpolate inner points using surface topology extracted through geodesic paths, which leads to a more realistic result than standard approaches. We evaluated and compared our system quantitatively by root mean square error (RMS) and qualitatively using the mean opinion score (MOS), showing the benefits of the proposed methodology on our gathered dataset. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA; ISE; 600.098; 600.119; 602.133 | Approved | no | ||
Call Number | Admin @ si @ AME2019 | Serial | 3317 | ||
Permanent link to this record | |||||
Author | Parichehr Behjati Ardakani; Pau Rodriguez; Carles Fernandez; Armin Mehri; Xavier Roca; Seiichi Ozawa; Jordi Gonzalez | ||||
Title ![]() |
Frequency-based Enhancement Network for Efficient Super-Resolution | Type | Journal Article | ||
Year | 2022 | Publication | IEEE Access | Abbreviated Journal | ACCESS |
Volume | 10 | Issue | Pages | 57383-57397 | |
Keywords | Deep learning; Frequency-based methods; Lightweight architectures; Single image super-resolution | ||||
Abstract | Recently, deep convolutional neural networks (CNNs) have provided outstanding performance in single image super-resolution (SISR). Despite their remarkable performance, the lack of high-frequency information in the recovered images remains a core problem. Moreover, as the networks increase in depth and width, deep CNN-based SR methods are faced with the challenge of computational complexity in practice. A promising and under-explored solution is to adapt the amount of compute based on the different frequency bands of the input. To this end, we present a novel Frequency-based Enhancement Block (FEB) which explicitly enhances the information of high frequencies while forwarding low-frequencies to the output. In particular, this block efficiently decomposes features into low- and high-frequency and assigns more computation to high-frequency ones. Thus, it can help the network generate more discriminative representations by explicitly recovering finer details. Our FEB design is simple and generic and can be used as a direct replacement of commonly used SR blocks with no need to change network architectures. We experimentally show that when replacing SR blocks with FEB we consistently improve the reconstruction error, while reducing the number of parameters in the model. Moreover, we propose a lightweight SR model — Frequency-based Enhancement Network (FENet) — based on FEB that matches the performance of larger models. Extensive experiments demonstrate that our proposal performs favorably against the state-of-the-art SR algorithms in terms of visual quality, memory footprint, and inference time. The code is available at https://github.com/pbehjatii/FENet | ||||
Address | 18 May 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | IEEE | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE | Approved | no | ||
Call Number | Admin @ si @ BRF2022a | Serial | 3747 | ||
Permanent link to this record | |||||
Author | H.Martin Kjer; Jens Fagertuna; Sergio Vera; Debora Gil; Miguel Angel Gonzalez Ballester; Rasmus R. Paulsena | ||||
Title ![]() |
Free-form image registration of human cochlear uCT data using skeleton similarity as anatomical prior | Type | Journal Article | ||
Year | 2016 | Publication | Patter Recognition Letters | Abbreviated Journal | PRL |
Volume | 76 | Issue | 1 | Pages | 76-82 |
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | IAM; 600.060 | Approved | no | ||
Call Number | Admin @ si @ MFV2017b | Serial | 2941 | ||
Permanent link to this record |