toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author (up) Parichehr Behjati Ardakani; Pau Rodriguez; Carles Fernandez; Armin Mehri; Xavier Roca; Seiichi Ozawa; Jordi Gonzalez edit  doi
openurl 
  Title Frequency-based Enhancement Network for Efficient Super-Resolution Type Journal Article
  Year 2022 Publication IEEE Access Abbreviated Journal ACCESS  
  Volume 10 Issue Pages 57383-57397  
  Keywords Deep learning; Frequency-based methods; Lightweight architectures; Single image super-resolution  
  Abstract Recently, deep convolutional neural networks (CNNs) have provided outstanding performance in single image super-resolution (SISR). Despite their remarkable performance, the lack of high-frequency information in the recovered images remains a core problem. Moreover, as the networks increase in depth and width, deep CNN-based SR methods are faced with the challenge of computational complexity in practice. A promising and under-explored solution is to adapt the amount of compute based on the different frequency bands of the input. To this end, we present a novel Frequency-based Enhancement Block (FEB) which explicitly enhances the information of high frequencies while forwarding low-frequencies to the output. In particular, this block efficiently decomposes features into low- and high-frequency and assigns more computation to high-frequency ones. Thus, it can help the network generate more discriminative representations by explicitly recovering finer details. Our FEB design is simple and generic and can be used as a direct replacement of commonly used SR blocks with no need to change network architectures. We experimentally show that when replacing SR blocks with FEB we consistently improve the reconstruction error, while reducing the number of parameters in the model. Moreover, we propose a lightweight SR model — Frequency-based Enhancement Network (FENet) — based on FEB that matches the performance of larger models. Extensive experiments demonstrate that our proposal performs favorably against the state-of-the-art SR algorithms in terms of visual quality, memory footprint, and inference time. The code is available at https://github.com/pbehjatii/FENet  
  Address 18 May 2022  
  Corporate Author Thesis  
  Publisher IEEE Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number Admin @ si @ BRF2022a Serial 3747  
Permanent link to this record
 

 
Author (up) Parichehr Behjati; Pau Rodriguez; Carles Fernandez; Isabelle Hupont; Armin Mehri; Jordi Gonzalez edit  url
openurl 
  Title Single image super-resolution based on directional variance attention network Type Journal Article
  Year 2023 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 133 Issue Pages 108997  
  Keywords  
  Abstract Recent advances in single image super-resolution (SISR) explore the power of deep convolutional neural networks (CNNs) to achieve better performance. However, most of the progress has been made by scaling CNN architectures, which usually raise computational demands and memory consumption. This makes modern architectures less applicable in practice. In addition, most CNN-based SR methods do not fully utilize the informative hierarchical features that are helpful for final image recovery. In order to address these issues, we propose a directional variance attention network (DiVANet), a computationally efficient yet accurate network for SISR. Specifically, we introduce a novel directional variance attention (DiVA) mechanism to capture long-range spatial dependencies and exploit inter-channel dependencies simultaneously for more discriminative representations. Furthermore, we propose a residual attention feature group (RAFG) for parallelizing attention and residual block computation. The output of each residual block is linearly fused at the RAFG output to provide access to the whole feature hierarchy. In parallel, DiVA extracts most relevant features from the network for improving the final output and preventing information loss along the successive operations inside the network. Experimental results demonstrate the superiority of DiVANet over the state of the art in several datasets, while maintaining relatively low computation and memory footprint. The code is available at https://github.com/pbehjatii/DiVANet.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number Admin @ si @ BPF2023 Serial 3861  
Permanent link to this record
 

 
Author (up) Pau Baiget; Carles Fernandez; Xavier Roca; Jordi Gonzalez edit  doi
openurl 
  Title Generation of Augmented Video Sequences Combining Behavioral Animation and Multi Object Tracking Type Journal Article
  Year 2009 Publication Computer Animation and Virtual Worlds Abbreviated Journal  
  Volume 20 Issue 4 Pages 473–489  
  Keywords  
  Abstract In this paper we present a novel approach to generate augmented video sequences in real-time, involving interactions between virtual and real agents in real scenarios. On the one hand, real agent motion is estimated by means of a multi-object tracking algorithm, which determines real objects' position over the scenario for each time step. On the other hand, virtual agents are provided with behavior models considering their interaction with the environment and with other agents. The resulting framework allows to generate video sequences involving behavior-based virtual agents that react to real agent behavior and has applications in education, simulation, and in the game and movie industries. We show the performance of the proposed approach in an indoor and outdoor scenario simulating human and vehicle agents. Copyright © 2009 John Wiley & Sons, Ltd.

We present a novel approach to generate augmented video sequences in real-time, involving interactions between virtual and real agents in real scenarios. On the one hand, real agent motion is estimated by means of a multi-object tracking algorithm, which determines real objects' position over the scenario for each time step. On the other hand, virtual agents are provided with behavior models considering their interaction with the environment and with other agents. © 2009 Wiley Periodicals, Inc.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number ISE @ ise @ BFR2009 Serial 1170  
Permanent link to this record
 

 
Author (up) Pau Rodriguez; Diego Velazquez; Guillem Cucurull; Josep M. Gonfaus; Xavier Roca; Jordi Gonzalez edit   pdf
doi  openurl
  Title Pay attention to the activations: a modular attention mechanism for fine-grained image recognition Type Journal Article
  Year 2020 Publication IEEE Transactions on Multimedia Abbreviated Journal TMM  
  Volume 22 Issue 2 Pages 502-514  
  Keywords  
  Abstract Fine-grained image recognition is central to many multimedia tasks such as search, retrieval, and captioning. Unfortunately, these tasks are still challenging since the appearance of samples of the same class can be more different than those from different classes. This issue is mainly due to changes in deformation, pose, and the presence of clutter. In the literature, attention has been one of the most successful strategies to handle the aforementioned problems. Attention has been typically implemented in neural networks by selecting the most informative regions of the image that improve classification. In contrast, in this paper, attention is not applied at the image level but to the convolutional feature activations. In essence, with our approach, the neural model learns to attend to lower-level feature activations without requiring part annotations and uses those activations to update and rectify the output likelihood distribution. The proposed mechanism is modular, architecture-independent, and efficient in terms of both parameters and computation required. Experiments demonstrate that well-known networks such as wide residual networks and ResNeXt, when augmented with our approach, systematically improve their classification accuracy and become more robust to changes in deformation and pose and to the presence of clutter. As a result, our proposal reaches state-of-the-art classification accuracies in CIFAR-10, the Adience gender recognition task, Stanford Dogs, and UEC-Food100 while obtaining competitive performance in ImageNet, CIFAR-100, CUB200 Birds, and Stanford Cars. In addition, we analyze the different components of our model, showing that the proposed attention modules succeed in finding the most discriminative regions of the image. Finally, as a proof of concept, we demonstrate that with only local predictions, an augmented neural network can successfully classify an image before reaching any fully connected layer, thus reducing the computational amount up to 10%.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE; 600.119; 600.098 Approved no  
  Call Number Admin @ si @ RVC2020a Serial 3417  
Permanent link to this record
 

 
Author (up) Pau Rodriguez; Diego Velazquez; Guillem Cucurull; Josep M. Gonfaus; Xavier Roca; Seiichi Ozawa; Jordi Gonzalez edit  url
doi  openurl
  Title Personality Trait Analysis in Social Networks Based on Weakly Supervised Learning of Shared Images Type Journal Article
  Year 2020 Publication Applied Sciences Abbreviated Journal APPLSCI  
  Volume 10 Issue 22 Pages 8170  
  Keywords sentiment analysis, personality trait analysis; weakly-supervised learning; visual classification; OCEAN model; social networks  
  Abstract Social networks have attracted the attention of psychologists, as the behavior of users can be used to assess personality traits, and to detect sentiments and critical mental situations such as depression or suicidal tendencies. Recently, the increasing amount of image uploads to social networks has shifted the focus from text to image-based personality assessment. However, obtaining the ground-truth requires giving personality questionnaires to the users, making the process very costly and slow, and hindering research on large populations. In this paper, we demonstrate that it is possible to predict which images are most associated with each personality trait of the OCEAN personality model, without requiring ground-truth personality labels. Namely, we present a weakly supervised framework which shows that the personality scores obtained using specific images textually associated with particular personality traits are highly correlated with scores obtained using standard text-based personality questionnaires. We trained an OCEAN trait model based on Convolutional Neural Networks (CNNs), learned from 120K pictures posted with specific textual hashtags, to infer whether the personality scores from the images uploaded by users are consistent with those scores obtained from text. In order to validate our claims, we performed a personality test on a heterogeneous group of 280 human subjects, showing that our model successfully predicts which kind of image will match a person with a given level of a trait. Looking at the results, we obtained evidence that personality is not only correlated with text, but with image content too. Interestingly, different visual patterns emerged from those images most liked by persons with a particular personality trait: for instance, pictures most associated with high conscientiousness usually contained healthy food, while low conscientiousness pictures contained injuries, guns, and alcohol. These findings could pave the way to complement text-based personality questionnaires with image-based questions.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE; 600.119 Approved no  
  Call Number Admin @ si @ RVC2020b Serial 3553  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: