toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author Sudeep Katakol; Basem Elbarashy; Luis Herranz; Joost Van de Weijer; Antonio Lopez edit   pdf
url  doi
openurl 
  Title Distributed Learning and Inference with Compressed Images Type Journal Article
  Year 2021 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume 30 Issue Pages 3069 - 3083  
  Keywords  
  Abstract Modern computer vision requires processing large amounts of data, both while training the model and/or during inference, once the model is deployed. Scenarios where images are captured and processed in physically separated locations are increasingly common (e.g. autonomous vehicles, cloud computing). In addition, many devices suffer from limited resources to store or transmit data (e.g. storage space, channel capacity). In these scenarios, lossy image compression plays a crucial role to effectively increase the number of images collected under such constraints. However, lossy compression entails some undesired degradation of the data that may harm the performance of the downstream analysis task at hand, since important semantic information may be lost in the process. Moreover, we may only have compressed images at training time but are able to use original images at inference time, or vice versa, and in such a case, the downstream model suffers from covariate shift. In this paper, we analyze this phenomenon, with a special focus on vision-based perception for autonomous driving as a paradigmatic scenario. We see that loss of semantic information and covariate shift do indeed exist, resulting in a drop in performance that depends on the compression rate. In order to address the problem, we propose dataset restoration, based on image restoration with generative adversarial networks (GANs). Our method is agnostic to both the particular image compression method and the downstream task; and has the advantage of not adding additional cost to the deployed models, which is particularly important in resource-limited devices. The presented experiments focus on semantic segmentation as a challenging use case, cover a broad range of compression rates and diverse datasets, and show how our method is able to significantly alleviate the negative effects of compression on the downstream visual task.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; ADAS; 600.120; 600.118;CIC Approved no  
  Call Number (up) Admin @ si @ KEH2021 Serial 3543  
Permanent link to this record
 

 
Author Fahad Shahbaz Khan; Muhammad Anwer Rao; Joost Van de Weijer; Andrew Bagdanov; Antonio Lopez; Michael Felsberg edit   pdf
doi  openurl
  Title Coloring Action Recognition in Still Images Type Journal Article
  Year 2013 Publication International Journal of Computer Vision Abbreviated Journal IJCV  
  Volume 105 Issue 3 Pages 205-221  
  Keywords  
  Abstract In this article we investigate the problem of human action recognition in static images. By action recognition we intend a class of problems which includes both action classification and action detection (i.e. simultaneous localization and classification). Bag-of-words image representations yield promising results for action classification, and deformable part models perform very well object detection. The representations for action recognition typically use only shape cues and ignore color information. Inspired by the recent success of color in image classification and object detection, we investigate the potential of color for action classification and detection in static images. We perform a comprehensive evaluation of color descriptors and fusion approaches for action recognition. Experiments were conducted on the three datasets most used for benchmarking action recognition in still images: Willow, PASCAL VOC 2010 and Stanford-40. Our experiments demonstrate that incorporating color information considerably improves recognition performance, and that a descriptor based on color names outperforms pure color descriptors. Our experiments demonstrate that late fusion of color and shape information outperforms other approaches on action recognition. Finally, we show that the different color–shape fusion approaches result in complementary information and combining them yields state-of-the-art performance for action classification.  
  Address  
  Corporate Author Thesis  
  Publisher Springer US Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 0920-5691 ISBN Medium  
  Area Expedition Conference  
  Notes CIC; ADAS; 600.057; 600.048 Approved no  
  Call Number (up) Admin @ si @ KRW2013 Serial 2285  
Permanent link to this record
 

 
Author Fahad Shahbaz Khan; Muhammad Anwer Rao; Joost Van de Weijer; Michael Felsberg; J.Laaksonen edit  doi
openurl 
  Title Compact color texture description for texture classification Type Journal Article
  Year 2015 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 51 Issue Pages 16-22  
  Keywords  
  Abstract Describing textures is a challenging problem in computer vision and pattern recognition. The classification problem involves assigning a category label to the texture class it belongs to. Several factors such as variations in scale, illumination and viewpoint make the problem of texture description extremely challenging. A variety of histogram based texture representations exists in literature.
However, combining multiple texture descriptors and assessing their complementarity is still an open research problem. In this paper, we first show that combining multiple local texture descriptors significantly improves the recognition performance compared to using a single best method alone. This
gain in performance is achieved at the cost of high-dimensional final image representation. To counter this problem, we propose to use an information-theoretic compression technique to obtain a compact texture description without any significant loss in accuracy. In addition, we perform a comprehensive
evaluation of pure color descriptors, popular in object recognition, for the problem of texture classification. Experiments are performed on four challenging texture datasets namely, KTH-TIPS-2a, KTH-TIPS-2b, FMD and Texture-10. The experiments clearly demonstrate that our proposed compact multi-texture approach outperforms the single best texture method alone. In all cases, discriminative color names outperforms other color features for texture classification. Finally, we show that combining discriminative color names with compact texture representation outperforms state-of-the-art methods by 7:8%, 4:3% and 5:0% on KTH-TIPS-2a, KTH-TIPS-2b and Texture-10 datasets respectively.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.068; 600.079;ADAS;CIC Approved no  
  Call Number (up) Admin @ si @ KRW2015a Serial 2587  
Permanent link to this record
 

 
Author Fahad Shahbaz Khan; Joost Van de Weijer; Muhammad Anwer Rao; Michael Felsberg; Carlo Gatta edit   pdf
doi  openurl
  Title Semantic Pyramids for Gender and Action Recognition Type Journal Article
  Year 2014 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume 23 Issue 8 Pages 3633-3645  
  Keywords  
  Abstract Person description is a challenging problem in computer vision. We investigated two major aspects of person description: 1) gender and 2) action recognition in still images. Most state-of-the-art approaches for gender and action recognition rely on the description of a single body part, such as face or full-body. However, relying on a single body part is suboptimal due to significant variations in scale, viewpoint, and pose in real-world images. This paper proposes a semantic pyramid approach for pose normalization. Our approach is fully automatic and based on combining information from full-body, upper-body, and face regions for gender and action recognition in still images. The proposed approach does not require any annotations for upper-body and face of a person. Instead, we rely on pretrained state-of-the-art upper-body and face detectors to automatically extract semantic information of a person. Given multiple bounding boxes from each body part detector, we then propose a simple method to select the best candidate bounding box, which is used for feature extraction. Finally, the extracted features from the full-body, upper-body, and face regions are combined into a single representation for classification. To validate the proposed approach for gender recognition, experiments are performed on three large data sets namely: 1) human attribute; 2) head-shoulder; and 3) proxemics. For action recognition, we perform experiments on four data sets most used for benchmarking action recognition in still images: 1) Sports; 2) Willow; 3) PASCAL VOC 2010; and 4) Stanford-40. Our experiments clearly demonstrate that the proposed approach, despite its simplicity, outperforms state-of-the-art methods for gender and action recognition.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1057-7149 ISBN Medium  
  Area Expedition Conference  
  Notes CIC; LAMP; 601.160; 600.074; 600.079;MILAB;ADAS Approved no  
  Call Number (up) Admin @ si @ KWR2014 Serial 2507  
Permanent link to this record
 

 
Author Fahad Shahbaz Khan; Joost Van de Weijer; Muhammad Anwer Rao; Andrew Bagdanov; Michael Felsberg; Jorma edit   pdf
url  openurl
  Title Scale coding bag of deep features for human attribute and action recognition Type Journal Article
  Year 2018 Publication Machine Vision and Applications Abbreviated Journal MVAP  
  Volume 29 Issue 1 Pages 55-71  
  Keywords Action recognition; Attribute recognition; Bag of deep features  
  Abstract Most approaches to human attribute and action recognition in still images are based on image representation in which multi-scale local features are pooled across scale into a single, scale-invariant encoding. Both in bag-of-words and the recently popular representations based on convolutional neural networks, local features are computed at multiple scales. However, these multi-scale convolutional features are pooled into a single scale-invariant representation. We argue that entirely scale-invariant image representations are sub-optimal and investigate approaches to scale coding within a bag of deep features framework. Our approach encodes multi-scale information explicitly during the image encoding stage. We propose two strategies to encode multi-scale information explicitly in the final image representation. We validate our two scale coding techniques on five datasets: Willow, PASCAL VOC 2010, PASCAL VOC 2012, Stanford-40 and Human Attributes (HAT-27). On all datasets, the proposed scale coding approaches outperform both the scale-invariant method and the standard deep features of the same network. Further, combining our scale coding approaches with standard deep features leads to consistent improvement over the state of the art.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.068; 600.079; 600.106; 600.120;CIC;ADAS Approved no  
  Call Number (up) Admin @ si @ KWR2018 Serial 3107  
Permanent link to this record
 

 
Author Fahad Shahbaz Khan; Jiaolong Xu; Muhammad Anwer Rao; Joost Van de Weijer; Andrew Bagdanov; Antonio Lopez edit  doi
openurl 
  Title Recognizing Actions through Action-specific Person Detection Type Journal Article
  Year 2015 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume 24 Issue 11 Pages 4422-4432  
  Keywords  
  Abstract Action recognition in still images is a challenging problem in computer vision. To facilitate comparative evaluation independently of person detection, the standard evaluation protocol for action recognition uses an oracle person detector to obtain perfect bounding box information at both training and test time. The assumption is that, in practice, a general person detector will provide candidate bounding boxes for action recognition. In this paper, we argue that this paradigm is suboptimal and that action class labels should already be considered during the detection stage. Motivated by the observation that body pose is strongly conditioned on action class, we show that: 1) the existing state-of-the-art generic person detectors are not adequate for proposing candidate bounding boxes for action classification; 2) due to limited training examples, the direct training of action-specific person detectors is also inadequate; and 3) using only a small number of labeled action examples, the transfer learning is able to adapt an existing detector to propose higher quality bounding boxes for subsequent action classification. To the best of our knowledge, we are the first to investigate transfer learning for the task of action-specific person detection in still images. We perform extensive experiments on two benchmark data sets: 1) Stanford-40 and 2) PASCAL VOC 2012. For the action detection task (i.e., both person localization and classification of the action performed), our approach outperforms methods based on general person detection by 5.7% mean average precision (MAP) on Stanford-40 and 2.1% MAP on PASCAL VOC 2012. Our approach also significantly outperforms the state of the art with a MAP of 45.4% on Stanford-40 and 31.4% on PASCAL VOC 2012. We also evaluate our action detection approach for the task of action classification (i.e., recognizing actions without localizing them). For this task, our approach, without using any ground-truth person localization at test tim- , outperforms on both data sets state-of-the-art methods, which do use person locations.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1057-7149 ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; LAMP; 600.076; 600.079;CIC Approved no  
  Call Number (up) Admin @ si @ KXR2015 Serial 2668  
Permanent link to this record
 

 
Author Joan Marc Llargues Asensio; Juan Peralta; Raul Arrabales; Manuel Gonzalez Bedia; Paulo Cortez; Antonio Lopez edit  doi
openurl 
  Title Artificial Intelligence Approaches for the Generation and Assessment of Believable Human-Like Behaviour in Virtual Characters Type Journal Article
  Year 2014 Publication Expert Systems With Applications Abbreviated Journal EXSY  
  Volume 41 Issue 16 Pages 7281–7290  
  Keywords Turing test; Human-like behaviour; Believability; Non-player characters; Cognitive architectures; Genetic algorithm; Artificial neural networks  
  Abstract Having artificial agents to autonomously produce human-like behaviour is one of the most ambitious original goals of Artificial Intelligence (AI) and remains an open problem nowadays. The imitation game originally proposed by Turing constitute a very effective method to prove the indistinguishability of an artificial agent. The behaviour of an agent is said to be indistinguishable from that of a human when observers (the so-called judges in the Turing test) cannot tell apart humans and non-human agents. Different environments, testing protocols, scopes and problem domains can be established to develop limited versions or variants of the original Turing test. In this paper we use a specific version of the Turing test, based on the international BotPrize competition, built in a First-Person Shooter video game, where both human players and non-player characters interact in complex virtual environments. Based on our past experience both in the BotPrize competition and other robotics and computer game AI applications we have developed three new more advanced controllers for believable agents: two based on a combination of the CERA–CRANIUM and SOAR cognitive architectures and other based on ADANN, a system for the automatic evolution and adaptation of artificial neural networks. These two new agents have been put to the test jointly with CCBot3, the winner of BotPrize 2010 competition (Arrabales et al., 2012), and have showed a significant improvement in the humanness ratio. Additionally, we have confronted all these bots to both First-person believability assessment (BotPrize original judging protocol) and Third-person believability assessment, demonstrating that the active involvement of the judge has a great impact in the recognition of human-like behaviour.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; 600.055; 600.057; 600.076 Approved no  
  Call Number (up) Admin @ si @ LPA2014 Serial 2500  
Permanent link to this record
 

 
Author Antonio Lopez; Gabriel Villalonga; Laura Sellart; German Ros; David Vazquez; Jiaolong Xu; Javier Marin; Azadeh S. Mozafari edit   pdf
url  openurl
  Title Training my car to see using virtual worlds Type Journal Article
  Year 2017 Publication Image and Vision Computing Abbreviated Journal IMAVIS  
  Volume 38 Issue Pages 102-118  
  Keywords  
  Abstract Computer vision technologies are at the core of different advanced driver assistance systems (ADAS) and will play a key role in oncoming autonomous vehicles too. One of the main challenges for such technologies is to perceive the driving environment, i.e. to detect and track relevant driving information in a reliable manner (e.g. pedestrians in the vehicle route, free space to drive through). Nowadays it is clear that machine learning techniques are essential for developing such a visual perception for driving. In particular, the standard working pipeline consists of collecting data (i.e. on-board images), manually annotating the data (e.g. drawing bounding boxes around pedestrians), learning a discriminative data representation taking advantage of such annotations (e.g. a deformable part-based model, a deep convolutional neural network), and then assessing the reliability of such representation with the acquired data. In the last two decades most of the research efforts focused on representation learning (first, designing descriptors and learning classifiers; later doing it end-to-end). Hence, collecting data and, especially, annotating it, is essential for learning good representations. While this has been the case from the very beginning, only after the disruptive appearance of deep convolutional neural networks that it became a serious issue due to their data hungry nature. In this context, the problem is that manual data annotation is a tiresome work prone to errors. Accordingly, in the late 00’s we initiated a research line consisting of training visual models using photo-realistic computer graphics, especially focusing on assisted and autonomous driving. In this paper, we summarize such a work and show how it has become a new tendency with increasing acceptance.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; 600.118 Approved no  
  Call Number (up) Admin @ si @ LVS2017 Serial 2985  
Permanent link to this record
 

 
Author Javier Marin; Sergio Escalera edit   pdf
url  openurl
  Title SSSGAN: Satellite Style and Structure Generative Adversarial Networks Type Journal Article
  Year 2021 Publication Remote Sensing Abbreviated Journal  
  Volume 13 Issue 19 Pages 3984  
  Keywords  
  Abstract This work presents Satellite Style and Structure Generative Adversarial Network (SSGAN), a generative model of high resolution satellite imagery to support image segmentation. Based on spatially adaptive denormalization modules (SPADE) that modulate the activations with respect to segmentation map structure, in addition to global descriptor vectors that capture the semantic information in a vector with respect to Open Street Maps (OSM) classes, this model is able to produce
consistent aerial imagery. By decoupling the generation of aerial images into a structure map and a carefully defined style vector, we were able to improve the realism and geodiversity of the synthesis with respect to the state-of-the-art baseline. Therefore, the proposed model allows us to control the generation not only with respect to the desired structure, but also with respect to a geographic area.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA; no proj;MILAB;ADAS Approved no  
  Call Number (up) Admin @ si @ MaE2021 Serial 3651  
Permanent link to this record
 

 
Author T. Mouats; N. Aouf; Angel Sappa; Cristhian A. Aguilera-Carrasco; Ricardo Toledo edit  doi
openurl 
  Title Multi-Spectral Stereo Odometry Type Journal Article
  Year 2015 Publication IEEE Transactions on Intelligent Transportation Systems Abbreviated Journal TITS  
  Volume 16 Issue 3 Pages 1210-1224  
  Keywords Egomotion estimation; feature matching; multispectral odometry (MO); optical flow; stereo odometry; thermal imagery  
  Abstract In this paper, we investigate the problem of visual odometry for ground vehicles based on the simultaneous utilization of multispectral cameras. It encompasses a stereo rig composed of an optical (visible) and thermal sensors. The novelty resides in the localization of the cameras as a stereo setup rather
than two monocular cameras of different spectrums. To the best of our knowledge, this is the first time such task is attempted. Log-Gabor wavelets at different orientations and scales are used to extract interest points from both images. These are then described using a combination of frequency and spatial information within the local neighborhood. Matches between the pairs of multimodal images are computed using the cosine similarity function based
on the descriptors. Pyramidal Lucas–Kanade tracker is also introduced to tackle temporal feature matching within challenging sequences of the data sets. The vehicle egomotion is computed from the triangulated 3-D points corresponding to the matched features. A windowed version of bundle adjustment incorporating
Gauss–Newton optimization is utilized for motion estimation. An outlier removal scheme is also included within the framework to deal with outliers. Multispectral data sets were generated and used as test bed. They correspond to real outdoor scenarios captured using our multimodal setup. Finally, detailed results validating the proposed strategy are illustrated.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1524-9050 ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; 600.055; 600.076 Approved no  
  Call Number (up) Admin @ si @ MAS2015a Serial 2533  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: