Author (up) Xavier Soria; Angel Sappa; Arash Akbarinia
Title Multispectral Single-Sensor RGB-NIR Imaging: New Challenges and Opportunities Type Conference Article
Year 2017 Publication 7th International Conference on Image Processing Theory, Tools & Applications Abbreviated Journal
Keywords Color restoration; Neural networks; Single-sensor cameras; Multispectral images; RGB-NIR dataset
Abstract Multispectral images captured with a single-sensor camera have become an attractive alternative for numerous computer vision applications. However, in order to fully exploit their potential, the color restoration problem (RGB representation) should be addressed. This problem is especially evident in outdoor scenarios containing vegetation, living beings, or specular materials. The color distortion arises from the overlap of the sensor's visible and near-infrared spectral sensitivities. This paper empirically evaluates the variability of the near-infrared (NIR) information with respect to the changes of light throughout the day. A tiny neural network is proposed to restore the RGB color representation from the given RGBN (Red, Green, Blue, NIR) images. In order to evaluate the proposed algorithm, different experiments are conducted on an RGBN outdoor dataset that includes various challenging cases. The obtained results show the challenge and the importance of addressing color restoration in single-sensor multispectral images.
Address Montreal; Canada; November 2017
Area Expedition Conference IPTA
Notes NEUROBIT; MSIAU; 600.122 Approved no
Call Number Admin @ si @ SSA2017 Serial 3074
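The color-restoration model described in the abstract above is only characterized as a tiny neural network mapping RGBN values to RGB. As a rough illustration, the following minimal PyTorch sketch assumes a small per-pixel MLP; the layer sizes, activation choices, and training loss are illustrative assumptions, not the paper's exact configuration.

# A minimal sketch of a "tiny" per-pixel color-restoration network, assuming an
# MLP that maps RGBN values to restored RGB values (illustrative sizes only).
import torch
import torch.nn as nn

class TinyRGBNToRGB(nn.Module):
    def __init__(self, hidden=16):
        super().__init__()
        # 4 inputs (R, G, B, NIR) -> small hidden layer -> 3 outputs (restored RGB)
        self.net = nn.Sequential(
            nn.Linear(4, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 3),
            nn.Sigmoid(),  # keep outputs in [0, 1]
        )

    def forward(self, rgbn):
        # rgbn: (N, 4) batch of pixel values normalized to [0, 1]
        return self.net(rgbn)

# Toy usage: restore a batch of pixels and compare against ground-truth RGB.
model = TinyRGBNToRGB()
rgbn = torch.rand(8, 4)          # hypothetical RGBN pixels
target_rgb = torch.rand(8, 3)    # hypothetical ground-truth RGB
loss = nn.functional.mse_loss(model(rgbn), target_rgb)
loss.backward()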
 

 
Author (up) Xialei Liu; Joost Van de Weijer; Andrew Bagdanov
Title RankIQA: Learning from Rankings for No-reference Image Quality Assessment Type Conference Article
Year 2017 Publication 17th IEEE International Conference on Computer Vision Abbreviated Journal
Abstract We propose a no-reference image quality assessment (NR-IQA) approach that learns from rankings (RankIQA). To address the problem of limited IQA dataset size, we train a Siamese network to rank images in terms of image quality using synthetically generated distortions for which relative image quality is known. These ranked image sets can be generated automatically, without laborious human labeling. We then use fine-tuning to transfer the knowledge represented in the trained Siamese network to a traditional CNN that estimates absolute image quality from single images. We demonstrate how our approach can be made significantly more efficient than traditional Siamese networks by forward-propagating a batch of images through a single network and backpropagating gradients derived from all pairs of images in the batch. Experiments on the TID2013 benchmark show that we improve the state of the art by over 5%. Furthermore, on the LIVE benchmark we show that our approach is superior to existing NR-IQA techniques and even outperforms state-of-the-art full-reference IQA (FR-IQA) methods, without having to resort to high-quality reference images to infer IQA.
Address Venice; Italy; October 2017
Area Expedition Conference ICCV
Notes LAMP; 600.106; 600.109; 600.120 Approved no
Call Number Admin @ si @ LWB2017b Serial 3036
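The efficiency trick described in the abstract above (one forward pass per batch, gradients derived from all pairs in the batch) can be illustrated with a minimal PyTorch sketch. The stand-in backbone, margin, and distortion labels below are assumptions for illustration; only the batch-wise pairwise ranking idea follows the abstract.

# A minimal sketch of batch-wise pairwise ranking: score a whole batch with one
# forward pass, then form a hinge ranking loss over all pairs whose relative
# quality is known from the synthetic distortion level.
import torch
import torch.nn as nn

backbone = nn.Sequential(  # stand-in for a CNN quality-score regressor
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1),
)

images = torch.rand(6, 3, 64, 64)                     # one batch of distorted images
distortion_level = torch.tensor([0, 1, 2, 0, 1, 2])   # higher level = lower quality

scores = backbone(images).squeeze(1)                  # (6,) one forward pass for the batch

# All ordered pairs (i, j) where image i should score higher than image j.
i, j = torch.meshgrid(torch.arange(6), torch.arange(6), indexing="ij")
mask = distortion_level[i] < distortion_level[j]

margin = 1.0
pair_losses = torch.clamp(margin - (scores[i] - scores[j]), min=0.0)
loss = pair_losses[mask].mean()
loss.backward()                                       # gradients flow from all pairs at once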
 

 
Author (up) Xinhang Song; Luis Herranz; Shuqiang Jiang
Title Depth CNNs for RGB-D Scene Recognition: Learning from Scratch Better than Transferring from RGB-CNNs Type Conference Article
Year 2017 Publication 31st AAAI Conference on Artificial Intelligence Abbreviated Journal
Keywords RGB-D scene recognition; weakly supervised; fine tune; CNN
Abstract Scene recognition with RGB images has been extensively studied and has reached remarkable recognition levels, thanks to convolutional neural networks (CNNs) and large scene datasets. In contrast, current RGB-D scene data is much more limited, so approaches often leverage large RGB datasets by transferring pretrained RGB CNN models and fine-tuning with the target RGB-D dataset. However, we show that this approach has the limitation of hardly reaching the bottom layers, which are key to learning modality-specific features. We instead focus on the bottom layers, and propose an alternative strategy to learn depth features that combines local weakly supervised training from patches with subsequent global fine-tuning with images. This strategy is capable of learning very discriminative depth-specific features from limited depth images, without resorting to Places-CNN. In addition, we propose a modified CNN architecture to further match the complexity of the model to the amount of data available. For RGB-D scene recognition, depth and RGB features are combined by projecting them into a common space and further learning a multilayer classifier, which is jointly optimized in an end-to-end network. Our framework achieves state-of-the-art accuracy on NYU2 and SUN RGB-D on both depth-only and combined RGB-D data.
Address San Francisco CA; February 2017
Area Expedition Conference AAAI
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ SHJ2017 Serial 2967
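The two-stage strategy described in the abstract above (weakly supervised patch training followed by global fine-tuning on images) can be sketched as below. The depth CNN, patch size, class count, and optimizer settings are illustrative assumptions, not the paper's actual architecture.

# A minimal sketch of the two-stage depth-CNN training idea: (1) weakly supervised
# training on patches that inherit the image label, then (2) fine-tuning the same
# network on whole depth images.
import torch
import torch.nn as nn

depth_cnn = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10),  # 10 hypothetical scene classes
)
criterion = nn.CrossEntropyLoss()
opt = torch.optim.SGD(depth_cnn.parameters(), lr=0.01)

# Stage 1: patches cropped from depth images, each labeled with its image's class.
patches = torch.rand(32, 1, 32, 32)
patch_labels = torch.randint(0, 10, (32,))
loss = criterion(depth_cnn(patches), patch_labels)
opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: global fine-tuning on full depth images with the same labels.
images = torch.rand(8, 1, 128, 128)
image_labels = torch.randint(0, 10, (8,))
loss = criterion(depth_cnn(images), image_labels)
opt.zero_grad(); loss.backward(); opt.step()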
 

 
Author (up) Xinhang Song; Shuqiang Jiang; Luis Herranz
Title Combining Models from Multiple Sources for RGB-D Scene Recognition Type Conference Article
Year 2017 Publication 26th International Joint Conference on Artificial Intelligence Abbreviated Journal
Volume Issue Pages 4523-4529
Keywords Robotics and Vision; Vision and Perception
Abstract Depth can complement RGB with useful cues about object volumes and scene layout. However, RGB-D image datasets are still too small for directly training deep convolutional neural networks (CNNs), in contrast to the massive monomodal RGB datasets. Previous works in RGB-D recognition typically combine two separate networks for RGB and depth data, pretrained with a large RGB dataset and then fine-tuned to the respective target RGB and depth datasets. These approaches have several limitations: 1) they only use low-level filters learned from RGB data and thus cannot properly exploit depth-specific patterns, and 2) RGB and depth features are only combined at high levels but rarely at lower levels. In this paper, we propose a framework that leverages both knowledge acquired from large RGB datasets and depth-specific cues learned from the limited depth data, obtaining more effective multi-source and multi-modal representations. We propose a multi-modal combination method that selects discriminative combinations of layers from the different source models and target modalities, capturing both high-level properties of the task and intrinsic low-level properties of both modalities.
Address Melbourne; Australia; August 2017
Area Expedition Conference IJCAI
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ SJH2017b Serial 2966
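The layer-combination idea described in the abstract above can be sketched as follows: activations from selected layers of separate RGB and depth networks are pooled, projected into a common space, concatenated, and fed to a jointly trained classifier. The chosen layers, dimensions, and class count are illustrative assumptions.

# A minimal sketch of combining features from selected layers of two modality-specific
# networks into a common space for joint classification.
import torch
import torch.nn as nn

def pooled(features):
    # Global-average-pool a (N, C, H, W) feature map to (N, C).
    return features.mean(dim=(2, 3))

# Hypothetical activations from two selected layers per modality.
rgb_low, rgb_high = torch.rand(4, 64, 28, 28), torch.rand(4, 256, 7, 7)
dep_low, dep_high = torch.rand(4, 64, 28, 28), torch.rand(4, 256, 7, 7)

proj = nn.ModuleDict({
    "rgb_low": nn.Linear(64, 128), "rgb_high": nn.Linear(256, 128),
    "dep_low": nn.Linear(64, 128), "dep_high": nn.Linear(256, 128),
})
classifier = nn.Sequential(nn.ReLU(), nn.Linear(4 * 128, 19))  # e.g. 19 scene classes

common = torch.cat([
    proj["rgb_low"](pooled(rgb_low)), proj["rgb_high"](pooled(rgb_high)),
    proj["dep_low"](pooled(dep_low)), proj["dep_high"](pooled(dep_high)),
], dim=1)
logits = classifier(common)  # trained end-to-end together with the projections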
 

 
Author (up) Xinhang Song; Shuqiang Jiang; Luis Herranz
Title Multi-Scale Multi-Feature Context Modeling for Scene Recognition in the Semantic Manifold Type Journal Article
Year 2017 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP
Volume 26 Issue 6 Pages 2721-2735
Abstract Before the big data era, scene recognition was often approached with two-step inference using localized intermediate representations (objects, topics, and so on). One such approach is the semantic manifold (SM), in which patches and images are modeled as points in a semantic probability simplex. Patch models are learned by resorting to weak supervision via image labels, which leads to the problem of scene categories co-occurring in this semantic space. Fortunately, each category has its own co-occurrence patterns that are consistent across the images in that category, so discovering and modeling these patterns is critical to improving recognition performance in this representation. Since the emergence of large datasets such as ImageNet and Places, these approaches have been relegated in favor of the much more powerful convolutional neural networks (CNNs), which can automatically learn multi-layered representations from the data. In this paper, we address many limitations of the original SM approach and related works. We propose discriminative patch representations using neural networks and further propose a hybrid architecture in which the semantic manifold is built on top of multiscale CNNs. Both representations can be computed significantly faster than the Gaussian mixture models of the original SM. To combine multiple scales, spatial relations, and multiple features, we formulate rich context models using Markov random fields. To solve the optimization problem, we analyze global and local approaches, where a top-down hierarchical algorithm has the best performance. Experimental results show that jointly exploiting different types of contextual relations consistently improves the recognition accuracy.
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ SJH2017a Serial 2963
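The semantic-manifold representation described in the abstract above can be sketched minimally: each patch is mapped to a probability vector over scene categories (a point on the probability simplex), and patch posteriors from several scales are aggregated into an image descriptor. The small patch network and aggregation by averaging are illustrative assumptions; the paper's Markov-random-field context models are omitted here.

# A minimal sketch of a semantic-manifold style representation: per-patch category
# posteriors (points on the simplex) averaged into an image-level descriptor.
import torch
import torch.nn as nn

num_classes = 15
patch_net = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, num_classes),
)

def semantic_descriptor(patches):
    # patches: (P, 3, H, W) -> per-patch points on the simplex -> image descriptor
    posteriors = torch.softmax(patch_net(patches), dim=1)   # (P, num_classes)
    return posteriors.mean(dim=0)                           # (num_classes,)

# Patches extracted at two scales from one image (sizes are illustrative).
desc = torch.stack([
    semantic_descriptor(torch.rand(24, 3, 32, 32)),   # fine scale
    semantic_descriptor(torch.rand(6, 3, 64, 64)),    # coarse scale
]).mean(dim=0)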
 

 
Author (up) Yagmur Gucluturk; Umut Guclu; Marc Perez; Hugo Jair Escalante; Xavier Baro; Isabelle Guyon; Carlos Andujar; Julio C. S. Jacques Junior; Meysam Madadi; Sergio Escalera
Title Visualizing Apparent Personality Analysis with Deep Residual Networks Type Conference Article
Year 2017 Publication Chalearn Workshop on Action, Gesture, and Emotion Recognition: Large Scale Multimodal Gesture Recognition and Real versus Fake expressed emotions at ICCV Abbreviated Journal
Volume Issue Pages 3101-3109
Abstract Automatic prediction of personality traits is a subjective task that has recently received much attention. Specifically, automatic apparent personality trait prediction from multimodal data has emerged as a hot topic within the field of computer vision and, more particularly, the so-called “looking at people” sub-field. Considering “apparent” personality traits, as opposed to real ones, considerably reduces the subjectivity of the task. Real-world applications are encountered in a wide range of domains, including entertainment, health, human-computer interaction, recruitment, and security. Predictive models of personality traits are useful for individuals in many scenarios (e.g., preparing for job interviews or for public speaking). However, these predictions in and of themselves might be deemed untrustworthy without human-understandable supportive evidence. Through a series of experiments on a recently released benchmark dataset for automatic apparent personality trait prediction, this paper characterizes the audio and visual information that is used by a state-of-the-art model while making its predictions, so as to provide such supportive evidence by explaining the predictions made. Additionally, the paper describes a new web application, which gives feedback on the apparent personality traits of its users by combining model predictions with their explanations.
Address Venice; Italy; October 2017
Area Expedition Conference ICCVW
Notes HUPBA; 6002.143 Approved no
Call Number Admin @ si @ GGP2017 Serial 3067
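The abstract above describes explaining a model's trait predictions by characterizing the information it relies on. One standard way to produce such supportive evidence is occlusion sensitivity, sketched below; this is an illustrative stand-in, not necessarily the authors' exact procedure, and the trait regressor here is a toy placeholder.

# A minimal occlusion-sensitivity sketch: measure how a predicted trait score drops
# when image regions are masked, highlighting the regions the model relies on.
import torch
import torch.nn as nn

model = nn.Sequential(  # stand-in for a residual-network trait regressor
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 5),  # five trait scores
)

def occlusion_map(image, trait_idx, patch=16):
    # image: (3, H, W); returns per-region drop in the predicted trait score.
    _, h, w = image.shape
    with torch.no_grad():
        base = model(image.unsqueeze(0))[0, trait_idx].item()
        heat = torch.zeros(h // patch, w // patch)
        for i in range(h // patch):
            for j in range(w // patch):
                occluded = image.clone()
                occluded[:, i*patch:(i+1)*patch, j*patch:(j+1)*patch] = 0.0
                heat[i, j] = base - model(occluded.unsqueeze(0))[0, trait_idx].item()
    return heat  # large values = regions the prediction depends on

heat = occlusion_map(torch.rand(3, 128, 128), trait_idx=0)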
 

 
Author (up) Zhijie Fang; David Vazquez; Antonio Lopez
Title On-Board Detection of Pedestrian Intentions Type Journal Article
Year 2017 Publication Sensors Abbreviated Journal SENS
Volume 17 Issue 10 Pages 2193
Keywords pedestrian intention; ADAS; self-driving
Abstract Avoiding vehicle-to-pedestrian crashes is a critical requirement for today's advanced driver assistance systems (ADAS) and future self-driving vehicles. Accordingly, detecting pedestrians from raw sensor data has a history of more than 15 years of research, with vision playing a central role. In recent years, deep learning has boosted the accuracy of image-based pedestrian detectors. However, detection is just the first step towards answering the core question, namely: is the vehicle going to crash with a pedestrian provided preventive actions are not taken? Therefore, knowing as soon as possible whether a detected pedestrian intends to cross the road ahead of the vehicle is essential for performing safe and comfortable maneuvers that prevent a crash. However, compared to pedestrian detection, there is relatively little literature on detecting pedestrian intentions. This paper aims to contribute along this line by presenting a new vision-based approach that analyzes the pose of a pedestrian over several frames to determine whether he or she is going to enter the road. We present experiments showing 750 ms of anticipation for pedestrians crossing the road, which at a typical urban driving speed of 50 km/h provides 15 additional meters (compared to a pure pedestrian detector) for automatic vehicle reactions or for warning the driver. Moreover, in contrast with state-of-the-art methods, our approach is monocular, requiring neither stereo nor optical flow information.
Notes ADAS; 600.085; 600.076; 601.223; 600.116; 600.118 Approved no
Call Number Admin @ si @ FVL2017 Serial 2983
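The pose-based intention detection described in the abstract above can be sketched minimally: 2D keypoints from a short window of frames are flattened and classified as crossing or not crossing. The keypoint count, window length, and classifier below are illustrative assumptions, not the paper's actual model.

# A minimal sketch of classifying crossing intention from a pedestrian's body pose
# over several frames.
import torch
import torch.nn as nn

num_frames, num_joints = 8, 18          # e.g. a short temporal window of 2D skeletons
intent_net = nn.Sequential(
    nn.Flatten(),
    nn.Linear(num_frames * num_joints * 2, 64), nn.ReLU(),
    nn.Linear(64, 2),                   # logits: {will cross, will not cross}
)

# Hypothetical pose track: (batch, frames, joints, xy), normalized to the detection box.
pose_sequence = torch.rand(1, num_frames, num_joints, 2)
logits = intent_net(pose_sequence)
crossing_prob = torch.softmax(logits, dim=1)[0, 0]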