toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links (up)
Author Hassan Ahmed Sial; S. Sancho; Ramon Baldrich; Robert Benavente; Maria Vanrell edit   pdf
url  openurl
  Title Color-based data augmentation for Reflectance Estimation Type Conference Article
  Year 2018 Publication 26th Color Imaging Conference Abbreviated Journal  
  Volume Issue Pages 284-289  
  Keywords  
  Abstract Deep convolutional architectures have shown to be successful frameworks to solve generic computer vision problems. The estimation of intrinsic reflectance from single image is not a solved problem yet. Encoder-Decoder architectures are a perfect approach for pixel-wise reflectance estimation, although it usually suffers from the lack of large datasets. Lack of data can be partially solved with data augmentation, however usual techniques focus on geometric changes which does not help for reflectance estimation. In this paper we propose a color-based data augmentation technique that extends the training data by increasing the variability of chromaticity. Rotation on the red-green blue-yellow plane of an opponent space enable to increase the training set in a coherent and sound way that improves network generalization capability for reflectance estimation. We perform some experiments on the Sintel dataset showing that our color-based augmentation increase performance and overcomes one of the state-of-the-art methods.  
  Address Vancouver; November 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CIC  
  Notes CIC Approved no  
  Call Number Admin @ si @ SSB2018a Serial 3129  
Permanent link to this record
 

 
Author Xavier Soria; Angel Sappa; Riad I. Hammoud edit   pdf
url  doi
openurl 
  Title Wide-Band Color Imagery Restoration for RGB-NIR Single Sensor Images Type Journal Article
  Year 2018 Publication Sensors Abbreviated Journal SENS  
  Volume 18 Issue 7 Pages 2059  
  Keywords RGB-NIR sensor; multispectral imaging; deep learning; CNNs  
  Abstract Multi-spectral RGB-NIR sensors have become ubiquitous in recent years. These sensors allow the visible and near-infrared spectral bands of a given scene to be captured at the same time. With such cameras, the acquired imagery has a compromised RGB color representation due to near-infrared bands (700–1100 nm) cross-talking with the visible bands (400–700 nm).
This paper proposes two deep learning-based architectures to recover the full RGB color images, thus removing the NIR information from the visible bands. The proposed approaches directly restore the high-resolution RGB image by means of convolutional neural networks. They are evaluated with several outdoor images; both architectures reach a similar performance when evaluated in different
scenarios and using different similarity metrics. Both of them improve the state of the art approaches.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; MSIAU; 600.086; 600.130; 600.122; 600.118 Approved no  
  Call Number Admin @ si @ SSH2018 Serial 3145  
Permanent link to this record
 

 
Author Dimosthenis Karatzas; Lluis Gomez; Marçal Rusiñol; Anguelos Nicolaou edit   pdf
url  openurl
  Title The Robust Reading Competition Annotation and Evaluation Platform Type Conference Article
  Year 2018 Publication 13th IAPR International Workshop on Document Analysis Systems Abbreviated Journal  
  Volume Issue Pages 61-66  
  Keywords  
  Abstract The ICDAR Robust Reading Competition (RRC), initiated in 2003 and reestablished in 2011, has become the defacto evaluation standard for the international community. Concurrent with its second incarnation in 2011, a continuous
effort started to develop an online framework to facilitate the hosting and management of competitions. This short paper briefly outlines the Robust Reading Competition Annotation and Evaluation Platform, the backbone of the
Robust Reading Competition, comprising a collection of tools and processes that aim to simplify the management and annotation of data, and to provide online and offline performance evaluation and analysis services.
 
  Address Viena; Austria; April 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference DAS  
  Notes DAG; 600.084; 600.121 Approved no  
  Call Number KGR2018 Serial 3103  
Permanent link to this record
 

 
Author Lluis Gomez; Marçal Rusiñol; Dimosthenis Karatzas edit   pdf
url  doi
openurl 
  Title Cutting Sayre's Knot: Reading Scene Text without Segmentation. Application to Utility Meters Type Conference Article
  Year 2018 Publication 13th IAPR International Workshop on Document Analysis Systems Abbreviated Journal  
  Volume Issue Pages 97-102  
  Keywords Robust Reading; End-to-end Systems; CNN; Utility Meters  
  Abstract In this paper we present a segmentation-free system for reading text in natural scenes. A CNN architecture is trained in an end-to-end manner, and is able to directly output readings without any explicit text localization step. In order to validate our proposal, we focus on the specific case of reading utility meters. We present our results in a large dataset of images acquired by different users and devices, so text appears in any location, with different sizes, fonts and lengths, and the images present several distortions such as
dirt, illumination highlights or blur.
 
  Address Viena; Austria; April 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference DAS  
  Notes DAG; 600.084; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ GRK2018 Serial 3102  
Permanent link to this record
 

 
Author Zhijie Fang; Antonio Lopez edit   pdf
url  doi
openurl 
  Title Is the Pedestrian going to Cross? Answering by 2D Pose Estimation Type Conference Article
  Year 2018 Publication IEEE Intelligent Vehicles Symposium Abbreviated Journal  
  Volume Issue Pages 1271 - 1276  
  Keywords  
  Abstract Our recent work suggests that, thanks to nowadays powerful CNNs, image-based 2D pose estimation is a promising cue for determining pedestrian intentions such as crossing the road in the path of the ego-vehicle, stopping before entering the road, and starting to walk or bending towards the road. This statement is based on the results obtained on non-naturalistic sequences (Daimler dataset), i.e. in sequences choreographed specifically for performing the study. Fortunately, a new publicly available dataset (JAAD) has appeared recently to allow developing methods for detecting pedestrian intentions in naturalistic driving conditions; more specifically, for addressing the relevant question is the pedestrian going to cross? Accordingly, in this paper we use JAAD to assess the usefulness of 2D pose estimation for answering such a question. We combine CNN-based pedestrian detection, tracking and pose estimation to predict the crossing action from monocular images. Overall, the proposed pipeline provides new state-ofthe-art results.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference IV  
  Notes ADAS; 600.124; 600.116; 600.118 Approved no  
  Call Number Admin @ si @ FaL2018 Serial 3181  
Permanent link to this record
 

 
Author Eduardo Aguilar; Beatriz Remeseiro; Marc Bolaños; Petia Radeva edit   pdf
url  doi
openurl 
  Title Grab, Pay, and Eat: Semantic Food Detection for Smart Restaurants Type Journal Article
  Year 2018 Publication IEEE Transactions on Multimedia Abbreviated Journal  
  Volume 20 Issue 12 Pages 3266 - 3275  
  Keywords  
  Abstract The increase in awareness of people towards their nutritional habits has drawn considerable attention to the field of automatic food analysis. Focusing on self-service restaurants environment, automatic food analysis is not only useful for extracting nutritional information from foods selected by customers, it is also of high interest to speed up the service solving the bottleneck produced at the cashiers in times of high demand. In this paper, we address the problem of automatic food tray analysis in canteens and restaurants environment, which consists in predicting multiple foods placed on a tray image. We propose a new approach for food analysis based on convolutional neural networks, we name Semantic Food Detection, which integrates in the same framework food localization, recognition and segmentation. We demonstrate that our method improves the state of the art food detection by a considerable margin on the public dataset UNIMIB2016 achieving about 90% in terms of F-measure, and thus provides a significant technological advance towards the automatic billing in restaurant environments.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; no proj Approved no  
  Call Number Admin @ si @ ARB2018 Serial 3236  
Permanent link to this record
 

 
Author Huamin Ren; Nattiya Kanhabua; Andreas Mogelmose; Weifeng Liu; Kaustubh Kulkarni; Sergio Escalera; Xavier Baro; Thomas B. Moeslund edit  url
doi  openurl
  Title Back-dropout Transfer Learning for Action Recognition Type Journal Article
  Year 2018 Publication IET Computer Vision Abbreviated Journal IETCV  
  Volume 12 Issue 4 Pages 484-491  
  Keywords Learning (artificial intelligence); Pattern Recognition  
  Abstract Transfer learning aims at adapting a model learned from source dataset to target dataset. It is a beneficial approach especially when annotating on the target dataset is expensive or infeasible. Transfer learning has demonstrated its powerful learning capabilities in various vision tasks. Despite transfer learning being a promising approach, it is still an open question how to adapt the model learned from the source dataset to the target dataset. One big challenge is to prevent the impact of category bias on classification performance. Dataset bias exists when two images from the same category, but from different datasets, are not classified as the same. To address this problem, a transfer learning algorithm has been proposed, called negative back-dropout transfer learning (NB-TL), which utilizes images that have been misclassified and further performs back-dropout strategy on them to penalize errors. Experimental results demonstrate the effectiveness of the proposed algorithm. In particular, the authors evaluate the performance of the proposed NB-TL algorithm on UCF 101 action recognition dataset, achieving 88.9% recognition rate.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA; no proj Approved no  
  Call Number Admin @ si @ RKM2018 Serial 3071  
Permanent link to this record
 

 
Author Carles Sanchez; Miguel Viñas; Coen Antens; Agnes Borras; Debora Gil edit   pdf
url  doi
openurl 
  Title Back to Front Architecture for Diagnosis as a Service Type Conference Article
  Year 2018 Publication 20th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing Abbreviated Journal  
  Volume Issue Pages 343-346  
  Keywords  
  Abstract Software as a Service (SaaS) is a cloud computing model in which a provider hosts applications in a server that customers use via internet. Since SaaS does not require to install applications on customers' own computers, it allows the use by multiple users of highly specialized software without extra expenses for hardware acquisition or licensing. A SaaS tailored for clinical needs not only would alleviate licensing costs, but also would facilitate easy access to new methods for diagnosis assistance. This paper presents a SaaS client-server architecture for Diagnosis as a Service (DaaS). The server is based on docker technology in order to allow execution of softwares implemented in different languages with the highest portability and scalability. The client is a content management system allowing the design of websites with multimedia content and interactive visualization of results allowing user editing. We explain a usage case that uses our DaaS as crowdsourcing platform in a multicentric pilot study carried out to evaluate the clinical benefits of a software for assessment of central airway obstruction.  
  Address Timisoara; Rumania; September 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference SYNASC  
  Notes IAM; 600.145 Approved no  
  Call Number Admin @ si @ SVA2018 Serial 3360  
Permanent link to this record
 

 
Author Domicele Jonauskaite; Nele Dael; C. Alejandro Parraga; Laetitia Chevre; Alejandro Garcia Sanchez; Christine Mohr edit   pdf
url  doi
openurl 
  Title Stripping #The Dress: The importance of contextual information on inter-individual differences in colour perception Type Journal Article
  Year 2018 Publication Psychological Research Abbreviated Journal PSYCHO R  
  Volume Issue Pages 1-15  
  Keywords  
  Abstract In 2015, a picture of a Dress (henceforth the Dress) triggered popular and scientific interest; some reported seeing the Dress in white and gold (W&G) and others in blue and black (B&B). We aimed to describe the phenomenon and investigate the role of contextualization. Few days after the Dress had appeared on the Internet, we projected it to 240 students on two large screens in the classroom. Participants reported seeing the Dress in B&B (48%), W&G (38%), or blue and brown (B&Br; 7%). Amongst numerous socio-demographic variables, we only observed that W&G viewers were most likely to have always seen the Dress as W&G. In the laboratory, we tested how much contextual information is necessary for the phenomenon to occur. Fifty-seven participants selected colours most precisely matching predominant colours of parts or the full Dress. We presented, in this order, small squares (a), vertical strips (b), and the full Dress (c). We found that (1) B&B, B&Br, and W&G viewers had selected colours differing in lightness and chroma levels for contextualized images only (b, c conditions) and hue for fully contextualized condition only (c) and (2) B&B viewers selected colours most closely matching displayed colours of the Dress. Thus, the Dress phenomenon emerges due to inter-individual differences in subjectively perceived lightness, chroma, and hue, at least when all aspects of the picture need to be integrated. Our results support the previous conclusions that contextual information is key to colour perception; it should be important to understand how this actually happens.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes NEUROBIT; no proj Approved no  
  Call Number Admin @ si @ JDP2018 Serial 3149  
Permanent link to this record
 

 
Author Alejandro Cartas; Juan Marin; Petia Radeva; Mariella Dimiccoli edit   pdf
url  openurl
  Title Batch-based activity recognition from egocentric photo-streams revisited Type Journal Article
  Year 2018 Publication Pattern Analysis and Applications Abbreviated Journal PAA  
  Volume 21 Issue 4 Pages 953–965  
  Keywords Egocentric vision; Lifelogging; Activity recognition; Deep learning; Recurrent neural networks  
  Abstract Wearable cameras can gather large amounts of image data that provide rich visual information about the daily activities of the wearer. Motivated by the large number of health applications that could be enabled by the automatic recognition of daily activities, such as lifestyle characterization for habit improvement, context-aware personal assistance and tele-rehabilitation services, we propose a system to classify 21 daily activities from photo-streams acquired by a wearable photo-camera. Our approach combines the advantages of a late fusion ensemble strategy relying on convolutional neural networks at image level with the ability of recurrent neural networks to account for the temporal evolution of high-level features in photo-streams without relying on event boundaries. The proposed batch-based approach achieved an overall accuracy of 89.85%, outperforming state-of-the-art end-to-end methodologies. These results were achieved on a dataset consists of 44,902 egocentric pictures from three persons captured during 26 days in average.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; no proj Approved no  
  Call Number Admin @ si @ CMR2018 Serial 3186  
Permanent link to this record
 

 
Author Sergio Escalera; Jordi Gonzalez; Hugo Jair Escalante; Xavier Baro; Isabelle Guyon edit  url
openurl 
  Title Looking at People Special Issue Type Journal Article
  Year 2018 Publication International Journal of Computer Vision Abbreviated Journal IJCV  
  Volume 126 Issue 2-4 Pages 141-143  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA; ISE; 600.119 Approved no  
  Call Number Admin @ si @ EGJ2018 Serial 3093  
Permanent link to this record
 

 
Author Fahad Shahbaz Khan; Joost Van de Weijer; Muhammad Anwer Rao; Andrew Bagdanov; Michael Felsberg; Jorma edit   pdf
url  openurl
  Title Scale coding bag of deep features for human attribute and action recognition Type Journal Article
  Year 2018 Publication Machine Vision and Applications Abbreviated Journal MVAP  
  Volume 29 Issue 1 Pages 55-71  
  Keywords Action recognition; Attribute recognition; Bag of deep features  
  Abstract Most approaches to human attribute and action recognition in still images are based on image representation in which multi-scale local features are pooled across scale into a single, scale-invariant encoding. Both in bag-of-words and the recently popular representations based on convolutional neural networks, local features are computed at multiple scales. However, these multi-scale convolutional features are pooled into a single scale-invariant representation. We argue that entirely scale-invariant image representations are sub-optimal and investigate approaches to scale coding within a bag of deep features framework. Our approach encodes multi-scale information explicitly during the image encoding stage. We propose two strategies to encode multi-scale information explicitly in the final image representation. We validate our two scale coding techniques on five datasets: Willow, PASCAL VOC 2010, PASCAL VOC 2012, Stanford-40 and Human Attributes (HAT-27). On all datasets, the proposed scale coding approaches outperform both the scale-invariant method and the standard deep features of the same network. Further, combining our scale coding approaches with standard deep features leads to consistent improvement over the state of the art.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.068; 600.079; 600.106; 600.120 Approved no  
  Call Number Admin @ si @ KWR2018 Serial 3107  
Permanent link to this record
 

 
Author Albert Clapes; Alex Pardo; Oriol Pujol; Sergio Escalera edit   pdf
url  openurl
  Title Action detection fusing multiple Kinects and a WIMU: an application to in-home assistive technology for the elderly Type Journal Article
  Year 2018 Publication Machine Vision and Applications Abbreviated Journal MVAP  
  Volume 29 Issue 5 Pages 765–788  
  Keywords Multimodal activity detection; Computer vision; Inertial sensors; Dense trajectories; Dynamic time warping; Assistive technology  
  Abstract We present a vision-inertial system which combines two RGB-Depth devices together with a wearable inertial movement unit in order to detect activities of the daily living. From multi-view videos, we extract dense trajectories enriched with a histogram of normals description computed from the depth cue and bag them into multi-view codebooks. During the later classification step a multi-class support vector machine with a RBF- 2 kernel combines the descriptions at kernel level. In order to perform action detection from the videos, a sliding window approach is utilized. On the other hand, we extract accelerations, rotation angles, and jerk features from the inertial data collected by the wearable placed on the user’s dominant wrist. During gesture spotting, a dynamic time warping is applied and the aligning costs to a set of pre-selected gesture sub-classes are thresholded to determine possible detections. The outputs of the two modules are combined in a late-fusion fashion. The system is validated in a real-case scenario with elderly from an elder home. Learning-based fusion results improve the ones from the single modalities, demonstrating the success of such multimodal approach.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA; no proj Approved no  
  Call Number Admin @ si @ CPP2018 Serial 3125  
Permanent link to this record
 

 
Author Laura Lopez-Fuentes; Joost Van de Weijer; Manuel Gonzalez-Hidalgo; Harald Skinnemoen; Andrew Bagdanov edit   pdf
url  openurl
  Title Review on computer vision techniques in emergency situations Type Journal Article
  Year 2018 Publication Multimedia Tools and Applications Abbreviated Journal MTAP  
  Volume 77 Issue 13 Pages 17069–17107  
  Keywords Emergency management; Computer vision; Decision makers; Situational awareness; Critical situation  
  Abstract In emergency situations, actions that save lives and limit the impact of hazards are crucial. In order to act, situational awareness is needed to decide what to do. Geolocalized photos and video of the situations as they evolve can be crucial in better understanding them and making decisions faster. Cameras are almost everywhere these days, either in terms of smartphones, installed CCTV cameras, UAVs or others. However, this poses challenges in big data and information overflow. Moreover, most of the time there are no disasters at any given location, so humans aiming to detect sudden situations may not be as alert as needed at any point in time. Consequently, computer vision tools can be an excellent decision support. The number of emergencies where computer vision tools has been considered or used is very wide, and there is a great overlap across related emergency research. Researchers tend to focus on state-of-the-art systems that cover the same emergency as they are studying, obviating important research in other fields. In order to unveil this overlap, the survey is divided along four main axes: the types of emergencies that have been studied in computer vision, the objective that the algorithms can address, the type of hardware needed and the algorithms used. Therefore, this review provides a broad overview of the progress of computer vision covering all sorts of emergencies.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.068; 600.120 Approved no  
  Call Number Admin @ si @ LWG2018 Serial 3041  
Permanent link to this record
 

 
Author Arash Akbarinia; C. Alejandro Parraga edit   pdf
url  openurl
  Title Feedback and Surround Modulated Boundary Detection Type Journal Article
  Year 2018 Publication International Journal of Computer Vision Abbreviated Journal IJCV  
  Volume 126 Issue 12 Pages 1367–1380  
  Keywords Boundary detection; Surround modulation; Biologically-inspired vision  
  Abstract Edges are key components of any visual scene to the extent that we can recognise objects merely by their silhouettes. The human visual system captures edge information through neurons in the visual cortex that are sensitive to both intensity discontinuities and particular orientations. The “classical approach” assumes that these cells are only responsive to the stimulus present within their receptive fields, however, recent studies demonstrate that surrounding regions and inter-areal feedback connections influence their responses significantly. In this work we propose a biologically-inspired edge detection model in which orientation selective neurons are represented through the first derivative of a Gaussian function resembling double-opponent cells in the primary visual cortex (V1). In our model we account for four kinds of receptive field surround, i.e. full, far, iso- and orthogonal-orientation, whose contributions are contrast-dependant. The output signal from V1 is pooled in its perpendicular direction by larger V2 neurons employing a contrast-variant centre-surround kernel. We further introduce a feedback connection from higher-level visual areas to the lower ones. The results of our model on three benchmark datasets show a big improvement compared to the current non-learning and biologically-inspired state-of-the-art algorithms while being competitive to the learning-based methods.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes NEUROBIT; 600.068; 600.072 Approved no  
  Call Number Admin @ si @ AkP2018b Serial 2991  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: