toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author Eduardo Aguilar; Beatriz Remeseiro; Marc Bolaños; Petia Radeva edit   pdf
url  doi
openurl 
  Title Grab, Pay, and Eat: Semantic Food Detection for Smart Restaurants Type Journal Article
  Year 2018 Publication IEEE Transactions on Multimedia Abbreviated Journal  
  Volume 20 Issue 12 Pages (down) 3266 - 3275  
  Keywords  
  Abstract The increase in awareness of people towards their nutritional habits has drawn considerable attention to the field of automatic food analysis. Focusing on self-service restaurants environment, automatic food analysis is not only useful for extracting nutritional information from foods selected by customers, it is also of high interest to speed up the service solving the bottleneck produced at the cashiers in times of high demand. In this paper, we address the problem of automatic food tray analysis in canteens and restaurants environment, which consists in predicting multiple foods placed on a tray image. We propose a new approach for food analysis based on convolutional neural networks, we name Semantic Food Detection, which integrates in the same framework food localization, recognition and segmentation. We demonstrate that our method improves the state of the art food detection by a considerable margin on the public dataset UNIMIB2016 achieving about 90% in terms of F-measure, and thus provides a significant technological advance towards the automatic billing in restaurant environments.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; no proj Approved no  
  Call Number Admin @ si @ ARB2018 Serial 3236  
Permanent link to this record
 

 
Author Cristhian A. Aguilera-Carrasco; Cristhian Aguilera; Cristobal A. Navarro; Angel Sappa edit   pdf
url  doi
openurl 
  Title Fast CNN Stereo Depth Estimation through Embedded GPU Devices Type Journal Article
  Year 2020 Publication Sensors Abbreviated Journal SENS  
  Volume 20 Issue 11 Pages (down) 3249  
  Keywords stereo matching; deep learning; embedded GPU  
  Abstract Current CNN-based stereo depth estimation models can barely run under real-time constraints on embedded graphic processing unit (GPU) devices. Moreover, state-of-the-art evaluations usually do not consider model optimization techniques, being that it is unknown what is the current potential on embedded GPU devices. In this work, we evaluate two state-of-the-art models on three different embedded GPU devices, with and without optimization methods, presenting performance results that illustrate the actual capabilities of embedded GPU devices for stereo depth estimation. More importantly, based on our evaluation, we propose the use of a U-Net like architecture for postprocessing the cost-volume, instead of a typical sequence of 3D convolutions, drastically augmenting the runtime speed of current models. In our experiments, we achieve real-time inference speed, in the range of 5–32 ms, for 1216 × 368 input stereo images on the Jetson TX2, Jetson Xavier, and Jetson Nano embedded devices.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MSIAU; 600.122 Approved no  
  Call Number Admin @ si @ AAN2020 Serial 3428  
Permanent link to this record
 

 
Author German Ros; Laura Sellart; Joanna Materzynska; David Vazquez; Antonio Lopez edit   pdf
doi  openurl
  Title The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes Type Conference Article
  Year 2016 Publication 29th IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal  
  Volume Issue Pages (down) 3234-3243  
  Keywords Domain Adaptation; Autonomous Driving; Virtual Data; Semantic Segmentation  
  Abstract Vision-based semantic segmentation in urban scenarios is a key functionality for autonomous driving. The irruption of deep convolutional neural networks (DCNNs) allows to foresee obtaining reliable classifiers to perform such a visual task. However, DCNNs require to learn many parameters from raw images; thus, having a sufficient amount of diversified images with this class annotations is needed. These annotations are obtained by a human cumbersome labour specially challenging for semantic segmentation, since pixel-level annotations are required. In this paper, we propose to use a virtual world for automatically generating realistic synthetic images with pixel-level annotations. Then, we address the question of how useful can be such data for the task of semantic segmentation; in particular, when using a DCNN paradigm. In order to answer this question we have generated a synthetic diversified collection of urban images, named SynthCity, with automatically generated class annotations. We use SynthCity in combination with publicly available real-world urban images with manually provided annotations. Then, we conduct experiments on a DCNN setting that show how the inclusion of SynthCity in the training stage significantly improves the performance of the semantic segmentation task  
  Address Las Vegas; USA; June 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPR  
  Notes ADAS; 600.085; 600.082; 600.076 Approved no  
  Call Number ADAS @ adas @ RSM2016 Serial 2739  
Permanent link to this record
 

 
Author Marco Buzzelli; Joost Van de Weijer; Raimondo Schettini edit   pdf
doi  openurl
  Title Learning Illuminant Estimation from Object Recognition Type Conference Article
  Year 2018 Publication 25th International Conference on Image Processing Abbreviated Journal  
  Volume Issue Pages (down) 3234 - 3238  
  Keywords Illuminant estimation; computational color constancy; semi-supervised learning; deep learning; convolutional neural networks  
  Abstract In this paper we present a deep learning method to estimate the illuminant of an image. Our model is not trained with illuminant annotations, but with the objective of improving performance on an auxiliary task such as object recognition. To the best of our knowledge, this is the first example of a deep
learning architecture for illuminant estimation that is trained without ground truth illuminants. We evaluate our solution on standard datasets for color constancy, and compare it with state of the art methods. Our proposal is shown to outperform most deep learning methods in a cross-dataset evaluation
setup, and to present competitive results in a comparison with parametric solutions.
 
  Address Athens; Greece; October 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICIP  
  Notes LAMP; 600.109; 600.120 Approved no  
  Call Number Admin @ si @ BWS2018 Serial 3157  
Permanent link to this record
 

 
Author Emanuel Sanchez Aimar; Petia Radeva; Mariella Dimiccoli edit   pdf
url  doi
openurl 
  Title Social Relation Recognition in Egocentric Photostreams Type Conference Article
  Year 2019 Publication 26th International Conference on Image Processing Abbreviated Journal  
  Volume Issue Pages (down) 3227-3231  
  Keywords  
  Abstract This paper proposes an approach to automatically categorize the social interactions of a user wearing a photo-camera (2fpm), by relying solely on what the camera is seeing. The problem is challenging due to the overwhelming complexity of social life and the extreme intra-class variability of social interactions captured under unconstrained conditions. We adopt the formalization proposed in Bugental's social theory, that groups human relations into five social domains with related categories. Our method is a new deep learning architecture that exploits the hierarchical structure of the label space and relies on a set of social attributes estimated at frame level to provide a semantic representation of social interactions. Experimental results on the new EgoSocialRelation dataset demonstrate the effectiveness of our proposal.  
  Address Taipei; Taiwan; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICIP  
  Notes MILAB; no menciona Approved no  
  Call Number Admin @ si @ SRD2019 Serial 3370  
Permanent link to this record
 

 
Author Vacit Oguz Yazici; Longlong Yu; Arnau Ramisa; Luis Herranz; Joost Van de Weijer edit  url
openurl 
  Title Main product detection with graph networks for fashion Type Journal Article
  Year 2024 Publication Multimedia Tools and Applications Abbreviated Journal MTAP  
  Volume 83 Issue Pages (down) 3215–3231  
  Keywords  
  Abstract Computer vision has established a foothold in the online fashion retail industry. Main product detection is a crucial step of vision-based fashion product feed parsing pipelines, focused on identifying the bounding boxes that contain the product being sold in the gallery of images of the product page. The current state-of-the-art approach does not leverage the relations between regions in the image, and treats images of the same product independently, therefore not fully exploiting visual and product contextual information. In this paper, we propose a model that incorporates Graph Convolutional Networks (GCN) that jointly represent all detected bounding boxes in the gallery as nodes. We show that the proposed method is better than the state-of-the-art, especially, when we consider the scenario where title-input is missing at inference time and for cross-dataset evaluation, our method outperforms previous approaches by a large margin.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; MACO; 600.147; 600.167; 600.164; 600.161; 600.141; 601.309 Approved no  
  Call Number Admin @ si @ YYR2024 Serial 4017  
Permanent link to this record
 

 
Author Dipam Goswami; J Schuster; Joost Van de Weijer; Didier Stricker edit   pdf
url  openurl
  Title Attribution-aware Weight Transfer: A Warm-Start Initialization for Class-Incremental Semantic Segmentation Type Conference Article
  Year 2023 Publication Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages (down) 3195-3204  
  Keywords  
  Abstract Attribution-aware Weight Transfer: A Warm-Start Initialization for Class-Incremental Semantic Segmentation. D Goswami, R Schuster, J van de Weijer, D Stricker. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023, pp. 3195-3204  
  Address Waikoloa; Hawai; USA; January 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes LAMP Approved no  
  Call Number Admin @ si @ GSW2023 Serial 3901  
Permanent link to this record
 

 
Author Saad Minhas; Zeba Khanam; Shoaib Ehsan; Klaus McDonald Maier; Aura Hernandez-Sabate edit  doi
openurl 
  Title Weather Classification by Utilizing Synthetic Data Type Journal Article
  Year 2022 Publication Sensors Abbreviated Journal SENS  
  Volume 22 Issue 9 Pages (down) 3193  
  Keywords Weather classification; synthetic data; dataset; autonomous car; computer vision; advanced driver assistance systems; deep learning; intelligent transportation systems  
  Abstract Weather prediction from real-world images can be termed a complex task when targeting classification using neural networks. Moreover, the number of images throughout the available datasets can contain a huge amount of variance when comparing locations with the weather those images are representing. In this article, the capabilities of a custom built driver simulator are explored specifically to simulate a wide range of weather conditions. Moreover, the performance of a new synthetic dataset generated by the above simulator is also assessed. The results indicate that the use of synthetic datasets in conjunction with real-world datasets can increase the training efficiency of the CNNs by as much as 74%. The article paves a way forward to tackle the persistent problem of bias in vision-based datasets.  
  Address 21 April 2022  
  Corporate Author Thesis  
  Publisher MDPI Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes IAM; 600.139; 600.159; 600.166; 600.145; Approved no  
  Call Number Admin @ si @ MKE2022 Serial 3761  
Permanent link to this record
 

 
Author Jose Luis Gomez; Gabriel Villalonga; Antonio Lopez edit   pdf
url  openurl
  Title Co-Training for Deep Object Detection: Comparing Single-Modal and Multi-Modal Approaches Type Journal Article
  Year 2021 Publication Sensors Abbreviated Journal SENS  
  Volume 21 Issue 9 Pages (down) 3185  
  Keywords co-training; multi-modality; vision-based object detection; ADAS; self-driving  
  Abstract Top-performing computer vision models are powered by convolutional neural networks (CNNs). Training an accurate CNN highly depends on both the raw sensor data and their associated ground truth (GT). Collecting such GT is usually done through human labeling, which is time-consuming and does not scale as we wish. This data-labeling bottleneck may be intensified due to domain shifts among image sensors, which could force per-sensor data labeling. In this paper, we focus on the use of co-training, a semi-supervised learning (SSL) method, for obtaining self-labeled object bounding boxes (BBs), i.e., the GT to train deep object detectors. In particular, we assess the goodness of multi-modal co-training by relying on two different views of an image, namely, appearance (RGB) and estimated depth (D). Moreover, we compare appearance-based single-modal co-training with multi-modal. Our results suggest that in a standard SSL setting (no domain shift, a few human-labeled data) and under virtual-to-real domain shift (many virtual-world labeled data, no human-labeled data) multi-modal co-training outperforms single-modal. In the latter case, by performing GAN-based domain translation both co-training modalities are on par, at least when using an off-the-shelf depth estimation model not specifically trained on the translated images.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; 600.118 Approved no  
  Call Number Admin @ si @ GVL2021 Serial 3562  
Permanent link to this record
 

 
Author Jorge Bernal; F. Javier Sanchez; Fernando Vilariño edit   pdf
url  doi
openurl 
  Title Towards Automatic Polyp Detection with a Polyp Appearance Model Type Journal Article
  Year 2012 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 45 Issue 9 Pages (down) 3166-3182  
  Keywords Colonoscopy,PolypDetection,RegionSegmentation,SA-DOVA descriptot  
  Abstract This work aims at the automatic polyp detection by using a model of polyp appearance in the context of the analysis of colonoscopy videos. Our method consists of three stages: region segmentation, region description and region classification. The performance of our region segmentation method guarantees that if a polyp is present in the image, it will be exclusively and totally contained in a single region. The output of the algorithm also defines which regions can be considered as non-informative. We define as our region descriptor the novel Sector Accumulation-Depth of Valleys Accumulation (SA-DOVA), which provides a necessary but not sufficient condition for the polyp presence. Finally, we classify our segmented regions according to the maximal values of the SA-DOVA descriptor. Our preliminary classification results are promising, especially when classifying those parts of the image that do not contain a polyp inside.  
  Address  
  Corporate Author Thesis  
  Publisher Elsevier Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 0031-3203 ISBN Medium  
  Area 800 Expedition Conference IbPRIA  
  Notes MV;SIAI Approved no  
  Call Number Admin @ si @ BSV2012; IAM @ iam Serial 1997  
Permanent link to this record
 

 
Author Carlo Gatta; Petia Radeva edit  doi
isbn  openurl
  Title Bilateral Enhancers Type Conference Article
  Year 2009 Publication 16th IEEE International Conference on Image Processing Abbreviated Journal  
  Volume Issue Pages (down) 3161-3165  
  Keywords  
  Abstract Ten years ago the concept of bilateral filtering (BF) became popular in the image processing community. The core of the idea is to blend the effect of a spatial filter, as e.g. the Gaussian filter, with the effect of a filter that acts on image values. The two filters acts on orthogonal domains of a picture: the 2D lattice of the image support and the intensity (or color) domain. The BF approach is an intuitive way to blend these two filters giving rise to algorithms that perform difficult tasks requiring a relatively simple design. In this paper we extend the concept of BF, proposing the bilateral enhancers (BE). We show how to design proper functions to obtain an edge-preserving smoothing and a selective sharpening. Moreover, we show that the proposed algorithm can perform edge-preserving smoothing and selective sharpening simultaneously in a single filtering.  
  Address Cairo, Egypt  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1522-4880 ISBN 978-1-4244-5653-6 Medium  
  Area Expedition Conference ICIP  
  Notes MILAB Approved no  
  Call Number BCNPCL @ bcnpcl @ GaR2009b Serial 1243  
Permanent link to this record
 

 
Author Cristina Cañero; Petia Radeva edit  doi
openurl 
  Title Vesselness enhancement diffusion Type Journal Article
  Year 2003 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 24 Issue 16 Pages (down) 3141–3151  
  Keywords  
  Abstract IF: 0.809  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB Approved no  
  Call Number BCNPCL @ bcnpcl @ CaR2003 Serial 371  
Permanent link to this record
 

 
Author Naveen Onkarappa; Angel Sappa edit  doi
openurl 
  Title Synthetic sequences and ground-truth flow field generation for algorithm validation Type Journal Article
  Year 2015 Publication Multimedia Tools and Applications Abbreviated Journal MTAP  
  Volume 74 Issue 9 Pages (down) 3121-3135  
  Keywords Ground-truth optical flow; Synthetic sequence; Algorithm validation  
  Abstract Research in computer vision is advancing by the availability of good datasets that help to improve algorithms, validate results and obtain comparative analysis. The datasets can be real or synthetic. For some of the computer vision problems such as optical flow it is not possible to obtain ground-truth optical flow with high accuracy in natural outdoor real scenarios directly by any sensor, although it is possible to obtain ground-truth data of real scenarios in a laboratory setup with limited motion. In this difficult situation computer graphics offers a viable option for creating realistic virtual scenarios. In the current work we present a framework to design virtual scenes and generate sequences as well as ground-truth flow fields. Particularly, we generate a dataset containing sequences of driving scenarios. The sequences in the dataset vary in different speeds of the on-board vision system, different road textures, complex motion of vehicle and independent moving vehicles in the scene. This dataset enables analyzing and adaptation of existing optical flow methods, and leads to invention of new approaches particularly for driver assistance systems.  
  Address  
  Corporate Author Thesis  
  Publisher Springer US Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1380-7501 ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; 600.055; 601.215; 600.076 Approved no  
  Call Number Admin @ si @ OnS2014b Serial 2472  
Permanent link to this record
 

 
Author Lluis Gomez; Dimosthenis Karatzas edit   pdf
doi  openurl
  Title MSER-based Real-Time Text Detection and Tracking Type Conference Article
  Year 2014 Publication 22nd International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages (down) 3110 - 3115  
  Keywords  
  Abstract We present a hybrid algorithm for detection and tracking of text in natural scenes that goes beyond the fulldetection approaches in terms of time performance optimization.
A state-of-the-art scene text detection module based on Maximally Stable Extremal Regions (MSER) is used to detect text asynchronously, while on a separate thread detected text objects are tracked by MSER propagation. The cooperation of these two modules yields real time video processing at high frame rates even on low-resource devices.
 
  Address Stockholm; August 2014  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1051-4651 ISBN Medium  
  Area Expedition Conference ICPR  
  Notes DAG; 600.056; 601.158; 601.197; 600.077 Approved no  
  Call Number Admin @ si @ GoK2014a Serial 2492  
Permanent link to this record
 

 
Author Zhengying Liu; Adrien Pavao; Zhen Xu; Sergio Escalera; Fabio Ferreira; Isabelle Guyon; Sirui Hong; Frank Hutter; Rongrong Ji; Julio C. S. Jacques Junior; Ge Li; Marius Lindauer; Zhipeng Luo; Meysam Madadi; Thomas Nierhoff; Kangning Niu; Chunguang Pan; Danny Stoll; Sebastien Treguer; Jin Wang; Peng Wang; Chenglin Wu; Youcheng Xiong; Arber Zela; Yang Zhang edit  url
doi  openurl
  Title Winning Solutions and Post-Challenge Analyses of the ChaLearn AutoDL Challenge 2019 Type Journal Article
  Year 2021 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI  
  Volume 43 Issue 9 Pages (down) 3108 - 3125  
  Keywords  
  Abstract This paper reports the results and post-challenge analyses of ChaLearn's AutoDL challenge series, which helped sorting out a profusion of AutoML solutions for Deep Learning (DL) that had been introduced in a variety of settings, but lacked fair comparisons. All input data modalities (time series, images, videos, text, tabular) were formatted as tensors and all tasks were multi-label classification problems. Code submissions were executed on hidden tasks, with limited time and computational resources, pushing solutions that get results quickly. In this setting, DL methods dominated, though popular Neural Architecture Search (NAS) was impractical. Solutions relied on fine-tuned pre-trained networks, with architectures matching data modality. Post-challenge tests did not reveal improvements beyond the imposed time limit. While no component is particularly original or novel, a high level modular organization emerged featuring a “meta-learner”, “data ingestor”, “model selector”, “model/learner”, and “evaluator”. This modularity enabled ablation studies, which revealed the importance of (off-platform) meta-learning, ensembling, and efficient data management. Experiments on heterogeneous module combinations further confirm the (local) optimality of the winning solutions. Our challenge legacy includes an ever-lasting benchmark (http://autodl.chalearn.org), the open-sourced code of the winners, and a free “AutoDL self-service.”  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA; no proj Approved no  
  Call Number Admin @ si @ LPX2021 Serial 3587  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: