|   | 
Details
   web
Records
Author Eduardo Aguilar; Beatriz Remeseiro; Marc Bolaños; Petia Radeva
Title Grab, Pay, and Eat: Semantic Food Detection for Smart Restaurants Type Journal Article
Year 2018 Publication IEEE Transactions on Multimedia Abbreviated Journal
Volume 20 Issue 12 Pages (down) 3266 - 3275
Keywords
Abstract The increase in awareness of people towards their nutritional habits has drawn considerable attention to the field of automatic food analysis. Focusing on self-service restaurants environment, automatic food analysis is not only useful for extracting nutritional information from foods selected by customers, it is also of high interest to speed up the service solving the bottleneck produced at the cashiers in times of high demand. In this paper, we address the problem of automatic food tray analysis in canteens and restaurants environment, which consists in predicting multiple foods placed on a tray image. We propose a new approach for food analysis based on convolutional neural networks, we name Semantic Food Detection, which integrates in the same framework food localization, recognition and segmentation. We demonstrate that our method improves the state of the art food detection by a considerable margin on the public dataset UNIMIB2016 achieving about 90% in terms of F-measure, and thus provides a significant technological advance towards the automatic billing in restaurant environments.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB; no proj Approved no
Call Number Admin @ si @ ARB2018 Serial 3236
Permanent link to this record
 

 
Author Cristhian A. Aguilera-Carrasco; Cristhian Aguilera; Cristobal A. Navarro; Angel Sappa
Title Fast CNN Stereo Depth Estimation through Embedded GPU Devices Type Journal Article
Year 2020 Publication Sensors Abbreviated Journal SENS
Volume 20 Issue 11 Pages (down) 3249
Keywords stereo matching; deep learning; embedded GPU
Abstract Current CNN-based stereo depth estimation models can barely run under real-time constraints on embedded graphic processing unit (GPU) devices. Moreover, state-of-the-art evaluations usually do not consider model optimization techniques, being that it is unknown what is the current potential on embedded GPU devices. In this work, we evaluate two state-of-the-art models on three different embedded GPU devices, with and without optimization methods, presenting performance results that illustrate the actual capabilities of embedded GPU devices for stereo depth estimation. More importantly, based on our evaluation, we propose the use of a U-Net like architecture for postprocessing the cost-volume, instead of a typical sequence of 3D convolutions, drastically augmenting the runtime speed of current models. In our experiments, we achieve real-time inference speed, in the range of 5–32 ms, for 1216 × 368 input stereo images on the Jetson TX2, Jetson Xavier, and Jetson Nano embedded devices.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MSIAU; 600.122 Approved no
Call Number Admin @ si @ AAN2020 Serial 3428
Permanent link to this record
 

 
Author German Ros; Laura Sellart; Joanna Materzynska; David Vazquez; Antonio Lopez
Title The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes Type Conference Article
Year 2016 Publication 29th IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal
Volume Issue Pages (down) 3234-3243
Keywords Domain Adaptation; Autonomous Driving; Virtual Data; Semantic Segmentation
Abstract Vision-based semantic segmentation in urban scenarios is a key functionality for autonomous driving. The irruption of deep convolutional neural networks (DCNNs) allows to foresee obtaining reliable classifiers to perform such a visual task. However, DCNNs require to learn many parameters from raw images; thus, having a sufficient amount of diversified images with this class annotations is needed. These annotations are obtained by a human cumbersome labour specially challenging for semantic segmentation, since pixel-level annotations are required. In this paper, we propose to use a virtual world for automatically generating realistic synthetic images with pixel-level annotations. Then, we address the question of how useful can be such data for the task of semantic segmentation; in particular, when using a DCNN paradigm. In order to answer this question we have generated a synthetic diversified collection of urban images, named SynthCity, with automatically generated class annotations. We use SynthCity in combination with publicly available real-world urban images with manually provided annotations. Then, we conduct experiments on a DCNN setting that show how the inclusion of SynthCity in the training stage significantly improves the performance of the semantic segmentation task
Address Las Vegas; USA; June 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPR
Notes ADAS; 600.085; 600.082; 600.076 Approved no
Call Number ADAS @ adas @ RSM2016 Serial 2739
Permanent link to this record
 

 
Author Marco Buzzelli; Joost Van de Weijer; Raimondo Schettini
Title Learning Illuminant Estimation from Object Recognition Type Conference Article
Year 2018 Publication 25th International Conference on Image Processing Abbreviated Journal
Volume Issue Pages (down) 3234 - 3238
Keywords Illuminant estimation; computational color constancy; semi-supervised learning; deep learning; convolutional neural networks
Abstract In this paper we present a deep learning method to estimate the illuminant of an image. Our model is not trained with illuminant annotations, but with the objective of improving performance on an auxiliary task such as object recognition. To the best of our knowledge, this is the first example of a deep
learning architecture for illuminant estimation that is trained without ground truth illuminants. We evaluate our solution on standard datasets for color constancy, and compare it with state of the art methods. Our proposal is shown to outperform most deep learning methods in a cross-dataset evaluation
setup, and to present competitive results in a comparison with parametric solutions.
Address Athens; Greece; October 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICIP
Notes LAMP; 600.109; 600.120 Approved no
Call Number Admin @ si @ BWS2018 Serial 3157
Permanent link to this record
 

 
Author Emanuel Sanchez Aimar; Petia Radeva; Mariella Dimiccoli
Title Social Relation Recognition in Egocentric Photostreams Type Conference Article
Year 2019 Publication 26th International Conference on Image Processing Abbreviated Journal
Volume Issue Pages (down) 3227-3231
Keywords
Abstract This paper proposes an approach to automatically categorize the social interactions of a user wearing a photo-camera (2fpm), by relying solely on what the camera is seeing. The problem is challenging due to the overwhelming complexity of social life and the extreme intra-class variability of social interactions captured under unconstrained conditions. We adopt the formalization proposed in Bugental's social theory, that groups human relations into five social domains with related categories. Our method is a new deep learning architecture that exploits the hierarchical structure of the label space and relies on a set of social attributes estimated at frame level to provide a semantic representation of social interactions. Experimental results on the new EgoSocialRelation dataset demonstrate the effectiveness of our proposal.
Address Taipei; Taiwan; September 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICIP
Notes MILAB; no menciona Approved no
Call Number Admin @ si @ SRD2019 Serial 3370
Permanent link to this record
 

 
Author Vacit Oguz Yazici; Longlong Yu; Arnau Ramisa; Luis Herranz; Joost Van de Weijer
Title Main product detection with graph networks for fashion Type Journal Article
Year 2024 Publication Multimedia Tools and Applications Abbreviated Journal MTAP
Volume 83 Issue Pages (down) 3215–3231
Keywords
Abstract Computer vision has established a foothold in the online fashion retail industry. Main product detection is a crucial step of vision-based fashion product feed parsing pipelines, focused on identifying the bounding boxes that contain the product being sold in the gallery of images of the product page. The current state-of-the-art approach does not leverage the relations between regions in the image, and treats images of the same product independently, therefore not fully exploiting visual and product contextual information. In this paper, we propose a model that incorporates Graph Convolutional Networks (GCN) that jointly represent all detected bounding boxes in the gallery as nodes. We show that the proposed method is better than the state-of-the-art, especially, when we consider the scenario where title-input is missing at inference time and for cross-dataset evaluation, our method outperforms previous approaches by a large margin.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; MACO; 600.147; 600.167; 600.164; 600.161; 600.141; 601.309 Approved no
Call Number Admin @ si @ YYR2024 Serial 4017
Permanent link to this record
 

 
Author Dipam Goswami; J Schuster; Joost Van de Weijer; Didier Stricker
Title Attribution-aware Weight Transfer: A Warm-Start Initialization for Class-Incremental Semantic Segmentation Type Conference Article
Year 2023 Publication Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Abbreviated Journal
Volume Issue Pages (down) 3195-3204
Keywords
Abstract Attribution-aware Weight Transfer: A Warm-Start Initialization for Class-Incremental Semantic Segmentation. D Goswami, R Schuster, J van de Weijer, D Stricker. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023, pp. 3195-3204
Address Waikoloa; Hawai; USA; January 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference WACV
Notes LAMP Approved no
Call Number Admin @ si @ GSW2023 Serial 3901
Permanent link to this record
 

 
Author Saad Minhas; Zeba Khanam; Shoaib Ehsan; Klaus McDonald Maier; Aura Hernandez-Sabate
Title Weather Classification by Utilizing Synthetic Data Type Journal Article
Year 2022 Publication Sensors Abbreviated Journal SENS
Volume 22 Issue 9 Pages (down) 3193
Keywords Weather classification; synthetic data; dataset; autonomous car; computer vision; advanced driver assistance systems; deep learning; intelligent transportation systems
Abstract Weather prediction from real-world images can be termed a complex task when targeting classification using neural networks. Moreover, the number of images throughout the available datasets can contain a huge amount of variance when comparing locations with the weather those images are representing. In this article, the capabilities of a custom built driver simulator are explored specifically to simulate a wide range of weather conditions. Moreover, the performance of a new synthetic dataset generated by the above simulator is also assessed. The results indicate that the use of synthetic datasets in conjunction with real-world datasets can increase the training efficiency of the CNNs by as much as 74%. The article paves a way forward to tackle the persistent problem of bias in vision-based datasets.
Address 21 April 2022
Corporate Author Thesis
Publisher MDPI Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes IAM; 600.139; 600.159; 600.166; 600.145; Approved no
Call Number Admin @ si @ MKE2022 Serial 3761
Permanent link to this record
 

 
Author Jose L. Gomez; Gabriel Villalonga; Antonio Lopez
Title Co-Training for Deep Object Detection: Comparing Single-Modal and Multi-Modal Approaches Type Journal Article
Year 2021 Publication Sensors Abbreviated Journal SENS
Volume 21 Issue 9 Pages (down) 3185
Keywords co-training; multi-modality; vision-based object detection; ADAS; self-driving
Abstract Top-performing computer vision models are powered by convolutional neural networks (CNNs). Training an accurate CNN highly depends on both the raw sensor data and their associated ground truth (GT). Collecting such GT is usually done through human labeling, which is time-consuming and does not scale as we wish. This data-labeling bottleneck may be intensified due to domain shifts among image sensors, which could force per-sensor data labeling. In this paper, we focus on the use of co-training, a semi-supervised learning (SSL) method, for obtaining self-labeled object bounding boxes (BBs), i.e., the GT to train deep object detectors. In particular, we assess the goodness of multi-modal co-training by relying on two different views of an image, namely, appearance (RGB) and estimated depth (D). Moreover, we compare appearance-based single-modal co-training with multi-modal. Our results suggest that in a standard SSL setting (no domain shift, a few human-labeled data) and under virtual-to-real domain shift (many virtual-world labeled data, no human-labeled data) multi-modal co-training outperforms single-modal. In the latter case, by performing GAN-based domain translation both co-training modalities are on par, at least when using an off-the-shelf depth estimation model not specifically trained on the translated images.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.118 Approved no
Call Number Admin @ si @ GVL2021 Serial 3562
Permanent link to this record
 

 
Author Jorge Bernal; F. Javier Sanchez; Fernando Vilariño
Title Towards Automatic Polyp Detection with a Polyp Appearance Model Type Journal Article
Year 2012 Publication Pattern Recognition Abbreviated Journal PR
Volume 45 Issue 9 Pages (down) 3166-3182
Keywords Colonoscopy,PolypDetection,RegionSegmentation,SA-DOVA descriptot
Abstract This work aims at the automatic polyp detection by using a model of polyp appearance in the context of the analysis of colonoscopy videos. Our method consists of three stages: region segmentation, region description and region classification. The performance of our region segmentation method guarantees that if a polyp is present in the image, it will be exclusively and totally contained in a single region. The output of the algorithm also defines which regions can be considered as non-informative. We define as our region descriptor the novel Sector Accumulation-Depth of Valleys Accumulation (SA-DOVA), which provides a necessary but not sufficient condition for the polyp presence. Finally, we classify our segmented regions according to the maximal values of the SA-DOVA descriptor. Our preliminary classification results are promising, especially when classifying those parts of the image that do not contain a polyp inside.
Address
Corporate Author Thesis
Publisher Elsevier Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0031-3203 ISBN Medium
Area 800 Expedition Conference IbPRIA
Notes MV;SIAI Approved no
Call Number Admin @ si @ BSV2012; IAM @ iam Serial 1997
Permanent link to this record
 

 
Author Carlo Gatta; Petia Radeva
Title Bilateral Enhancers Type Conference Article
Year 2009 Publication 16th IEEE International Conference on Image Processing Abbreviated Journal
Volume Issue Pages (down) 3161-3165
Keywords
Abstract Ten years ago the concept of bilateral filtering (BF) became popular in the image processing community. The core of the idea is to blend the effect of a spatial filter, as e.g. the Gaussian filter, with the effect of a filter that acts on image values. The two filters acts on orthogonal domains of a picture: the 2D lattice of the image support and the intensity (or color) domain. The BF approach is an intuitive way to blend these two filters giving rise to algorithms that perform difficult tasks requiring a relatively simple design. In this paper we extend the concept of BF, proposing the bilateral enhancers (BE). We show how to design proper functions to obtain an edge-preserving smoothing and a selective sharpening. Moreover, we show that the proposed algorithm can perform edge-preserving smoothing and selective sharpening simultaneously in a single filtering.
Address Cairo, Egypt
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1522-4880 ISBN 978-1-4244-5653-6 Medium
Area Expedition Conference ICIP
Notes MILAB Approved no
Call Number BCNPCL @ bcnpcl @ GaR2009b Serial 1243
Permanent link to this record
 

 
Author Cristina Cañero; Petia Radeva
Title Vesselness enhancement diffusion Type Journal Article
Year 2003 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 24 Issue 16 Pages (down) 3141–3151
Keywords
Abstract IF: 0.809
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB Approved no
Call Number BCNPCL @ bcnpcl @ CaR2003 Serial 371
Permanent link to this record
 

 
Author Naveen Onkarappa; Angel Sappa
Title Synthetic sequences and ground-truth flow field generation for algorithm validation Type Journal Article
Year 2015 Publication Multimedia Tools and Applications Abbreviated Journal MTAP
Volume 74 Issue 9 Pages (down) 3121-3135
Keywords Ground-truth optical flow; Synthetic sequence; Algorithm validation
Abstract Research in computer vision is advancing by the availability of good datasets that help to improve algorithms, validate results and obtain comparative analysis. The datasets can be real or synthetic. For some of the computer vision problems such as optical flow it is not possible to obtain ground-truth optical flow with high accuracy in natural outdoor real scenarios directly by any sensor, although it is possible to obtain ground-truth data of real scenarios in a laboratory setup with limited motion. In this difficult situation computer graphics offers a viable option for creating realistic virtual scenarios. In the current work we present a framework to design virtual scenes and generate sequences as well as ground-truth flow fields. Particularly, we generate a dataset containing sequences of driving scenarios. The sequences in the dataset vary in different speeds of the on-board vision system, different road textures, complex motion of vehicle and independent moving vehicles in the scene. This dataset enables analyzing and adaptation of existing optical flow methods, and leads to invention of new approaches particularly for driver assistance systems.
Address
Corporate Author Thesis
Publisher Springer US Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1380-7501 ISBN Medium
Area Expedition Conference
Notes ADAS; 600.055; 601.215; 600.076 Approved no
Call Number Admin @ si @ OnS2014b Serial 2472
Permanent link to this record
 

 
Author Lluis Gomez; Dimosthenis Karatzas
Title MSER-based Real-Time Text Detection and Tracking Type Conference Article
Year 2014 Publication 22nd International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages (down) 3110 - 3115
Keywords
Abstract We present a hybrid algorithm for detection and tracking of text in natural scenes that goes beyond the fulldetection approaches in terms of time performance optimization.
A state-of-the-art scene text detection module based on Maximally Stable Extremal Regions (MSER) is used to detect text asynchronously, while on a separate thread detected text objects are tracked by MSER propagation. The cooperation of these two modules yields real time video processing at high frame rates even on low-resource devices.
Address Stockholm; August 2014
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1051-4651 ISBN Medium
Area Expedition Conference ICPR
Notes DAG; 600.056; 601.158; 601.197; 600.077 Approved no
Call Number Admin @ si @ GoK2014a Serial 2492
Permanent link to this record
 

 
Author Zhengying Liu; Adrien Pavao; Zhen Xu; Sergio Escalera; Fabio Ferreira; Isabelle Guyon; Sirui Hong; Frank Hutter; Rongrong Ji; Julio C. S. Jacques Junior; Ge Li; Marius Lindauer; Zhipeng Luo; Meysam Madadi; Thomas Nierhoff; Kangning Niu; Chunguang Pan; Danny Stoll; Sebastien Treguer; Jin Wang; Peng Wang; Chenglin Wu; Youcheng Xiong; Arber Zela; Yang Zhang
Title Winning Solutions and Post-Challenge Analyses of the ChaLearn AutoDL Challenge 2019 Type Journal Article
Year 2021 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI
Volume 43 Issue 9 Pages (down) 3108 - 3125
Keywords
Abstract This paper reports the results and post-challenge analyses of ChaLearn's AutoDL challenge series, which helped sorting out a profusion of AutoML solutions for Deep Learning (DL) that had been introduced in a variety of settings, but lacked fair comparisons. All input data modalities (time series, images, videos, text, tabular) were formatted as tensors and all tasks were multi-label classification problems. Code submissions were executed on hidden tasks, with limited time and computational resources, pushing solutions that get results quickly. In this setting, DL methods dominated, though popular Neural Architecture Search (NAS) was impractical. Solutions relied on fine-tuned pre-trained networks, with architectures matching data modality. Post-challenge tests did not reveal improvements beyond the imposed time limit. While no component is particularly original or novel, a high level modular organization emerged featuring a “meta-learner”, “data ingestor”, “model selector”, “model/learner”, and “evaluator”. This modularity enabled ablation studies, which revealed the importance of (off-platform) meta-learning, ensembling, and efficient data management. Experiments on heterogeneous module combinations further confirm the (local) optimality of the winning solutions. Our challenge legacy includes an ever-lasting benchmark (http://autodl.chalearn.org), the open-sourced code of the winners, and a free “AutoDL self-service.”
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA; no proj Approved no
Call Number Admin @ si @ LPX2021 Serial 3587
Permanent link to this record