Author Antonio Hernandez; Carlos Primo; Sergio Escalera
Title Automatic user interaction correction via Multi-label Graph cuts Type Conference Article
Year 2011 Publication In ICCV 2011 1st IEEE International Workshop on Human Interaction in Computer Vision HICV Abbreviated Journal
Volume Issue Pages 1276-1281
Keywords
Abstract Most applications in image segmentation require user interaction in order to achieve accurate results. However, users want to achieve the desired segmentation accuracy while reducing the effort of manual labelling. In this work, we extend the standard multi-label α-expansion Graph Cut algorithm so that it analyzes the interaction of the user in order to modify the object model and improve the final segmentation of objects. The approach is inspired by the fact that fast user interactions may introduce some pixel errors confusing object and background. Our results with different degrees of user interaction and input errors show high performance of the proposed approach on a multi-label human limb segmentation problem compared with the classical α-expansion algorithm.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-4673-0062-9 Medium
Area Expedition Conference HICV
Notes MILAB; HuPBA Approved no
Call Number Admin @ si @ HPE2011 Serial 1892
 

 
Author Jürgen Brauer; Wenjuan Gong; Jordi Gonzalez; Michael Arens
Title On the Effect of Temporal Information on Monocular 3D Human Pose Estimation Type Conference Article
Year 2011 Publication 2nd IEEE International Workshop on Analysis and Retrieval of Tracked Events and Motion in Imagery Streams Abbreviated Journal
Volume Issue Pages 906 - 913
Keywords
Abstract We address the task of estimating 3D human poses from monocular camera sequences. Many works make use of multiple consecutive frames for the estimation of a 3D pose in a frame. Although such an approach should ease the pose estimation task substantially, since multiple consecutive frames in principle allow 2D projection ambiguities to be resolved, it has not yet been investigated systematically how much we can improve the 3D pose estimates when using multiple consecutive frames as opposed to single frame information. In this paper we analyze the difference in quality of 3D pose estimates based on different numbers of consecutive frames from which 2D pose estimates are available. We validate the use of temporal information on two major different approaches for human pose estimation – modeling and learning approaches. The results of our experiments show that both learning and modeling approaches benefit from using multiple frames as opposed to single frame input but that the benefit is small when the 2D pose estimates show a high quality in terms of precision.
Address Barcelona
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-4673-0062-9 Medium
Area Expedition Conference ARTEMIS
Notes ISE Approved no
Call Number Admin @ si @BGG 2011 Serial 1860
 

 
Author Yaxing Wang; Hector Laria Mantecon; Joost Van de Weijer; Laura Lopez-Fuentes; Bogdan Raducanu
Title TransferI2I: Transfer Learning for Image-to-Image Translation from Small Datasets Type Conference Article
Year 2021 Publication 19th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 13990-13999
Keywords
Abstract Image-to-image (I2I) translation has matured in recent years and is able to generate high-quality realistic images. However, despite current success, it still faces important challenges when applied to small domains. Existing methods use transfer learning for I2I translation, but they still require the learning of millions of parameters from scratch. This drawback severely limits their application on small domains. In this paper, we propose a new transfer learning method for I2I translation (TransferI2I). We decouple our learning process into the image generation step and the I2I translation step. In the first step we propose two novel techniques: source-target initialization and self-initialization of the adaptor layer. The former finetunes the pretrained generative model (e.g., StyleGAN) on source and target data. The latter allows initializing all non-pretrained network parameters without the need for any data. These techniques provide a better initialization for the I2I translation step. In addition, we introduce an auxiliary GAN that further facilitates the training of deep I2I systems even from small datasets. In extensive experiments on three datasets (Animal faces, Birds, and Foods), we show that we outperform existing methods and that mFID improves on several datasets by over 25 points.
Address Virtual; October 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCV
Notes LAMP; 600.147; 602.200; 600.120 Approved no
Call Number Admin @ si @ WLW2021 Serial 3604
 

 
Author Shiqi Yang; Yaxing Wang; Joost Van de Weijer; Luis Herranz; Shangling Jui
Title Generalized Source-free Domain Adaptation Type Conference Article
Year 2021 Publication 19th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 8958-8967
Keywords
Abstract Domain adaptation (DA) aims to transfer the knowledge learned from a source domain to an unlabeled target domain. Some recent works tackle source-free domain adaptation (SFDA), where only a source pre-trained model is available for adaptation to the target domain. However, those methods do not consider keeping source performance, which is of high practical value in real-world applications. In this paper, we propose a new domain adaptation paradigm called Generalized Source-free Domain Adaptation (G-SFDA), where the learned model needs to perform well on both the target and source domains, with access only to current unlabeled target data during adaptation. First, we propose local structure clustering (LSC), which aims to cluster the target features with their semantically similar neighbors and successfully adapts the model to the target domain in the absence of source data. Second, we propose sparse domain attention (SDA), which produces a binary domain-specific attention to activate different feature channels for different domains; the domain attention is also used to regularize the gradient during adaptation to keep source information. In the experiments, for target performance our method is on par with or better than existing DA and SFDA methods; specifically, it achieves state-of-the-art performance (85.4%) on VisDA, and our method works well for all domains after adapting to single or multiple target domains.
Address Virtual; October 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.120; 600.147 Approved no
Call Number Admin @ si @ YWW2021 Serial 3605
 

 
Author Javier Marin; David Vazquez; Antonio Lopez; Jaume Amores; Bastian Leibe
Title Random Forests of Local Experts for Pedestrian Detection Type Conference Article
Year 2013 Publication 15th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 2592 - 2599
Keywords ADAS; Random Forest; Pedestrian Detection
Abstract Pedestrian detection is one of the most challenging tasks in computer vision, and has received a lot of attention in recent years. Recently, some authors have shown the advantages of using combinations of part/patch-based detectors in order to cope with the large variability of poses and the existence of partial occlusions. In this paper, we propose a pedestrian detection method that efficiently combines multiple local experts by means of a Random Forest ensemble. The proposed method works with rich block-based representations such as HOG and LBP, in such a way that the same features are reused by the multiple local experts, so that no extra computational cost is needed with respect to a holistic method. Furthermore, we demonstrate how to integrate the proposed approach with a cascaded architecture in order to achieve not only high accuracy but also an acceptable efficiency. In particular, the resulting detector operates at five frames per second using a laptop machine. We tested the proposed method with well-known challenging datasets such as Caltech, ETH, Daimler, and INRIA. The method proposed in this work consistently ranks among the top performers in all the datasets, being either the best method or having a small difference with the best one.
Address Sydney; Australia; December 2013
Corporate Author Thesis
Publisher IEEE Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1550-5499 ISBN Medium
Area Expedition Conference ICCV
Notes ADAS; 600.057; 600.054 Approved no
Call Number ADAS @ adas @ MVL2013 Serial 2333
 

 
Author Gemma Roig; Xavier Boix; R. de Nijs; Sebastian Ramos; K. Kühnlenz; Luc Van Gool
Title Active MAP Inference in CRFs for Efficient Semantic Segmentation Type Conference Article
Year 2013 Publication 15th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 2312 - 2319
Keywords Semantic Segmentation
Abstract Most MAP inference algorithms for CRFs optimize an energy function knowing all the potentials. In this paper, we focus on CRFs where the computational cost of instantiating the potentials is orders of magnitude higher than MAP inference. This is often the case in semantic image segmentation, where most potentials are instantiated by slow classifiers fed with costly features. We introduce Active MAP inference 1) to select on the fly a subset of potentials to be instantiated in the energy function, leaving the rest of the parameters of the potentials unknown, and 2) to estimate the MAP labeling from such an incomplete energy function. Results for semantic segmentation benchmarks, namely PASCAL VOC 2010 [5] and MSRC-21 [19], show that Active MAP inference achieves similar levels of accuracy but with major efficiency gains.
Address Sydney; Australia; December 2013
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1550-5499 ISBN Medium
Area Expedition Conference ICCV
Notes ADAS; 600.057 Approved no
Call Number ADAS @ adas @ RBN2013 Serial 2377
 

 
Author Fares Alnajar; Theo Gevers; Roberto Valenti; Sennay Ghebreab
Title Calibration-free Gaze Estimation using Human Gaze Patterns Type Conference Article
Year 2013 Publication 15th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 137-144
Keywords
Abstract We present a novel method to auto-calibrate gaze estimators based on gaze patterns obtained from other viewers. Our method is based on the observation that the gaze patterns of humans are indicative of where a new viewer will look [12]. When a new viewer is looking at a stimulus, we first estimate a topology of gaze points (initial gaze points). Next, these points are transformed so that they match the gaze patterns of other humans to find the correct gaze points. In a flexible uncalibrated setup with a web camera and no chin rest, the proposed method was tested on ten subjects and ten images. The method estimates the gaze points after looking at a stimulus for a few seconds with an average accuracy of 4.3°. Although the reported performance is lower than what could be achieved with dedicated hardware or a calibrated setup, the proposed method still provides sufficient accuracy to trace the viewer's attention. This is promising considering the fact that auto-calibration is done in a flexible setup, without the use of a chin rest, and based only on a few seconds of gaze initialization data. To the best of our knowledge, this is the first work to use human gaze patterns in order to auto-calibrate gaze estimators.
Address Sydney
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCV
Notes ALTRES;ISE Approved no
Call Number Admin @ si @ AGV2013 Serial 2365
 

 
Author Hamdi Dibeklioglu; Albert Ali Salah; Theo Gevers
Title Like Father, Like Son: Facial Expression Dynamics for Kinship Verification Type Conference Article
Year 2013 Publication 15th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 1497-1504
Keywords
Abstract Kinship verification from facial appearance is a difficult problem. This paper explores the possibility of employing facial expression dynamics in this problem. By using features that describe facial dynamics and spatio-temporal appearance over smile expressions, we show that it is possible to improve the state of the art in this problem, and verify that it is indeed possible to recognize kinship by resemblance of facial expressions. The proposed method is tested on different kin relationships. On average, 72.89% verification accuracy is achieved on spontaneous smiles.
Address Sydney
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCV
Notes ALTRES;ISE Approved no
Call Number Admin @ si @ DSG2013 Serial 2366
 

 
Author Jon Almazan; Albert Gordo; Alicia Fornes; Ernest Valveny
Title Handwritten Word Spotting with Corrected Attributes Type Conference Article
Year 2013 Publication 15th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 1017-1024
Keywords
Abstract We propose an approach to multi-writer word spotting, where the goal is to find a query word in a dataset comprised of document images. We propose an attributes-based approach that leads to a low-dimensional, fixed-length representation of the word images that is fast to compute and, especially, fast to compare. This approach naturally leads to a unified representation of word images and strings, which seamlessly allows one to perform both query-by-example, where the query is an image, and query-by-string, where the query is a string. We also propose a calibration scheme to correct the attribute scores based on Canonical Correlation Analysis that greatly improves the results on a challenging dataset. We test our approach on two public datasets showing state-of-the-art results.
Address Sydney; Australia; December 2013
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1550-5499 ISBN Medium
Area Expedition Conference ICCV
Notes DAG Approved no
Call Number Admin @ si @ AGF2013 Serial 2327
 

 
Author Mohammad Rouhani; Angel Sappa
Title Correspondence Free Registration through a Point-to-Model Distance Minimization Type Conference Article
Year 2011 Publication 13th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 2150-2157
Keywords
Abstract This paper presents a novel formulation, which results in a smooth minimization problem, to tackle the rigid registration between a given point set and a model set. Unlike most of the existing works, which are based on minimizing a point-wise correspondence term, we propose to describe the model set by means of an implicit representation. It allows a new definition of the registration error, which works beyond the point level representation. Moreover, it could be used in a gradient-based optimization framework. The proposed approach consists of two stages. Firstly, a novel formulation is proposed that relates the registration parameters with the distance between the model and data sets. Secondly, the registration parameters are obtained by means of the Levenberg-Marquardt algorithm. Experimental results and comparisons with the state of the art show the validity of the proposed framework.
Address Barcelona
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1550-5499 ISBN 978-1-4577-1101-5 Medium
Area Expedition Conference ICCV
Notes ADAS Approved no
Call Number Admin @ si @ RoS2011b; ADAS @ adas @ Serial 1832
 

 
Author Koen E.A. van de Sande; Jasper Uijlings; Theo Gevers; Arnold Smeulders
Title Segmentation as Selective Search for Object Recognition Type Conference Article
Year 2011 Publication 13th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 1879-1886
Keywords
Abstract For object recognition, the current state-of-the-art is based on exhaustive search. However, to enable the use of more expensive features and classifiers and thereby progress beyond the state-of-the-art, a selective search strategy is needed. Therefore, we adapt segmentation as a selective search by reconsidering segmentation: We propose to generate many approximate locations over few and precise object delineations because (1) an object whose location is never generated cannot be recognised and (2) appearance and immediate nearby context are most effective for object recognition. Our method is class-independent and is shown to cover 96.7% of all objects in the Pascal VOC 2007 test set using only 1,536 locations per image. Our selective search enables the use of the more expensive bag-of-words method which we use to substantially improve the state-of-the-art by up to 8.5% for 8 out of 20 classes on the Pascal VOC 2010 detection challenge.
Address Barcelona
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1550-5499 ISBN 978-1-4577-1101-5 Medium
Area Expedition Conference ICCV
Notes ISE Approved no
Call Number Admin @ si @ SUG2011 Serial 1780
 

 
Author Bhaskar Chakraborty; Michael Holte; Thomas B. Moeslund; Jordi Gonzalez; Xavier Roca
Title A Selective Spatio-Temporal Interest Point Detector for Human Action Recognition in Complex Scenes Type Conference Article
Year 2011 Publication 13th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 1776-1783
Keywords
Abstract Recent progress in the field of human action recognition points towards the use of Spatio-Temporal Interest Points (STIPs) for local descriptor-based recognition strategies. In this paper we present a new approach for STIP detection by applying surround suppression combined with local and temporal constraints. Our method is significantly different from existing STIP detectors and improves the performance by detecting more repeatable, stable and distinctive STIPs for human actors, while suppressing unwanted background STIPs. For action representation we use a bag-of-visual words (BoV) model of local N-jet features to build a vocabulary of visual-words. To this end, we introduce a novel vocabulary building strategy by combining spatial pyramid and vocabulary compression techniques, resulting in improved performance and efficiency. Action class specific Support Vector Machine (SVM) classifiers are trained for categorization of human actions. A comprehensive set of experiments on existing benchmark datasets, and more challenging datasets of complex scenes, validate our approach and show state-of-the-art performance.
Address Barcelona
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1550-5499 ISBN 978-1-4577-1101-5 Medium
Area Expedition Conference ICCV
Notes ISE Approved no
Call Number Admin @ si @ CHM2011 Serial 1811
 

 
Author Ivan Huerta; Michael Holte; Thomas B. Moeslund; Jordi Gonzalez
Title Detection and Removal of Chromatic Moving Shadows in Surveillance Scenarios Type Conference Article
Year 2009 Publication 12th International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 1499 - 1506
Keywords
Abstract Segmentation in the surveillance domain has to deal with shadows to avoid distortions when detecting moving objects. Most segmentation approaches dealing with shadow detection are typically restricted to penumbra shadows. Therefore, such techniques cannot cope well with umbra shadows. Consequently, umbra shadows are usually detected as part of moving objects. In this paper we present a novel technique based on gradient and colour models for separating chromatic moving cast shadows from detected moving objects. Firstly, both a chromatic invariant colour cone model and an invariant gradient model are built to perform automatic segmentation while detecting potential shadows. In a second step, regions corresponding to potential shadows are grouped by considering “a bluish effect” and an edge partitioning. Lastly, (i) temporal similarities between textures and (ii) spatial similarities between chrominance angle and brightness distortions are analysed for all potential shadow regions in order to finally identify umbra shadows. Unlike other approaches, our method does not make any a-priori assumptions about camera location, surface geometries, surface textures, shapes and types of shadows, objects, and background. Experimental results show the performance and accuracy of our approach in different shadowed materials and illumination conditions.
Address Kyoto, Japan
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1550-5499 ISBN 978-1-4244-4420-5 Medium
Area Expedition Conference ICCV
Notes Approved no
Call Number ISE @ ise @ HHM2009 Serial 1213
 

 
Author Petia Radeva; Joan Serrat; Enric Marti
Title A snake for model-based segmentation Type Conference Article
Year 1995 Publication Proc. Conf. Fifth Int Computer Vision Abbreviated Journal
Volume Issue Pages 816-821
Keywords snakes; elastic matching; model-based segmentation
Abstract Despite the promising results of numerous applications, the hitherto proposed snake techniques share some common problems: snake attraction by spurious edge points, snake degeneration (shrinking and flattening), convergence and stability of the deformation process, snake initialization and local determination of the parameters of elasticity. We argue here that these problems can be solved only when all the snake aspects are considered. The snakes proposed here implement a new potential field and external force in order to provide a deformation convergence, attraction by both near and far edges as well as snake behaviour selective according to the edge orientation. Furthermore, we conclude that in the case of model-based segmentation, the internal force should include structural information about the expected snake shape. Experiments using this kind of snakes for segmenting bones in complex hand radiographs show a significant improvement.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB;ADAS;IAM Approved no
Call Number IAM @ iam @ RSM1995 Serial 1634
 

 
Author Bogdan Raducanu; Jordi Vitria; D. Gatica-Perez
Title You are Fired! Nonverbal Role Analysis in Competitive Meetings Type Conference Article
Year 2009 Publication IEEE International Conference on Audio, Speech and Signal Processing Abbreviated Journal
Volume Issue Pages 1949–1952
Keywords
Abstract This paper addresses the problem of social interaction analysis in competitive meetings, using nonverbal cues. For our study, we made use of “The Apprentice” reality TV show, which features a competition for a real, highly paid corporate job. Our analysis is centered around two tasks regarding a person's role in a meeting: predicting the person with the highest status and predicting the fired candidates. The current study was carried out using nonverbal audio cues. Results obtained from the analysis of a full season of the show, representing around 90 minutes of audio data, are very promising (up to 85.7% of accuracy in the first case and up to 92.8% in the second case). Our approach is based only on the nonverbal interaction dynamics during the meeting without relying on the spoken words.
Address Taipei, Taiwan
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1520-6149 ISBN 978-1-4244-2353-8 Medium
Area Expedition Conference ICASSP
Notes OR;MV Approved no
Call Number BCNPCL @ bcnpcl @ RVG2009 Serial 1154