Records | |||||
---|---|---|---|---|---|
Author | Daniel Hernandez; Antonio Espinosa; David Vazquez; Antonio Lopez; Juan Carlos Moure | ||||
Title | GPU-accelerated real-time stixel computation | Type | Conference Article | ||
Year | 2017 | Publication | IEEE Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 1054-1062 | ||
Keywords | Autonomous Driving; GPU; Stixel | ||||
Abstract | The Stixel World is a medium-level, compact representation of road scenes that abstracts millions of disparity pixels into hundreds or thousands of stixels. The goal of this work is to implement and evaluate a complete multi-stixel estimation pipeline on an embedded, energy-efficient, GPU-accelerated device. This work presents a full GPU-accelerated implementation of stixel estimation that produces reliable results at 26 frames per second (real-time) on the Tegra X1 for disparity images of 1024×440 pixels and stixel widths of 5 pixels, and achieves more than 400 frames per second on a high-end Titan X GPU card. | ||||
Address | Santa Rosa; CA; USA; March 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | ADAS; 600.118 | Approved | no | ||
Call Number | ADAS @ adas @ HEV2017b | Serial | 2812 | ||
Permanent link to this record | |||||
Author | Mohammad Ali Bagheri; Qigang Gao; Sergio Escalera | ||||
Title | Support Vector Machines with Time Series Distance Kernels for Action Classification | Type | Conference Article | ||
Year | 2016 | Publication | IEEE Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 1-7 | ||
Keywords | |||||
Abstract | Despite the outperformance of Support Vector Machine (SVM) on many practical classification problems, the algorithm is not directly applicable to multi-dimensional trajectories having different lengths. In this paper, a new class of SVM that is applicable to trajectory classification, such as action recognition, is developed by incorporating two efficient time-series distance measures into the kernel function. Dynamic Time Warping and Longest Common Subsequence distance measures along with their derivatives are employed as the SVM kernel. In addition, the pairwise proximity learning strategy is utilized in order to make use of non-positive semi-definite kernels in the SVM formulation. The proposed method is employed for a challenging classification problem: action recognition by depth cameras using only skeleton data; and evaluated on three benchmark action datasets. Experimental results demonstrate the outperformance of our methodology compared to the state-of-the-art on the considered datasets. | ||||
Address | Lake Placid; NY (USA); March 2016 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | HuPBA;MILAB; | Approved | no | ||
Call Number | Admin @ si @ BGE2016a | Serial | 2773 | ||
Permanent link to this record | |||||
Author | Laura Lopez-Fuentes; Andrew Bagdanov; Joost Van de Weijer; Harald Skinnemoen | ||||
Title | Bandwidth Limited Object Recognition in High Resolution Imagery | Type | Conference Article | ||
Year | 2017 | Publication | IEEE Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | This paper proposes a novel method to optimize bandwidth usage for object detection in critical communication scenarios. We develop two operating models of active information seeking. The first model identifies promising regions in low resolution imagery and progressively requests higher resolution regions on which to perform recognition of higher semantic quality. The second model identifies promising regions in low resolution imagery while simultaneously predicting the approximate location of the object of higher semantic quality. From this general framework, we develop a car recognition system via identification of its license plate and evaluate the performance of both models on a car dataset that we introduce. Results are compared with traditional JPEG compression and demonstrate that our system saves up to one order of magnitude of bandwidth while sacrificing little in terms of recognition performance. | ||||
Address | Santa Rosa; CA; USA; March 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | LAMP; 600.068; 600.109; 600.084; 600.106; 600.079; 600.120 | Approved | no | ||
Call Number | Admin @ si @ LBW2017 | Serial | 2973 | ||
Permanent link to this record | |||||
Author | Lei Kang; Marçal Rusiñol; Alicia Fornes; Pau Riba; Mauricio Villegas | ||||
Title | Unsupervised Adaptation for Synthetic-to-Real Handwritten Word Recognition | Type | Conference Article | ||
Year | 2020 | Publication | IEEE Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Handwritten Text Recognition (HTR) is still a challenging problem because it must deal with two important difficulties: the variability among writing styles, and the scarcity of labelled data. To alleviate such problems, synthetic data generation and data augmentation are typically used to train HTR systems. However, training with such data produces encouraging but still inaccurate transcriptions in real words. In this paper, we propose an unsupervised writer adaptation approach that is able to automatically adjust a generic handwritten word recognizer, fully trained with synthetic fonts, towards a new incoming writer. We have experimentally validated our proposal using five different datasets, covering several challenges (i) the document source: modern and historic samples, which may involve paper degradation problems; (ii) different handwriting styles: single and multiple writer collections; and (iii) language, which involves different character combinations. Across these challenging collections, we show that our system is able to maintain its performance, thus, it provides a practical and generic approach to deal with new document collections without requiring any expensive and tedious manual annotation step. | ||||
Address | Aspen; Colorado; USA; March 2020 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | DAG; 600.129; 600.140; 601.302; 601.312; 600.121 | Approved | no | ||
Call Number | Admin @ si @ KRF2020 | Serial | 3446 | ||
Permanent link to this record | |||||
Author | Raul Gomez; Jaume Gibert; Lluis Gomez; Dimosthenis Karatzas | ||||
Title | Exploring Hate Speech Detection in Multimodal Publications | Type | Conference Article | ||
Year | 2020 | Publication | IEEE Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | In this work we target the problem of hate speech detection in multimodal publications formed by a text and an image. We gather and annotate a large scale dataset from Twitter, MMHS150K, and propose different models that jointly analyze textual and visual information for hate speech detection, comparing them with unimodal detection. We provide quantitative and qualitative results and analyze the challenges of the proposed task. We find that, even though images are useful for the hate speech detection task, current multimodal models cannot outperform models analyzing only text. We discuss why and open the field and the dataset for further research. | ||||
Address | Aspen; March 2020 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | DAG; 600.121; 600.129 | Approved | no | ||
Call Number | Admin @ si @ GGG2020a | Serial | 3280 | ||
Permanent link to this record | |||||
Author | Edgar Riba; D. Mishkin; Daniel Ponsa; E. Rublee; G. Bradski | ||||
Title | Kornia: an Open Source Differentiable Computer Vision Library for PyTorch | Type | Conference Article | ||
Year | 2020 | Publication | IEEE Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | Aspen; Colorado; USA; March 2020 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | MSIAU; 600.122; 600.130 | Approved | no | ||
Call Number | Admin @ si @ RMP2020 | Serial | 3291 | ||
Permanent link to this record | |||||
Author | Parichehr Behjati Ardakani; Pau Rodriguez; Armin Mehri; Isabelle Hupont; Carles Fernandez; Jordi Gonzalez | ||||
Title | OverNet: Lightweight Multi-Scale Super-Resolution with Overscaling Network | Type | Conference Article | ||
Year | 2021 | Publication | IEEE Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 2693-2702 | ||
Keywords | |||||
Abstract | Super-resolution (SR) has achieved great success due to the development of deep convolutional neural networks (CNNs). However, as the depth and width of the networks increase, CNN-based SR methods have been faced with the challenge of computational complexity in practice. Moreover, most SR methods train a dedicated model for each target resolution, losing generality and increasing memory requirements. To address these limitations we introduce OverNet, a deep but lightweight convolutional network to solve SISR at arbitrary scale factors with a single model. We make the following contributions: first, we introduce a lightweight feature extractor that enforces efficient reuse of information through a novel recursive structure of skip and dense connections. Second, to maximize the performance of the feature extractor, we propose a model agnostic reconstruction module that generates accurate high-resolution images from overscaled feature maps obtained from any SR architecture. Third, we introduce a multi-scale loss function to achieve generalization across scales. Experiments show that our proposal outperforms previous state-of-the-art approaches in standard benchmarks, while maintaining relatively low computation and memory requirements. | ||||
Address | Virtual; January 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | ISE; 600.119; 600.098 | Approved | no | ||
Call Number | Admin @ si @ BRM2021 | Serial | 3512 | ||
Permanent link to this record | |||||
Author | Andres Mafla; Sounak Dey; Ali Furkan Biten; Lluis Gomez; Dimosthenis Karatzas | ||||
Title | Fine-grained Image Classification and Retrieval by Combining Visual and Locally Pooled Textual Features | Type | Conference Article | ||
Year | 2020 | Publication | IEEE Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Text contained in an image carries high-level semantics that can be exploited to achieve richer image understanding. In particular, the mere presence of text provides strong guiding content that should be employed to tackle a diversity of computer vision tasks such as image retrieval, fine-grained classification, and visual question answering. In this paper, we address the problem of fine-grained classification and image retrieval by leveraging textual information along with visual cues to comprehend the existing intrinsic relation between the two modalities. The novelty of the proposed model consists of the usage of a PHOC descriptor to construct a bag of textual words along with a Fisher Vector Encoding that captures the morphology of text. This approach provides a stronger multimodal representation for this task and as our experiments demonstrate, it achieves state-of-the-art results on two different tasks, fine-grained classification and image retrieval. | ||||
Address | Aspen; Colorado; USA; March 2020 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | DAG; 600.121; 600.129 | Approved | no | ||
Call Number | Admin @ si @ MDB2020 | Serial | 3334 | ||
Permanent link to this record | |||||
Author | Xavier Soria; Edgar Riba; Angel Sappa | ||||
Title | Dense Extreme Inception Network: Towards a Robust CNN Model for Edge Detection | Type | Conference Article | ||
Year | 2020 | Publication | IEEE Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | This paper proposes a Deep Learning based edge detector, which is inspired by both HED (Holistically-Nested Edge Detection) and Xception networks. The proposed approach generates thin edge-maps that are plausible for human eyes; it can be used in any edge detection task without previous training or fine tuning process. As a second contribution, a large dataset with carefully annotated edges has been generated. This dataset has been used for training the proposed approach as well as the state-of-the-art algorithms for comparisons. Quantitative and qualitative evaluations have been performed on different benchmarks showing improvements with the proposed method when F-measure of ODS and OIS are considered. | ||||
Address | Aspen; USA; March 2020 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | MSIAU; 600.130; 601.349; 600.122 | Approved | no | ||
Call Number | Admin @ si @ SRS2020 | Serial | 3434 | ||
Permanent link to this record | |||||
Author | Andres Mafla; Sounak Dey; Ali Furkan Biten; Lluis Gomez; Dimosthenis Karatzas | ||||
Title | Multi-modal reasoning graph for scene-text based fine-grained image classification and retrieval | Type | Conference Article | ||
Year | 2021 | Publication | IEEE Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 4022-4032 | ||
Keywords | |||||
Abstract | |||||
Address | Virtual; January 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | DAG; 600.121 | Approved | no | ||
Call Number | Admin @ si @ MDB2021 | Serial | 3491 | ||
Permanent link to this record | |||||
Author | Andres Mafla; Rafael S. Rezende; Lluis Gomez; Diana Larlus; Dimosthenis Karatzas | ||||
Title | StacMR: Scene-Text Aware Cross-Modal Retrieval | Type | Conference Article | ||
Year | 2021 | Publication | IEEE Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 2219-2229 | ||
Keywords | |||||
Abstract | |||||
Address | Virtual; January 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | DAG; 600.121 | Approved | no | ||
Call Number | Admin @ si @ MRG2021a | Serial | 3492 | ||
Permanent link to this record | |||||
Author | Minesh Mathew; Dimosthenis Karatzas; C.V. Jawahar | ||||
Title | DocVQA: A Dataset for VQA on Document Images | Type | Conference Article | ||
Year | 2021 | Publication | IEEE Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 2200-2209 | ||
Keywords | |||||
Abstract | We present a new dataset for Visual Question Answering (VQA) on document images called DocVQA. The dataset consists of 50,000 questions defined on 12,000+ document images. Detailed analysis of the dataset in comparison with similar datasets for VQA and reading comprehension is presented. We report several baseline results by adopting existing VQA and reading comprehension models. Although the existing models perform reasonably well on certain types of questions, there is a large performance gap compared to human performance (94.36% accuracy). The models need to improve specifically on questions where understanding the structure of the document is crucial. The dataset, code and leaderboard are available at docvqa.org | ||||
Address | Virtual; January 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | DAG; 600.121 | Approved | no | ||
Call Number | Admin @ si @ MKJ2021 | Serial | 3498 | ||
Permanent link to this record | |||||
Author | Cristina Palmero; Javier Selva; Sorina Smeureanu; Julio C. S. Jacques Junior; Albert Clapes; Alexa Mosegui; Zejian Zhang; David Gallardo; Georgina Guilera; David Leiva; Sergio Escalera | ||||
Title | Context-Aware Personality Inference in Dyadic Scenarios: Introducing the UDIVA Dataset | Type | Conference Article | ||
Year | 2021 | Publication | IEEE Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 1-12 | ||
Keywords | |||||
Abstract | This paper introduces UDIVA, a new non-acted dataset of face-to-face dyadic interactions, where interlocutors perform competitive and collaborative tasks with different behavior elicitation and cognitive workload. The dataset consists of 90.5 hours of dyadic interactions among 147 participants distributed in 188 sessions, recorded using multiple audiovisual and physiological sensors. Currently, it includes sociodemographic, self- and peer-reported personality, internal state, and relationship profiling from participants. As an initial analysis on UDIVA, we propose a transformer-based method for self-reported personality inference in dyadic scenarios, which uses audiovisual data and different sources of context from both interlocutors to regress a target person’s personality traits. Preliminary results from an incremental study show consistent improvements when using all available context information. | ||||
Address | Virtual; January 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ PSS2021 | Serial | 3532 | ||
Permanent link to this record | |||||
Author | Julio C. S. Jacques Junior; Agata Lapedriza; Cristina Palmero; Xavier Baro; Sergio Escalera | ||||
Title | Person Perception Biases Exposed: Revisiting the First Impressions Dataset | Type | Conference Article | ||
Year | 2021 | Publication | IEEE Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 13-21 | ||
Keywords | |||||
Abstract | This work revisits the ChaLearn First Impressions database, annotated for personality perception using pairwise comparisons via crowdsourcing. We analyse for the first time the original pairwise annotations, and reveal existing person perception biases associated to perceived attributes like gender, ethnicity, age and face attractiveness. We show how person perception bias can influence data labelling of a subjective task, which has received little attention from the computer vision and machine learning communities by now. We further show that the mechanism used to convert pairwise annotations to continuous values may magnify the biases if no special treatment is considered. The findings of this study are relevant for the computer vision community that is still creating new datasets on subjective tasks, and using them for practical applications, ignoring these perceptual biases. | ||||
Address | Virtual; January 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ JLP2021 | Serial | 3533 | ||
Permanent link to this record | |||||
Author | Armin Mehri; Parichehr Behjati Ardakani; Angel Sappa | ||||
Title | MPRNet: Multi-Path Residual Network for Lightweight Image Super Resolution | Type | Conference Article | ||
Year | 2021 | Publication | IEEE Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 2703-2712 | ||
Keywords | |||||
Abstract | Lightweight super resolution networks are extremely important for real-world applications. In recent years several SR deep learning approaches with outstanding achievement have been introduced by sacrificing memory and computational cost. To overcome this problem, a novel lightweight super resolution network is proposed, which improves the SOTA performance in lightweight SR and performs roughly similar to computationally expensive networks. Multi-Path Residual Network is designed with a set of Residual concatenation Blocks stacked with Adaptive Residual Blocks: (i) to adaptively extract informative features and learn more expressive spatial context information; (ii) to better leverage multi-level representations before the up-sampling stage; and (iii) to allow an efficient information and gradient flow within the network. The proposed architecture also contains a new attention mechanism, Two-Fold Attention Module, to maximize the representation ability of the model. Extensive experiments show the superiority of our model against other SOTA SR approaches. | ||||
Address | Virtual; January 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | MSIAU; 600.130; 600.122 | Approved | no | ||
Call Number | Admin @ si @ MAS2021b | Serial | 3582 | ||
Permanent link to this record |