|   | 
Details
   web
Records
Author Soumya Jahagirdar; Minesh Mathew; Dimosthenis Karatzas; CV Jawahar
Title Watching the News: Towards VideoQA Models that can Read Type Conference Article
Year 2023 Publication Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Video Question Answering methods focus on commonsense reasoning and visual cognition of objects or persons and their interactions over time. Current VideoQA approaches ignore the textual information present in the video. Instead, we argue that textual information is complementary to the action and provides essential contextualisation cues to the reasoning process. To this end, we propose a novel VideoQA task that requires reading and understanding the text in the video. To explore this direction, we focus on news videos and require QA systems to comprehend and answer questions about the topics presented by combining visual and textual cues in the video. We introduce the ``NewsVideoQA'' dataset that comprises more than 8,600 QA pairs on 3,000+ news videos obtained from diverse news channels from around the world. We demonstrate the limitations of current Scene Text VQA and VideoQA methods and propose ways to incorporate scene text information into VideoQA methods.
Address Waikoloa; Hawai; USA; January 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) WACV
Notes DAG Approved no
Call Number Admin @ si @ JMK2023 Serial 3899
Permanent link to this record
 

 
Author Marcos V Conde; Florin Vasluianu; Javier Vazquez; Radu Timofte
Title Perceptual image enhancement for smartphone real-time applications Type Conference Article
Year 2023 Publication Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Abbreviated Journal
Volume Issue Pages 1848-1858
Keywords
Abstract Recent advances in camera designs and imaging pipelines allow us to capture high-quality images using smartphones. However, due to the small size and lens limitations of the smartphone cameras, we commonly find artifacts or degradation in the processed images. The most common unpleasant effects are noise artifacts, diffraction artifacts, blur, and HDR overexposure. Deep learning methods for image restoration can successfully remove these artifacts. However, most approaches are not suitable for real-time applications on mobile devices due to their heavy computation and memory requirements. In this paper, we propose LPIENet, a lightweight network for perceptual image enhancement, with the focus on deploying it on smartphones. Our experiments show that, with much fewer parameters and operations, our model can deal with the mentioned artifacts and achieve competitive performance compared with state-of-the-art methods on standard benchmarks. Moreover, to prove the efficiency and reliability of our approach, we deployed the model directly on commercial smartphones and evaluated its performance. Our model can process 2K resolution images under 1 second in mid-level commercial smartphones.
Address Waikoloa; Hawai; USA; January 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) WACV
Notes MACO; CIC Approved no
Call Number Admin @ si @ CVV2023 Serial 3900
Permanent link to this record
 

 
Author Dipam Goswami; J Schuster; Joost Van de Weijer; Didier Stricker
Title Attribution-aware Weight Transfer: A Warm-Start Initialization for Class-Incremental Semantic Segmentation Type Conference Article
Year 2023 Publication Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Abbreviated Journal
Volume Issue Pages 3195-3204
Keywords
Abstract Attribution-aware Weight Transfer: A Warm-Start Initialization for Class-Incremental Semantic Segmentation. D Goswami, R Schuster, J van de Weijer, D Stricker. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023, pp. 3195-3204
Address Waikoloa; Hawai; USA; January 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) WACV
Notes LAMP Approved no
Call Number Admin @ si @ GSW2023 Serial 3901
Permanent link to this record
 

 
Author Alloy Das; Sanket Biswas; Ayan Banerjee; Josep Llados; Umapada Pal; Saumik Bhattacharya
Title Harnessing the Power of Multi-Lingual Datasets for Pre-training: Towards Enhancing Text Spotting Performance Type Conference Article
Year 2024 Publication Winter Conference on Applications of Computer Vision Abbreviated Journal
Volume Issue Pages 718-728
Keywords
Abstract The adaptation capability to a wide range of domains is crucial for scene text spotting models when deployed to real-world conditions. However, existing state-of-the-art (SOTA) approaches usually incorporate scene text detection and recognition simply by pretraining on natural scene text datasets, which do not directly exploit the intermediate feature representations between multiple domains. Here, we investigate the problem of domain-adaptive scene text spotting, i.e., training a model on multi-domain source data such that it can directly adapt to target domains rather than being specialized for a specific domain or scenario. Further, we investigate a transformer baseline called Swin-TESTR to focus on solving scene-text spotting for both regular and arbitrary-shaped scene text along with an exhaustive evaluation. The results clearly demonstrate the potential of intermediate representations to achieve significant performance on text spotting benchmarks across multiple domains (e.g. language, synth-to-real, and documents). both in terms of accuracy and efficiency.
Address Waikoloa; Hawai; USA; January 2024
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) WACV
Notes DAG Approved no
Call Number Admin @ si @ DBB2024 Serial 3986
Permanent link to this record
 

 
Author Alex Gomez-Villa; Bartlomiej Twardowski; Kai Wang; Joost van de Weijer
Title Plasticity-Optimized Complementary Networks for Unsupervised Continual Learning Type Conference Article
Year 2024 Publication Winter Conference on Applications of Computer Vision Abbreviated Journal
Volume Issue Pages 1690-1700
Keywords
Abstract Continuous unsupervised representation learning (CURL) research has greatly benefited from improvements in self-supervised learning (SSL) techniques. As a result, existing CURL methods using SSL can learn high-quality representations without any labels, but with a notable performance drop when learning on a many-tasks data stream. We hypothesize that this is caused by the regularization losses that are imposed to prevent forgetting, leading to a suboptimal plasticity-stability trade-off: they either do not adapt fully to the incoming data (low plasticity), or incur significant forgetting when allowed to fully adapt to a new SSL pretext-task (low stability). In this work, we propose to train an expert network that is relieved of the duty of keeping the previous knowledge and can focus on performing optimally on the new tasks (optimizing plasticity). In the second phase, we combine this new knowledge with the previous network in an adaptation-retrospection phase to avoid forgetting and initialize a new expert with the knowledge of the old network. We perform several experiments showing that our proposed approach outperforms other CURL exemplar-free methods in few- and many-task split settings. Furthermore, we show how to adapt our approach to semi-supervised continual learning (Semi-SCL) and show that we surpass the accuracy of other exemplar-free Semi-SCL methods and reach the results of some others that use exemplars.
Address Waikoloa; Hawai; USA; January 2024
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) WACV
Notes LAMP Approved no
Call Number Admin @ si @ GTW2024 Serial 3989
Permanent link to this record
 

 
Author Sergi Garcia Bordils; Dimosthenis Karatzas; Marçal Rusiñol
Title STEP – Towards Structured Scene-Text Spotting Type Conference Article
Year 2024 Publication Winter Conference on Applications of Computer Vision Abbreviated Journal
Volume Issue Pages 883-892
Keywords
Abstract We introduce the structured scene-text spotting task, which requires a scene-text OCR system to spot text in the wild according to a query regular expression. Contrary to generic scene text OCR, structured scene-text spotting seeks to dynamically condition both scene text detection and recognition on user-provided regular expressions. To tackle this task, we propose the Structured TExt sPotter (STEP), a model that exploits the provided text structure to guide the OCR process. STEP is able to deal with regular expressions that contain spaces and it is not bound to detection at the word-level granularity. Our approach enables accurate zero-shot structured text spotting in a wide variety of real-world reading scenarios and is solely trained on publicly available data. To demonstrate the effectiveness of our approach, we introduce a new challenging test dataset that contains several types of out-of-vocabulary structured text, reflecting important reading applications of fields such as prices, dates, serial numbers, license plates etc. We demonstrate that STEP can provide specialised OCR performance on demand in all tested scenarios.
Address Waikoloa; Hawai; USA; January 2024
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) WACV
Notes DAG Approved no
Call Number Admin @ si @ GKR2024 Serial 3992
Permanent link to this record
 

 
Author Hunor Laczko; Meysam Madadi; Sergio Escalera; Jordi Gonzalez
Title A Generative Multi-Resolution Pyramid and Normal-Conditioning 3D Cloth Draping Type Conference Article
Year 2024 Publication Winter Conference on Applications of Computer Vision Abbreviated Journal
Volume Issue Pages 8709-8718
Keywords
Abstract RGB cloth generation has been deeply studied in the related literature, however, 3D garment generation remains an open problem. In this paper, we build a conditional variational autoencoder for 3D garment generation and draping. We propose a pyramid network to add garment details progressively in a canonical space, i.e. unposing and unshaping the garments w.r.t. the body. We study conditioning the network on surface normal UV maps, as an intermediate representation, which is an easier problem to optimize than 3D coordinates. Our results on two public datasets, CLOTH3D and CAPE, show that our model is robust, controllable in terms of detail generation by the use of multi-resolution pyramids, and achieves state-of-the-art results that can highly generalize to unseen garments, poses, and shapes even when training with small amounts of data.
Address Waikoloa; Hawai; USA; January 2024
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) WACV
Notes ISE; HUPBA Approved no
Call Number Admin @ si @ LME2024 Serial 3996
Permanent link to this record
 

 
Author C. Alejandro Parraga; Arash Akbarinia
Title Colour Constancy as a Product of Dynamic Centre-Surround Adaptation Type Conference Article
Year 2016 Publication 16th Annual meeting in Vision Sciences Society Abbreviated Journal
Volume 16 Issue 12 Pages
Keywords
Abstract Colour constancy refers to the human visual system's ability to preserve the perceived colour of objects despite changes in the illumination. Its exact mechanisms are unknown, although a number of systems ranging from retinal to cortical and memory are thought to play important roles. The strength of the perceptual shift necessary to preserve these colours is usually estimated by the vectorial distances from an ideal match (or canonical illuminant). In this work we explore how much of the colour constancy phenomenon could be explained by well-known physiological properties of V1 and V2 neurons whose receptive fields (RF) vary according to the contrast and orientation of surround stimuli. Indeed, it has been shown that both RF size and the normalization occurring between centre and surround in cortical neurons depend on the local properties of surrounding stimuli. Our stating point is the construction of a computational model which includes this dynamical centre-surround adaptation by means of two overlapping asymmetric Gaussian kernels whose variances are adjusted to the contrast of surrounding pixels to represent the changes in RF size of cortical neurons and the weights of their respective contributions are altered according to differences in centre-surround contrast and orientation. The final output of the model is obtained after convolving an image with this dynamical operator and an estimation of the illuminant is obtained by considering the contrast of the far surround. We tested our algorithm on naturalistic stimuli from several benchmark datasets. Our results show that although our model does not require any training, its performance against the state-of-the-art is highly competitive, even outperforming learning-based algorithms in some cases. Indeed, these results are very encouraging if we consider that they were obtained with the same parameters for all datasets (i.e. just like the human visual system operates).
Address Florida; USA; May 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) VSS
Notes NEUROBIT Approved no
Call Number Admin @ si @ PaA2016b Serial 2901
Permanent link to this record
 

 
Author G.D. Evangelidis; Ferran Diego; Joan Serrat; Antonio Lopez
Title Slice Matching for Accurate Spatio-Temporal Alignment Type Conference Article
Year 2011 Publication In ICCV Workshop on Visual Surveillance Abbreviated Journal
Volume Issue Pages
Keywords video alignment
Abstract Video synchronization and alignment is a rather recent topic in computer vision. It usually deals with the problem of aligning sequences recorded simultaneously by static, jointly- or independently-moving cameras. In this paper, we investigate the more difficult problem of matching videos captured at different times from independently-moving cameras, whose trajectories are approximately coincident or parallel. To this end, we propose a novel method that pixel-wise aligns videos and allows thus to automatically highlight their differences. This primarily aims at visual surveillance but the method can be adopted as is by other related video applications, like object transfer (augmented reality) or high dynamic range video. We build upon a slice matching scheme to first synchronize the sequences, while we develop a spatio-temporal alignment scheme to spatially register corresponding frames and refine the temporal mapping. We investigate the performance of the proposed method on videos recorded from vehicles driven along different types of roads and compare with related previous works.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) VS
Notes ADAS Approved no
Call Number Admin @ si @ EDS2011; ADAS @ adas @ eds2011a Serial 1861
Permanent link to this record
 

 
Author Juan Ignacio Toledo; Jordi Cucurull; Jordi Puiggali; Alicia Fornes; Josep Llados
Title Document Analysis Techniques for Automatic Electoral Document Processing: A Survey Type Conference Article
Year 2015 Publication E-Voting and Identity, Proceedings of 5th international conference, VoteID 2015 Abbreviated Journal
Volume Issue Pages 139-141
Keywords Document image analysis; Computer vision; Paper ballots; Paper based elections; Optical scan; Tally
Abstract In this paper, we will discuss the most common challenges in electoral document processing and study the different solutions from the document analysis community that can be applied in each case. We will cover Optical Mark Recognition techniques to detect voter selections in the Australian Ballot, handwritten number recognition for preferential elections and handwriting recognition for write-in areas. We will also propose some particular adjustments that can be made to those general techniques in the specific context of electoral documents.
Address Bern; Switzerland; September 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) VoteID
Notes DAG; 600.061; 602.006; 600.077 Approved no
Call Number Admin @ si @ TCP2015 Serial 2641
Permanent link to this record
 

 
Author Jordi Gonzalez; Josep M. Gonfaus; Carles Fernandez; Xavier Roca
Title Exploiting Natural-Language Interaction in Video Surveillance Systems Type Conference Article
Year 2011 Publication V&L Net Workshop on Vision and Language Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Brighton, UK
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) VL
Notes ISE Approved no
Call Number Admin @ si @ GGF2011 Serial 1813
Permanent link to this record
 

 
Author Joost Van de Weijer; Shida Beigpour
Title The Dichromatic Reflection Model: Future Research Directions and Applications Type Conference Article
Year 2011 Publication International Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications Abbreviated Journal
Volume Issue Pages
Keywords dblp
Abstract The dichromatic reflection model (DRM) predicts that color distributions form a parallelogram in color space, whose shape is defined by the body reflectance and the illuminant color. In this paper we resume the assumptions which led to the DRM and shortly recall two of its main applications domains: color image segmentation and photometric invariant feature computation. After having introduced the model we discuss several limitations of the theory, especially those which are raised once working on real-world uncalibrated images. In addition, we summerize recent extensions of the model which allow to handle more complicated light interactions. Finally, we suggest some future research directions which would further extend its applicability.
Address Algarve, Portugal
Corporate Author Thesis
Publisher SciTePress Place of Publication Editor Mestetskiy, Leonid and Braz, José
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-989-8425-47-8 Medium
Area Expedition Conference (down) VISIGRAPP
Notes CIC Approved no
Call Number Admin @ si @ WeB2011 Serial 1778
Permanent link to this record
 

 
Author Joan M. Nuñez; Jorge Bernal; F. Javier Sanchez; Fernando Vilariño
Title Blood Vessel Characterization in Colonoscopy Images to Improve Polyp Localization Type Conference Article
Year 2013 Publication Proceedings of the International Conference on Computer Vision Theory and Applications Abbreviated Journal
Volume 1 Issue Pages 162-171
Keywords Colonoscopy; Blood vessel; Linear features; Valley detection
Abstract This paper presents an approach to mitigate the contribution of blood vessels to the energy image used at different tasks of automatic colonoscopy image analysis. This goal is achieved by introducing a characterization of endoluminal scene objects which allows us to differentiate between the trace of 2-dimensional visual objects,such as vessels, and shades from 3-dimensional visual objects, such as folds. The proposed characterization is based on the influence that the object shape has in the resulting visual feature, and it leads to the development of a blood vessel attenuation algorithm. A database consisting of manually labelled masks was built in order to test the performance of our method, which shows an encouraging success in blood vessel mitigation while keeping other structures intact. Moreover, by extending our method to the only available polyp localization
algorithm tested on a public database, blood vessel mitigation proved to have a positive influence on the overall performance.
Address Barcelona; February 2013
Corporate Author Thesis
Publisher SciTePress Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area 800 Expedition Conference (down) VISIGRAPP
Notes MV; 600.054; 600.057;SIAI Approved no
Call Number IAM @ iam @ NBS2013 Serial 2198
Permanent link to this record
 

 
Author Mirko Arnold; Anarta Ghosh; Glen Doherty; Hugh Mulcahy; Stephen Patchett; Gerard Lacey
Title Towards Automatic Direct Observation of Procedure and Skill (DOPS) in Colonoscopy Type Conference Article
Year 2013 Publication Proceedings of the International Conference on Computer Vision Theory and Applications Abbreviated Journal
Volume Issue Pages 48-53
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area 800 Expedition Conference (down) VISIGRAPP
Notes MV Approved no
Call Number fernando @ fernando @ Serial 2427
Permanent link to this record
 

 
Author Chris Bahnsen; David Vazquez; Antonio Lopez; Thomas B. Moeslund
Title Learning to Remove Rain in Traffic Surveillance by Using Synthetic Data Type Conference Article
Year 2019 Publication 14th International Conference on Computer Vision Theory and Applications Abbreviated Journal
Volume Issue Pages 123-130
Keywords Rain Removal; Traffic Surveillance; Image Denoising
Abstract Rainfall is a problem in automated traffic surveillance. Rain streaks occlude the road users and degrade the overall visibility which in turn decrease object detection performance. One way of alleviating this is by artificially removing the rain from the images. This requires knowledge of corresponding rainy and rain-free images. Such images are often produced by overlaying synthetic rain on top of rain-free images. However, this method fails to incorporate the fact that rain fall in the entire three-dimensional volume of the scene. To overcome this, we introduce training data from the SYNTHIA virtual world that models rain streaks in the entirety of a scene. We train a conditional Generative Adversarial Network for rain removal and apply it on traffic surveillance images from SYNTHIA and the AAU RainSnow datasets. To measure the applicability of the rain-removed images in a traffic surveillance context, we run the YOLOv2 object detection algorithm on the original and rain-removed frames. The results on SYNTHIA show an 8% increase in detection accuracy compared to the original rain image. Interestingly, we find that high PSNR or SSIM scores do not imply good object detection performance.
Address Praga; Czech Republic; February 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) VISIGRAPP
Notes ADAS; 600.118 Approved no
Call Number Admin @ si @ BVL2019 Serial 3256
Permanent link to this record