Records
Author Minesh Mathew; Ruben Tito; Dimosthenis Karatzas; R.Manmatha; C.V. Jawahar
Title Document Visual Question Answering Challenge 2020 Type Conference Article
Year 2020 Publication 33rd IEEE Conference on Computer Vision and Pattern Recognition – Short paper Abbreviated Journal
Volume Issue Pages
Keywords
Abstract This paper presents the results of the Document Visual Question Answering Challenge organized as part of the “Text and Documents in the Deep Learning Era” workshop at CVPR 2020. The challenge introduces a new problem: Visual Question Answering on document images. The challenge comprised two tasks. The first task concerns asking questions about a single document image, whereas the second task is set up as a retrieval task in which the question is posed over a collection of images. For Task 1, a new dataset is introduced comprising 50,000 question-answer(s) pairs defined over 12,767 document images. For Task 2, another dataset has been created comprising 20 questions over 14,362 document images which share the same document template.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVPR
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ MTK2020 Serial 3558
Permanent link to this record
 

 
Author Fei Yang; Luis Herranz; Yongmei Cheng; Mikhail Mozerov
Title Slimmable compressive autoencoders for practical neural image compression Type Conference Article
Year 2021 Publication 34th IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal
Volume Issue Pages 4996-5005
Keywords
Abstract Neural image compression leverages deep neural networks to outperform traditional image codecs in rate-distortion performance. However, the resulting models are also heavy, computationally demanding and generally optimized for a single rate, limiting their practical use. Focusing on practical image compression, we propose slimmable compressive autoencoders (SlimCAEs), where rate (R) and distortion (D) are jointly optimized for different capacities. Once trained, encoders and decoders can be executed at different capacities, leading to different rates and complexities. We show that a successful implementation of SlimCAEs requires suitable capacity-specific RD tradeoffs. Our experiments show that SlimCAEs are highly flexible models that provide excellent rate-distortion performance, variable rate, and dynamic adjustment of memory, computational cost and latency, thus addressing the main requirements of practical image compression.
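A minimal, hypothetical sketch (not the authors' code) of the core idea behind a slimmable layer: all capacities share one parameter tensor, and a given capacity is executed by slicing its first k units, so a smaller width costs less and, in a SlimCAE, corresponds to a different rate-distortion operating point.

import numpy as np

class SlimmableLinear:
    """Toy slimmable layer: all widths share one weight matrix."""
    def __init__(self, in_dim, max_width, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((max_width, in_dim)) * 0.01  # shared parameters
        self.b = np.zeros(max_width)

    def forward(self, x, width_mult):
        k = int(self.W.shape[0] * width_mult)   # number of active output units
        return self.W[:k] @ x + self.b[:k]      # smaller slice -> fewer operations

layer = SlimmableLinear(in_dim=64, max_width=128)
x = np.zeros(64)
print(layer.forward(x, 0.25).shape, layer.forward(x, 1.0).shape)  # (32,) (128,)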
Address Virtual; June 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVPR
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ YHC2021 Serial 3569
Permanent link to this record
 

 
Author Matthias Eisenmann; Annika Reinke; Vivienn Weru; Minu D. Tizabi; Fabian Isensee; Tim J. Adler; Sharib Ali; Vincent Andrearczyk; Marc Aubreville; Ujjwal Baid; Spyridon Bakas; Niranjan Balu; Sophia Bano; Jorge Bernal; Sebastian Bodenstedt; Alessandro Casella; Veronika Cheplygina; Marie Daum; Marleen de Bruijne
Title Why Is the Winner the Best? Type Conference Article
Year 2023 Publication Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Abbreviated Journal
Volume Issue Pages 19955-19966
Keywords
Abstract International benchmarking competitions have become fundamental for the comparative performance assessment of image analysis methods. However, little attention has been given to investigating what can be learnt from these competitions. Do they really generate scientific progress? What are common and successful participation strategies? What makes a solution superior to a competing method? To address this gap in the literature, we performed a multi-center study with all 80 competitions that were conducted in the scope of IEEE ISBI 2021 and MICCAI 2021. Statistical analyses performed based on comprehensive descriptions of the submitted algorithms linked to their rank as well as the underlying participation strategies revealed common characteristics of winning solutions. These typically include the use of multi-task learning (63%) and/or multi-stage pipelines (61%), and a focus on augmentation (100%), image preprocessing (97%), data curation (79%), and postprocessing (66%). The “typical” lead of a winning team is a computer scientist with a doctoral degree, five years of experience in biomedical image analysis, and four years of experience in deep learning. Two core general development strategies stood out for highly-ranked teams: the reflection of the metrics in the method design and the focus on analyzing and handling failure cases. According to the organizers, 43% of the winning algorithms exceeded the state of the art but only 11% completely solved the respective domain problem. The insights of our study could help researchers (1) improve algorithm development strategies when approaching new problems, and (2) focus on open research questions revealed by this work.
Address Vancouver; Canada; June 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVPR
Notes ISE Approved no
Call Number Admin @ si @ ERW2023 Serial 3842
Permanent link to this record
 

 
Author JW Xiao; CB Zhang; J. Feng; Xialei Liu; Joost Van de Weijer; MM Cheng
Title Endpoints Weight Fusion for Class Incremental Semantic Segmentation Type Conference Article
Year 2023 Publication Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Abbreviated Journal
Volume Issue Pages 7204-7213
Keywords
Abstract Class incremental semantic segmentation (CISS) focuses on alleviating catastrophic forgetting to improve discrimination. Previous work mainly exploits regularization (e.g., knowledge distillation) to maintain previous knowledge in the current model. However, distillation alone often yields limited gain, since only the representations of the old and new models are constrained to be consistent. In this paper, we propose a simple yet effective method to obtain a model with a strong memory of old knowledge, named Endpoints Weight Fusion (EWF). In our method, the model containing old knowledge is fused with the model retaining new knowledge in a dynamic fusion manner, strengthening the memory of old classes under ever-changing distributions. In addition, we analyze the relation between our fusion strategy and the popular moving-average technique EMA, which reveals why our method is more suitable for class-incremental learning. To facilitate parameter fusion with a closer distance in parameter space, we use distillation to enhance the optimization process. Furthermore, we conduct experiments on two widely used datasets, achieving state-of-the-art performance.
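As a rough illustration of the fusion idea described above (a hedged sketch, not the paper's exact rule), the old-task and new-task parameters can be combined as a convex combination; the coefficient alpha and the parameter dictionaries below are placeholders.

def fuse_endpoints(old_params, new_params, alpha=0.5):
    """Fuse two parameter sets as alpha * old + (1 - alpha) * new."""
    return {name: alpha * old_params[name] + (1.0 - alpha) * new_params[name]
            for name in old_params}

old = {"w": 1.0, "b": -2.0}   # endpoint trained on old classes
new = {"w": 3.0, "b": 0.0}    # endpoint trained on new classes
print(fuse_endpoints(old, new, alpha=0.6))  # {'w': 1.8, 'b': -1.2}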
Address Vancouver; Canada; June 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVPR
Notes LAMP Approved no
Call Number Admin @ si @ XZF2023 Serial 3854
Permanent link to this record
 

 
Author Senmao Li; Joost Van de Weijer; Yaxing Wang; Fahad Shahbaz Khan; Meiqin Liu; Jian Yang
Title 3D-Aware Multi-Class Image-to-Image Translation with NeRFs Type Conference Article
Year 2023 Publication 36th IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal
Volume Issue Pages 12652-12662
Keywords
Abstract Recent advances in 3D-aware generative models (3D-aware GANs) combined with Neural Radiance Fields (NeRF) have achieved impressive results. However, no prior work investigates 3D-aware GANs for 3D-consistent multiclass image-to-image (3D-aware I2I) translation. Naively using 2D-I2I translation methods suffers from unrealistic shape/identity changes. To perform 3D-aware multiclass I2I translation, we decouple this learning process into a multiclass 3D-aware GAN step and a 3D-aware I2I translation step. In the first step, we propose two novel techniques: a new conditional architecture and an effective training strategy. In the second step, based on the well-trained multiclass 3D-aware GAN architecture, which preserves view consistency, we construct a 3D-aware I2I translation system. To further reduce view-consistency problems, we propose several new techniques, including a U-net-like adaptor network design, a hierarchical representation constraint and a relative regularization loss. In extensive experiments on two datasets, quantitative and qualitative results demonstrate that we successfully perform 3D-aware I2I translation with multi-view consistency. Code is available in 3DI2I.
Address Vancouver; Canada; June 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVPR
Notes LAMP Approved no
Call Number Admin @ si @ LWW2023b Serial 3920
Permanent link to this record
 

 
Author Hugo Bertiche; Niloy J Mitra; Kuldeep Kulkarni; Chun Hao Paul Huang; Tuanfeng Y Wang; Meysam Madadi; Sergio Escalera; Duygu Ceylan
Title Blowing in the Wind: CycleNet for Human Cinemagraphs from Still Images Type Conference Article
Year 2023 Publication 36th IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal
Volume Issue Pages 459-468
Keywords
Abstract Cinemagraphs are short looping videos created by adding subtle motions to a static image. This kind of media is popular and engaging. However, automatic generation of cinemagraphs is an underexplored area and current solutions require tedious low-level manual authoring by artists. In this paper, we present an automatic method that allows generating human cinemagraphs from single RGB images. We investigate the problem in the context of dressed humans under the wind. At the core of our method is a novel cyclic neural network that produces looping cinemagraphs for the target loop duration. To circumvent the problem of collecting real data, we demonstrate that it is possible, by working in the image normal space, to learn garment motion dynamics on synthetic data and generalize to real data. We evaluate our method on both synthetic and real data and demonstrate that it is possible to create compelling and plausible cinemagraphs from single RGB images.
Address Vancouver; Canada; June 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVPR
Notes HUPBA Approved no
Call Number Admin @ si @ BMK2023 Serial 3921
Permanent link to this record
 

 
Author Justine Giroux; Mohammad Reza Karimi Dastjerdi; Yannick Hold-Geoffroy; Javier Vazquez; Jean François Lalonde
Title Towards a Perceptual Evaluation Framework for Lighting Estimation Type Conference Article
Year 2024 Publication Arxiv Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Progress in lighting estimation is tracked by computing existing image quality assessment (IQA) metrics on images from standard datasets. While this may appear to be a reasonable approach, we demonstrate that doing so does not correlate with human preference when the estimated lighting is used to relight a virtual scene into a real photograph. To study this, we design a controlled psychophysical experiment where human observers must choose their preference amongst rendered scenes lit using a set of lighting estimation algorithms selected from the recent literature, and use it to analyse how these algorithms perform according to human perception. Then, we demonstrate that none of the most popular IQA metrics from the literature, taken individually, correctly represents human perception. Finally, we show that by learning a combination of existing IQA metrics, we can more accurately represent human preference. This provides a new perceptual framework to help evaluate future lighting estimation algorithms.
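As a hedged sketch of what learning a combination of existing IQA metrics could look like (the metric names, data and fitting procedure below are assumptions, not the authors' setup), one can fit logistic-regression weights over per-pair metric-score differences so that the combined score predicts human preference.

import numpy as np

def fit_metric_weights(score_diffs, preferences, lr=0.1, steps=500):
    """score_diffs: (n_pairs, n_metrics) metric-score differences between two renders.
    preferences: (n_pairs,) 1 if observers preferred the first render, else 0."""
    w = np.zeros(score_diffs.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-score_diffs @ w))             # predicted preference
        w += lr * score_diffs.T @ (preferences - p) / len(p)   # gradient ascent step
    return w

rng = np.random.default_rng(0)
diffs = rng.standard_normal((200, 3))                 # e.g. PSNR, SSIM, LPIPS differences
prefs = (diffs @ np.array([0.5, 1.0, -0.8]) > 0).astype(float)  # synthetic labels
print(fit_metric_weights(diffs, prefs))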
Address Seattle; USA; June 2024
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVPR
Notes MACO; CIC Approved no
Call Number Admin @ si @ GDH2024 Serial 3999
Permanent link to this record
 

 
Author Chen Zhang; Maria del Mar Vila Muñoz; Petia Radeva; Roberto Elosua; Maria Grau; Angels Betriu; Elvira Fernandez-Giraldez; Laura Igual
Title Carotid Artery Segmentation in Ultrasound Images Type Conference Article
Year 2015 Publication Computing and Visualization for Intravascular Imaging and Computer Assisted Stenting (CVII-STENT2015), Joint MICCAI Workshops Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Munich; Germany; October 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVII-STENT
Notes MILAB Approved no
Call Number Admin @ si @ ZVR2015 Serial 2675
Permanent link to this record
 

 
Author Francesco Ciompi; Oriol Pujol; Simone Balocco; Xavier Carrillo; J. Mauri; Petia Radeva
Title Automatic Key Frames Detection in Intravascular Ultrasound Sequences Type Conference Article
Year 2011 Publication MICCAI 2011 Workshop on Computing and Visualization for Intravascular Imaging Abbreviated Journal
Volume Issue Pages
Keywords
Abstract We present a method for the automatic detection of key frames in Intravascular Ultrasound (IVUS) sequences. The key frames are markers delimiting morphological changes along the vessel. The aim of defining key frames is two-fold: (1) they allow the content of the pullback to be summarized into a few representative frames; (2) they represent the basis for the automatic detection of clinical events in IVUS. The proposed approach achieved a compression ratio of 0.016 with respect to the original sequence and an average inter-frame distance of 61.76 frames, minimizing the number of missed clinical events.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVII
Notes MILAB;HuPBA Approved no
Call Number Admin @ si @ CPB2011 Serial 1767
Permanent link to this record
 

 
Author Josep Llados
Title Computer Vision: Progress of Research and Development Type Book Whole
Year 2006 Publication 1st CVC Internal Workshop Computer Vision: Progress of Research and Development Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor J. Llados
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 84-933652-8-9 Medium
Area Expedition Conference (down) CVCRD
Notes DAG Approved no
Call Number DAG @ dag @ Lla2006b Serial 766
Permanent link to this record
 

 
Author Robert Benavente; Laura Igual; Fernando Vilariño
Title Current Challenges in Computer Vision Type Book Whole
Year 2008 Publication Proceedings of the Third Internal Workshop Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-936529-0-6 Medium
Area Expedition Conference (down) CVCRD
Notes MILAB;CIC;SIAI Approved no
Call Number BCNPCL @ bcnpcl @ BIV2008 Serial 1110
Permanent link to this record
 

 
Author Ariel Amato; Angel Sappa; Alicia Fornes; Felipe Lumbreras; Josep Llados
Title Divide and Conquer: Atomizing and Parallelizing A Task in A Mobile Crowdsourcing Platform Type Conference Article
Year 2013 Publication 2nd International ACM Workshop on Crowdsourcing for Multimedia Abbreviated Journal
Volume Issue Pages 21-22
Keywords
Abstract In this paper we present some conclusions about the advantages of having an efficient task formulation when a crowdsourcing platform is used. In particular we show how the task atomization and distribution can help to obtain results in an efficient way. Our proposal is based on a recursive splitting of the original task into a set of smaller and simpler tasks. As a result both more accurate and faster solutions are obtained. Our evaluation is performed on a set of ancient documents that need to be digitized.
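The recursive splitting described above can be sketched as follows (an illustrative assumption about the scheme, not the platform's actual implementation): a task is halved until each piece falls below a size that a single worker can handle.

def atomize(task, max_size):
    """Split a task (here just a list of items) into sub-tasks of at most max_size items."""
    if len(task) <= max_size:
        return [task]
    mid = len(task) // 2
    return atomize(task[:mid], max_size) + atomize(task[mid:], max_size)

pages = list(range(10))             # e.g. pages of an ancient document to transcribe
print(atomize(pages, max_size=3))   # each sub-list can be dispatched to a different worker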
Address Barcelona; October 2013
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-4503-2396-3 Medium
Area Expedition Conference (down) CrowdMM
Notes ADAS; ISE; DAG; 600.054; 600.055; 600.045; 600.061; 602.006 Approved no
Call Number Admin @ si @ SLA2013 Serial 2335
Permanent link to this record
 

 
Author Naila Murray; Eduard Vazquez
Title Lacuna Restoration: How to choose a neutral colour? Type Conference Article
Year 2010 Publication Proceedings of The CREATE 2010 Conference Abbreviated Journal
Volume Issue Pages 248–252
Keywords
Abstract Painting restoration, which involves filling in material loss (called lacuna), is a complex process. Several standard techniques exist to tackle lacuna restoration, and this article focuses on those techniques that employ a “neutral” colour to mask the defect. Restoration experts often disagree on the choice of such a colour and, in fact, the concept of a neutral colour is controversial. We posit that a neutral colour is one that attracts relatively little visual attention for a specific lacuna. We conducted an eye-tracking experiment to compare two common neutral colour selection methods, specifically the most common local colour and the mean local colour. The results obtained demonstrate that the most common local colour triggers less visual attention in general. Notwithstanding, we observed instances in which the most common colour triggers a significant amount of attention when subjects spent time resolving their confusion about whether or not a lacuna was part of the painting.
Address Gjovik, Norway
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CREATE
Notes CIC Approved no
Call Number Admin @ si @ MuV2010 Serial 1297
Permanent link to this record
 

 
Author Marta Teres; Eduard Vazquez
Title Museums, spaces and museographical resources. Current state and proposals for a multidisciplinary framework to open new perspectives Type Conference Article
Year 2010 Publication Proceedings of The CREATE 2010 Conference Abbreviated Journal
Volume Issue Pages 319–323
Keywords
Abstract Two of the main aims of a museum are to communicate its heritage and to give enjoyment to its visitors. This communication can take place through the pieces themselves and the museographical resources, but also through the building, the interior design, the light and the colour. Art museums, in contrast with other museums, make little use of these additional resources. Such work necessarily requires a multidisciplinary point of view to obtain a holistic vision of all that a museum implies and to use all its potential as a tool of knowledge and culture for all visitors.
Address Gjovik, Norway
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CREATE
Notes Approved no
Call Number Admin @ si @ TeV2010 Serial 1298
Permanent link to this record
 

 
Author Eduard Vazquez; Ramon Baldrich
Title Non-supervised goodness measure for image segmentation Type Conference Article
Year 2010 Publication Proceedings of The CREATE 2010 Conference Abbreviated Journal
Volume Issue Pages 334–335
Keywords
Abstract
Address Gjovik, Norway
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CREATE
Notes CIC Approved no
Call Number CAT @ cat @ VaB2010 Serial 1299
Permanent link to this record