|   | 
Details
   web
Records
Author Ozge Mercanoglu Sincan; Julio C. S. Jacques Junior; Sergio Escalera; Hacer Yalim Keles
Title ChaLearn LAP Large Scale Signer Independent Isolated Sign Language Recognition Challenge: Design, Results and Future Research Type (down) Conference Article
Year 2021 Publication Conference on Computer Vision and Pattern Recognition Workshops Abbreviated Journal
Volume Issue Pages 3467-3476
Keywords
Abstract The performances of Sign Language Recognition (SLR) systems have improved considerably in recent years. However, several open challenges still need to be solved to allow SLR to be useful in practice. The research in the field is in its infancy in regards to the robustness of the models to a large diversity of signs and signers, and to fairness of the models to performers from different demographics. This work summarises the ChaLearn LAP Large Scale Signer Independent Isolated SLR Challenge, organised at CVPR 2021 with the goal of overcoming some of the aforementioned challenges. We analyse and discuss the challenge design, top winning solutions and suggestions for future research. The challenge attracted 132 participants in the RGB track and 59 in the RGB+Depth track, receiving more than 1.5K submissions in total. Participants were evaluated using a new large-scale multi-modal Turkish Sign Language (AUTSL) dataset, consisting of 226 sign labels and 36,302 isolated sign video samples performed by 43 different signers. Winning teams achieved more than 96% recognition rate, and their approaches benefited from pose/hand/face estimation, transfer learning, external data, fusion/ensemble of modalities and different strategies to model spatio-temporal information. However, methods still fail to distinguish among very similar signs, in particular those sharing similar hand trajectories.
Address Virtual; June 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPRW
Notes HuPBA; no proj Approved no
Call Number Admin @ si @ MJE2021 Serial 3560
Permanent link to this record
 

 
Author Marc Masana; Tinne Tuytelaars; Joost Van de Weijer
Title Ternary Feature Masks: zero-forgetting for task-incremental learning Type (down) Conference Article
Year 2021 Publication 34th IEEE Conference on Computer Vision and Pattern Recognition Workshops Abbreviated Journal
Volume Issue Pages 3565-3574
Keywords
Abstract We propose an approach without any forgetting to continual learning for the task-aware regime, where at inference the task-label is known. By using ternary masks we can upgrade a model to new tasks, reusing knowledge from previous tasks while not forgetting anything about them. Using masks prevents both catastrophic forgetting and backward transfer. We argue -- and show experimentally -- that avoiding the former largely compensates for the lack of the latter, which is rarely observed in practice. In contrast to earlier works, our masks are applied to the features (activations) of each layer instead of the weights. This considerably reduces the number of mask parameters for each new task; with more than three orders of magnitude for most networks. The encoding of the ternary masks into two bits per feature creates very little overhead to the network, avoiding scalability issues. To allow already learned features to adapt to the current task without changing the behavior of these features for previous tasks, we introduce task-specific feature normalization. Extensive experiments on several finegrained datasets and ImageNet show that our method outperforms current state-of-the-art while reducing memory overhead in comparison to weight-based approaches.
Address Virtual; June 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPRW
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ MTW2021 Serial 3565
Permanent link to this record
 

 
Author Sudeep Katakol; Luis Herranz; Fei Yang; Marta Mrak
Title DANICE: Domain adaptation without forgetting in neural image compression Type (down) Conference Article
Year 2021 Publication Conference on Computer Vision and Pattern Recognition Workshops Abbreviated Journal
Volume Issue Pages 1921-1925
Keywords
Abstract Neural image compression (NIC) is a new coding paradigm where coding capabilities are captured by deep models learned from data. This data-driven nature enables new potential functionalities. In this paper, we study the adaptability of codecs to custom domains of interest. We show that NIC codecs are transferable and that they can be adapted with relatively few target domain images. However, naive adaptation interferes with the solution optimized for the original source domain, resulting in forgetting the original coding capabilities in that domain, and may even break the compatibility with previously encoded bitstreams. Addressing these problems, we propose Codec Adaptation without Forgetting (CAwF), a framework that can avoid these problems by adding a small amount of custom parameters, where the source codec remains embedded and unchanged during the adaptation process. Experiments demonstrate its effectiveness and provide useful insights on the characteristics of catastrophic interference in NIC.
Address Virtual; June 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPRW
Notes LAMP; 600.120; 600.141; 601.379 Approved no
Call Number Admin @ si @ KHY2021 Serial 3568
Permanent link to this record
 

 
Author Fei Yang; Luis Herranz; Yongmei Cheng; Mikhail Mozerov
Title Slimmable compressive autoencoders for practical neural image compression Type (down) Conference Article
Year 2021 Publication 34th IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal
Volume Issue Pages 4996-5005
Keywords
Abstract Neural image compression leverages deep neural networks to outperform traditional image codecs in rate-distortion performance. However, the resulting models are also heavy, computationally demanding and generally optimized for a single rate, limiting their practical use. Focusing on practical image compression, we propose slimmable compressive autoencoders (SlimCAEs), where rate (R) and distortion (D) are jointly optimized for different capacities. Once trained, encoders and decoders can be executed at different capacities, leading to different rates and complexities. We show that a successful implementation of SlimCAEs requires suitable capacity-specific RD tradeoffs. Our experiments show that SlimCAEs are highly flexible models that provide excellent rate-distortion performance, variable rate, and dynamic adjustment of memory, computational cost and latency, thus addressing the main requirements of practical image compression.
Address Virtual; June 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPR
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ YHC2021 Serial 3569
Permanent link to this record
 

 
Author Arturo Fuentes; F. Javier Sanchez; Thomas Voncina; Jorge Bernal
Title LAMV: Learning to Predict Where Spectators Look in Live Music Performances Type (down) Conference Article
Year 2021 Publication 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications Abbreviated Journal
Volume 5 Issue Pages 500-507
Keywords
Abstract The advent of artificial intelligence has supposed an evolution on how different daily work tasks are performed. The analysis of cultural content has seen a huge boost by the development of computer-assisted methods that allows easy and transparent data access. In our case, we deal with the automation of the production of live shows, like music concerts, aiming to develop a system that can indicate the producer which camera to show based on what each of them is showing. In this context, we consider that is essential to understand where spectators look and what they are interested in so the computational method can learn from this information. The work that we present here shows the results of a first preliminary study in which we compare areas of interest defined by human beings and those indicated by an automatic system. Our system is based on the extraction of motion textures from dynamic Spatio-Temporal Volumes (STV) and then analyzing the patterns by means of texture analysis techniques. We validate our approach over several video sequences that have been labeled by 16 different experts. Our method is able to match those relevant areas identified by the experts, achieving recall scores higher than 80% when a distance of 80 pixels between method and ground truth is considered. Current performance shows promise when detecting abnormal peaks and movement trends.
Address Virtual; February 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference VISIGRAPP
Notes MV; ISE; 600.119; Approved no
Call Number Admin @ si @ FSV2021 Serial 3570
Permanent link to this record
 

 
Author Adria Molina; Pau Riba; Lluis Gomez; Oriol Ramos Terrades; Josep Llados
Title Date Estimation in the Wild of Scanned Historical Photos: An Image Retrieval Approach Type (down) Conference Article
Year 2021 Publication 16th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume 12822 Issue Pages 306-320
Keywords
Abstract This paper presents a novel method for date estimation of historical photographs from archival sources. The main contribution is to formulate the date estimation as a retrieval task, where given a query, the retrieved images are ranked in terms of the estimated date similarity. The closer are their embedded representations the closer are their dates. Contrary to the traditional models that design a neural network that learns a classifier or a regressor, we propose a learning objective based on the nDCG ranking metric. We have experimentally evaluated the performance of the method in two different tasks: date estimation and date-sensitive image retrieval, using the DEW public database, overcoming the baseline methods.
Address Lausanne; Suissa; September 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 600.121; 600.140; 110.312 Approved no
Call Number Admin @ si @ MRG2021b Serial 3571
Permanent link to this record
 

 
Author Pau Riba; Adria Molina; Lluis Gomez; Oriol Ramos Terrades; Josep Llados
Title Learning to Rank Words: Optimizing Ranking Metrics for Word Spotting Type (down) Conference Article
Year 2021 Publication 16th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume 12822 Issue Pages 381–395
Keywords
Abstract In this paper, we explore and evaluate the use of ranking-based objective functions for learning simultaneously a word string and a word image encoder. We consider retrieval frameworks in which the user expects a retrieval list ranked according to a defined relevance score. In the context of a word spotting problem, the relevance score has been set according to the string edit distance from the query string. We experimentally demonstrate the competitive performance of the proposed model on query-by-string word spotting for both, handwritten and real scene word images. We also provide the results for query-by-example word spotting, although it is not the main focus of this work.
Address Lausanne; Suissa; September 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 600.121; 600.140; 110.312 Approved no
Call Number Admin @ si @ RMG2021 Serial 3572
Permanent link to this record
 

 
Author Sanket Biswas; Pau Riba; Josep Llados; Umapada Pal
Title DocSynth: A Layout Guided Approach for Controllable Document Image Synthesis Type (down) Conference Article
Year 2021 Publication 16th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume 12823 Issue Pages 555–568
Keywords
Abstract Despite significant progress on current state-of-the-art image generation models, synthesis of document images containing multiple and complex object layouts is a challenging task. This paper presents a novel approach, called DocSynth, to automatically synthesize document images based on a given layout. In this work, given a spatial layout (bounding boxes with object categories) as a reference by the user, our proposed DocSynth model learns to generate a set of realistic document images consistent with the defined layout. Also, this framework has been adapted to this work as a superior baseline model for creating synthetic document image datasets for augmenting real data during training for document layout analysis tasks. Different sets of learning objectives have been also used to improve the model performance. Quantitatively, we also compare the generated results of our model with real data using standard evaluation metrics. The results highlight that our model can successfully generate realistic and diverse document images with multiple objects. We also present a comprehensive qualitative analysis summary of the different scopes of synthetic image generation tasks. Lastly, to our knowledge this is the first work of its kind.
Address Lausanne; Suissa; September 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.121; 600.140; 110.312 Approved no
Call Number Admin @ si @ BRL2021a Serial 3573
Permanent link to this record
 

 
Author Patricia Suarez; Angel Sappa; Boris X. Vintimilla; Riad I. Hammoud
Title Cycle Generative Adversarial Network: Towards A Low-Cost Vegetation Index Estimation Type (down) Conference Article
Year 2021 Publication 28th IEEE International Conference on Image Processing Abbreviated Journal
Volume Issue Pages 19-22
Keywords
Abstract This paper presents a novel unsupervised approach to estimate the Normalized Difference Vegetation Index (NDVI). The NDVI is obtained as the ratio between information from the visible and near infrared spectral bands; in the current work, the NDVI is estimated just from an image of the visible spectrum through a Cyclic Generative Adversarial Network (CyclicGAN). This unsupervised architecture learns to estimate the NDVI index by means of an image translation between the red channel of a given RGB image and the NDVI unpaired index’s image. The translation is obtained by means of a ResNET architecture and a multiple loss function. Experimental results obtained with this unsupervised scheme show the validity of the implemented model. Additionally, comparisons with the state of the art approaches are provided showing improvements with the proposed approach.
Address Anchorage-Alaska; USA; September 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICIP
Notes MSIAU; 600.130; 600.122; 601.349 Approved no
Call Number Admin @ si @ SSV2021b Serial 3579
Permanent link to this record
 

 
Author Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla; Sabari Nathan; Priya Kansal; Armin Mehri; Parichehr Behjati Ardakani; A.Dalal; A.Akula; D.Sharma; S.Pandey; B.Kumar; J.Yao; R.Wu; KFeng; N.Li; Y.Zhao; H.Patel; V. Chudasama; K.Pjajapati; A.Sarvaiya; K.Upla; K.Raja; R.Ramachandra; C.Bush; F.Almasri; T.Vandamme; O.Debeir; N.Gutierrez; Q.Nguyen; W.Beksi
Title Thermal Image Super-Resolution Challenge – PBVS 2021 Type (down) Conference Article
Year 2021 Publication Conference on Computer Vision and Pattern Recognition Workshops Abbreviated Journal
Volume Issue Pages 4359-4367
Keywords
Abstract This paper presents results from the second Thermal Image Super-Resolution (TISR) challenge organized in the framework of the Perception Beyond the Visible Spectrum (PBVS) 2021 workshop. For this second edition, the same thermal image dataset considered during the first challenge has been used; only mid-resolution (MR) and high-resolution (HR) sets have been considered. The dataset consists of 951 training images and 50 testing images for each resolution. A set of 20 images for each resolution is kept aside for evaluation. The two evaluation methodologies proposed for the first challenge are also considered in this opportunity. The first evaluation task consists of measuring the PSNR and SSIM between the obtained SR image and the corresponding ground truth (i.e., the HR thermal image downsampled by four). The second evaluation also consists of measuring the PSNR and SSIM, but in this case, considers the x2 SR obtained from the given MR thermal image; this evaluation is performed between the SR image with respect to the semi-registered HR image, which has been acquired with another camera. The results outperformed those from the first challenge, thus showing an improvement in both evaluation metrics.
Address Virtual; June 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPRW
Notes MSIAU; 600.130; 600.122 Approved no
Call Number Admin @ si @ RSV2021 Serial 3581
Permanent link to this record
 

 
Author Armin Mehri; Parichehr Behjati Ardakani; Angel Sappa
Title MPRNet: Multi-Path Residual Network for Lightweight Image Super Resolution Type (down) Conference Article
Year 2021 Publication IEEE Winter Conference on Applications of Computer Vision Abbreviated Journal
Volume Issue Pages 2703-2712
Keywords
Abstract Lightweight super resolution networks have extremely importance for real-world applications. In recent years several SR deep learning approaches with outstanding achievement have been introduced by sacrificing memory and computational cost. To overcome this problem, a novel lightweight super resolution network is proposed, which improves the SOTA performance in lightweight SR and performs roughly similar to computationally expensive networks. Multi-Path Residual Network designs with a set of Residual concatenation Blocks stacked with Adaptive Residual Blocks: ($i$) to adaptively extract informative features and learn more expressive spatial context information; ($ii$) to better leverage multi-level representations before up-sampling stage; and ($iii$) to allow an efficient information and gradient flow within the network. The proposed architecture also contains a new attention mechanism, Two-Fold Attention Module, to maximize the representation ability of the model. Extensive experiments show the superiority of our model against other SOTA SR approaches.
Address Virtual; January 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference WACV
Notes MSIAU; 600.130; 600.122 Approved no
Call Number Admin @ si @ MAS2021b Serial 3582
Permanent link to this record
 

 
Author Armin Mehri; Parichehr Behjati Ardakani; Angel Sappa
Title LiNet: A Lightweight Network for Image Super Resolution Type (down) Conference Article
Year 2021 Publication 25th International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 7196-7202
Keywords
Abstract This paper proposes a new lightweight network, LiNet, that enhancing technical efficiency in lightweight super resolution and operating approximately like very large and costly networks in terms of number of network parameters and operations. The proposed architecture allows the network to learn more abstract properties by avoiding low-level information via multiple links. LiNet introduces a Compact Dense Module, which contains set of inner and outer blocks, to efficiently extract meaningful information, to better leverage multi-level representations before upsampling stage, and to allow an efficient information and gradient flow within the network. Experiments on benchmark datasets show that the proposed LiNet achieves favorable performance against lightweight state-of-the-art methods.
Address Virtual; January 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MSIAU; 600.130; 600.122 Approved no
Call Number Admin @ si @ MAS2021a Serial 3583
Permanent link to this record
 

 
Author Bartlomiej Twardowski; Pawel Zawistowski; Szymon Zaborowski
Title Metric Learning for Session-Based Recommendations Type (down) Conference Article
Year 2021 Publication 43rd edition of the annual BCS-IRSG European Conference on Information Retrieval Abbreviated Journal
Volume 12656 Issue Pages 650-665
Keywords Session-based recommendations; Deep metric learning; Learning to rank
Abstract Session-based recommenders, used for making predictions out of users’ uninterrupted sequences of actions, are attractive for many applications. Here, for this task we propose using metric learning, where a common embedding space for sessions and items is created, and distance measures dissimilarity between the provided sequence of users’ events and the next action. We discuss and compare metric learning approaches to commonly used learning-to-rank methods, where some synergies exist. We propose a simple architecture for problem analysis and demonstrate that neither extensively big nor deep architectures are necessary in order to outperform existing methods. The experimental results against strong baselines on four datasets are provided with an ablation study.
Address Virtual; March 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECIR
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ TZZ2021 Serial 3586
Permanent link to this record
 

 
Author Albin Soutif; Marc Masana; Joost Van de Weijer; Bartlomiej Twardowski
Title On the importance of cross-task features for class-incremental learning Type (down) Conference Article
Year 2021 Publication Theory and Foundation of continual learning workshop of ICML Abbreviated Journal
Volume Issue Pages
Keywords
Abstract In class-incremental learning, an agent with limited resources needs to learn a sequence of classification tasks, forming an ever growing classification problem, with the constraint of not being able to access data from previous tasks. The main difference with task-incremental learning, where a task-ID is available at inference time, is that the learner also needs to perform crosstask discrimination, i.e. distinguish between classes that have not been seen together. Approaches to tackle this problem are numerous and mostly make use of an external memory (buffer) of non-negligible size. In this paper, we ablate the learning of crosstask features and study its influence on the performance of basic replay strategies used for class-IL. We also define a new forgetting measure for class-incremental learning, and see that forgetting is not the principal cause of low performance. Our experimental results show that future algorithms for class-incremental learning should not only prevent forgetting, but also aim to improve the quality of the cross-task features. This is especially important when the number of classes per task is small.
Address Virtual; July 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICMLW
Notes LAMP Approved no
Call Number Admin @ si @ SMW2021 Serial 3588
Permanent link to this record
 

 
Author Graham D. Finlayson; Javier Vazquez; Fufu Fang
Title The Discrete Cosine Maximum Ignorance Assumption Type (down) Conference Article
Year 2021 Publication 29th Color and Imaging Conference Abbreviated Journal
Volume Issue Pages 13-18
Keywords
Abstract the performance of colour correction algorithms are dependent on the reflectance sets used. Sometimes, when the testing reflectance set is changed the ranking of colour correction algorithms also changes. To remove dependence on dataset we can
make assumptions about the set of all possible reflectances. In the Maximum Ignorance with Positivity (MIP) assumption we assume that all reflectances with per wavelength values between 0 and 1 are equally likely. A weakness in the MIP is that it fails to take into account the correlation of reflectance functions between
wavelengths (many of the assumed reflectances are, in reality, not possible).
In this paper, we take the view that the maximum ignorance assumption has merit but, hitherto it has been calculated with respect to the wrong coordinate basis. Here, we propose the Discrete Cosine Maximum Ignorance assumption (DCMI), where
all reflectances that have coordinates between max and min bounds in the Discrete Cosine Basis coordinate system are equally likely.
Here, the correlation between wavelengths is encoded and this results in the set of all plausible reflectances ’looking like’ typical reflectances that occur in nature. This said the DCMI model is also a superset of all measured reflectance sets.
Experiments show that, in colour correction, adopting the DCMI results in similar colour correction performance as using a particular reflectance set.
Address Virtual; November 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CIC
Notes CIC Approved no
Call Number FVF2021 Serial 3596
Permanent link to this record