Bartlomiej Twardowski, Pawel Zawistowski, & Szymon Zaborowski. (2021). Metric Learning for Session-Based Recommendations. In 43rd edition of the annual BCS-IRSG European Conference on Information Retrieval (Vol. 12656, pp. 650–665). LNCS.
Abstract: Session-based recommenders, used for making predictions out of users’ uninterrupted sequences of actions, are attractive for many applications. Here, for this task we propose using metric learning, where a common embedding space for sessions and items is created, and distance measures dissimilarity between the provided sequence of users’ events and the next action. We discuss and compare metric learning approaches to commonly used learning-to-rank methods, where some synergies exist. We propose a simple architecture for problem analysis and demonstrate that neither extensively big nor deep architectures are necessary in order to outperform existing methods. The experimental results against strong baselines on four datasets are provided with an ablation study.
Keywords: Session-based recommendations; Deep metric learning; Learning to rank
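As a rough illustration of the idea in the abstract above (a shared embedding space in which distance measures dissimilarity between a session and a candidate next item), the following Python sketch ranks items by their distance to a session embedding. The function names, the Euclidean metric, and the triplet-style hinge loss are assumptions made for illustration; the abstract does not specify the paper's architecture or loss.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_items_by_distance(session_vec, item_vecs, top_k=3):
    """Return indices of the top_k items closest to the session embedding.

    A smaller distance is read as "more likely to be the next action".
    """
    order = sorted(range(len(item_vecs)),
                   key=lambda i: euclidean(session_vec, item_vecs[i]))
    return order[:top_k]

def triplet_hinge_loss(session_vec, pos_item, neg_item, margin=1.0):
    """Hypothetical training signal: pull the true next item closer to the
    session than a negative item by at least `margin`."""
    d_pos = euclidean(session_vec, pos_item)
    d_neg = euclidean(session_vec, neg_item)
    return max(0.0, d_pos - d_neg + margin)

# Toy embeddings (made up for this example).
items = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
session = [0.9, 0.9]
ranking = rank_items_by_distance(session, items)
```

In this toy setup the item at index 1 lies closest to the session vector, so it is ranked first.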
|
Zhengying Liu, Adrien Pavao, Zhen Xu, Sergio Escalera, Fabio Ferreira, Isabelle Guyon, et al. (2021). Winning Solutions and Post-Challenge Analyses of the ChaLearn AutoDL Challenge 2019. TPAMI - IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(9), 3108–3125.
Abstract: This paper reports the results and post-challenge analyses of ChaLearn's AutoDL challenge series, which helped sort out a profusion of AutoML solutions for Deep Learning (DL) that had been introduced in a variety of settings, but lacked fair comparisons. All input data modalities (time series, images, videos, text, tabular) were formatted as tensors and all tasks were multi-label classification problems. Code submissions were executed on hidden tasks, with limited time and computational resources, pushing solutions that get results quickly. In this setting, DL methods dominated, though popular Neural Architecture Search (NAS) was impractical. Solutions relied on fine-tuned pre-trained networks, with architectures matching data modality. Post-challenge tests did not reveal improvements beyond the imposed time limit. While no component is particularly original or novel, a high-level modular organization emerged featuring a “meta-learner”, “data ingestor”, “model selector”, “model/learner”, and “evaluator”. This modularity enabled ablation studies, which revealed the importance of (off-platform) meta-learning, ensembling, and efficient data management. Experiments on heterogeneous module combinations further confirm the (local) optimality of the winning solutions. Our challenge legacy includes an ever-lasting benchmark (http://autodl.chalearn.org), the open-sourced code of the winners, and a free “AutoDL self-service.”
|
Albin Soutif, Marc Masana, Joost Van de Weijer, & Bartlomiej Twardowski. (2021). On the importance of cross-task features for class-incremental learning. In Theory and Foundation of continual learning workshop of ICML.
Abstract: In class-incremental learning, an agent with limited resources needs to learn a sequence of classification tasks, forming an ever-growing classification problem, with the constraint of not being able to access data from previous tasks. The main difference with task-incremental learning, where a task-ID is available at inference time, is that the learner also needs to perform cross-task discrimination, i.e. distinguish between classes that have not been seen together. Approaches to tackle this problem are numerous and mostly make use of an external memory (buffer) of non-negligible size. In this paper, we ablate the learning of cross-task features and study its influence on the performance of basic replay strategies used for class-IL. We also define a new forgetting measure for class-incremental learning, and see that forgetting is not the principal cause of low performance. Our experimental results show that future algorithms for class-incremental learning should not only prevent forgetting, but also aim to improve the quality of the cross-task features. This is especially important when the number of classes per task is small.
|
Xim Cerda-Company, Olivier Penacchio, & Xavier Otazu. (2021). Chromatic Induction in Migraine. VISION, 37.
Abstract: The human visual system is not a colorimeter. The perceived colour of a region does not only depend on its colour spectrum, but also on the colour spectra and geometric arrangement of neighbouring regions, a phenomenon called chromatic induction. Chromatic induction is thought to be driven by lateral interactions: the activity of a central neuron is modified by stimuli outside its classical receptive field through excitatory–inhibitory mechanisms. As there is growing evidence of an excitation/inhibition imbalance in migraine, we compared chromatic induction in migraine and control groups. As hypothesised, we found a difference in the strength of induction between the two groups, with stronger induction effects in migraine. On the other hand, given the increased prevalence of visual phenomena in migraine with aura, we also hypothesised that the difference between migraine and control would be more important in migraine with aura than in migraine without aura. Our experiments did not support this hypothesis. Taken together, our results suggest a link between excitation/inhibition imbalance and increased induction effects.
Keywords: migraine; vision; colour; colour perception; chromatic induction; psychophysics
|
Sonia Baeza, R. Domingo, M. Salcedo, G. Moragas, J. Deportos, I. Garcia Olive, et al. (2021). Artificial Intelligence to Optimize Pulmonary Embolism Diagnosis During Covid-19 Pandemic by Perfusion SPECT/CT, a Pilot Study. American Journal of Respiratory and Critical Care Medicine.
|
Mireia Sole, Joan Blanco, Debora Gil, Oliver Valero, Alvaro Pascual, B. Cardenas, et al. (2021). Chromosomal positioning in spermatogenic cells is influenced by chromosomal factors associated with gene activity, bouquet formation, and meiotic sex-chromosome inactivation. Chromosoma, 130, 163–175.
Abstract: Chromosome territoriality is not random along the cell cycle and it is mainly governed by intrinsic chromosome factors and gene expression patterns. Conversely, very few studies have explored chromosome territoriality and its influencing factors during meiosis. In this study, we analysed chromosome positioning in murine spermatogenic cells using a three-dimensional fluorescence in situ hybridization-based methodology, which allows the analysis of the entire karyotype. The main objective of the study was to decipher chromosome positioning in a radial axis (all analysed germ-cell nuclei) and longitudinal axis (only spermatozoa) and to identify the chromosomal factors that regulate such an arrangement. Results demonstrated that the radial positioning of chromosomes during spermatogenesis was cell-type specific and influenced by chromosomal factors associated with gene activity. Chromosomes with specific features that enhance transcription (high GC content, high gene density and high numbers of predicted expressed genes) were preferentially observed in the inner part of the nucleus in virtually all cell types. Moreover, the position of the sex chromosomes was influenced by their transcriptional status, from the periphery of the nucleus when their activity was repressed (pachytene) to a more internal position when they were partially activated (spermatid). At pachytene, chromosome positioning was also influenced by chromosome size due to the bouquet formation. Longitudinal chromosome positioning in the sperm nucleus was not random either, suggesting the importance of ordered longitudinal positioning for the release and activation of the paternal genome after fertilisation.
|
Marta Ligero, Alonso Garcia Ruiz, Cristina Viaplana, Guillermo Villacampa, Maria V Raciti, Jaid Landa, et al. (2021). A CT-based radiomics signature is associated with response to immune checkpoint inhibitors in advanced solid tumors. Radiology, 299(1), 109–119.
Abstract: Background Reliable predictive imaging markers of response to immune checkpoint inhibitors are needed. Purpose To develop and validate a pretreatment CT-based radiomics signature to predict response to immune checkpoint inhibitors in advanced solid tumors. Materials and Methods In this retrospective study, a radiomics signature was developed in patients with advanced solid tumors (including breast, cervix, gastrointestinal) treated with anti-programmed cell death-1 or programmed cell death ligand-1 monotherapy from August 2012 to May 2018 (cohort 1). This was tested in patients with bladder and lung cancer (cohorts 2 and 3). Radiomics variables were extracted from all metastases delineated at pretreatment CT and selected by using an elastic-net model. A regression model combined radiomics and clinical variables with response as the end point. Biologic validation of the radiomics score with RNA profiling of cytotoxic cells (cohort 4) was assessed with Mann-Whitney analysis. Results The radiomics signature was developed in 85 patients (cohort 1: mean age, 58 years ± 13 [standard deviation]; 43 men) and tested on 46 patients (cohort 2: mean age, 70 years ± 12; 37 men) and 47 patients (cohort 3: mean age, 64 years ± 11; 40 men). Biologic validation was performed in a further cohort of 20 patients (cohort 4: mean age, 60 years ± 13; 14 men). The radiomics signature was associated with clinical response to immune checkpoint inhibitors (area under the curve [AUC], 0.70; 95% CI: 0.64, 0.77; P < .001). In cohorts 2 and 3, the AUC was 0.67 (95% CI: 0.58, 0.76) and 0.67 (95% CI: 0.56, 0.77; P < .001), respectively. A radiomics-clinical signature (including baseline albumin level and lymphocyte count) improved on radiomics-only performance (AUC, 0.74 [95% CI: 0.63, 0.84; P < .001]; Akaike information criterion, 107.00 and 109.90, respectively). 
Conclusion A pretreatment CT-based radiomics signature is associated with response to immune checkpoint inhibitors, likely reflecting the tumor immunophenotype.
|
Debora Gil, Oriol Ramos Terrades, & Raquel Perez. (2021). Topological Radiomics (TOPiomics): Early Detection of Genetic Abnormalities in Cancer Treatment Evolution. In Extended Abstracts GEOMVAP 2019, Trends in Mathematics 15 (Vol. 15, pp. 89–93). Springer Nature.
Abstract: Abnormalities in radiomic measures correlate to genomic alterations prone to alter the outcome of personalized anti-cancer treatments. TOPiomics is a new method for the early detection of variations in tumor imaging phenotype from a topological structure in multi-view radiomic spaces.
|
Trevor Canham, Javier Vazquez, Elise Mathieu, & Marcelo Bertalmío. (2021). Matching visual induction effects on screens of different size. JOV - Journal of Vision, 21(6(10)), 1–22.
Abstract: In the film industry, the same movie is expected to be watched on displays of vastly different sizes, from cinema screens to mobile phones. But visual induction, the perceptual phenomenon by which the appearance of a scene region is affected by its surroundings, will be different for the same image shown on two displays of different dimensions. This phenomenon presents a practical challenge for the preservation of the artistic intentions of filmmakers, because it can lead to shifts in image appearance between viewing destinations. In this work, we show that a neural field model based on the efficient representation principle is able to predict induction effects and how, by regularizing its associated energy functional, the model is still able to represent induction but is now invertible. From this finding, we propose a method to preprocess an image in a screen–size dependent way so that its perception, in terms of visual induction, may remain constant across displays of different size. The potential of the method is demonstrated through psychophysical experiments on synthetic images and qualitative examples on natural images.
|
Graham D. Finlayson, Javier Vazquez, & Fufu Fang. (2021). The Discrete Cosine Maximum Ignorance Assumption. In 29th Color and Imaging Conference (pp. 13–18).
Abstract: The performance of colour correction algorithms is dependent on the reflectance sets used. Sometimes, when the testing reflectance set is changed, the ranking of colour correction algorithms also changes. To remove dependence on dataset we can make assumptions about the set of all possible reflectances. In the Maximum Ignorance with Positivity (MIP) assumption we assume that all reflectances with per-wavelength values between 0 and 1 are equally likely. A weakness in the MIP is that it fails to take into account the correlation of reflectance functions between wavelengths (many of the assumed reflectances are, in reality, not possible). In this paper, we take the view that the maximum ignorance assumption has merit but, hitherto, it has been calculated with respect to the wrong coordinate basis. Here, we propose the Discrete Cosine Maximum Ignorance assumption (DCMI), where all reflectances that have coordinates between max and min bounds in the Discrete Cosine Basis coordinate system are equally likely. Here, the correlation between wavelengths is encoded and this results in the set of all plausible reflectances 'looking like' typical reflectances that occur in nature. This said, the DCMI model is also a superset of all measured reflectance sets. Experiments show that, in colour correction, adopting the DCMI results in colour correction performance similar to that obtained using a particular reflectance set.
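The contrast between the two assumptions described in this abstract can be sketched in a few lines of Python. Under MIP each wavelength sample is drawn independently and uniformly from [0, 1]; under DCMI the coordinates in a discrete cosine basis are drawn uniformly between per-coefficient bounds and the reflectance is reconstructed from them. The number of wavelength samples, the number of basis functions, and the bounds used below are hypothetical values chosen for illustration; the abstract does not state how the paper sets them.

```python
import math
import random

def dct_basis(n_samples, n_basis):
    """Orthonormal DCT-II basis, one row per basis function."""
    basis = []
    for k in range(n_basis):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n_samples)
        basis.append([scale * math.cos(math.pi * (i + 0.5) * k / n_samples)
                      for i in range(n_samples)])
    return basis

def sample_mip(n_samples, rng):
    """MIP: every wavelength value is independent and uniform on [0, 1]."""
    return [rng.random() for _ in range(n_samples)]

def sample_dcmi(basis, bounds, rng):
    """DCMI-style draw: uniform coefficients within (min, max) bounds,
    then a linear combination of the cosine basis functions."""
    coeffs = [rng.uniform(lo, hi) for lo, hi in bounds]
    n_samples = len(basis[0])
    return [sum(c * row[i] for c, row in zip(coeffs, basis))
            for i in range(n_samples)]

rng = random.Random(0)
n = 31                      # assumed: 31 wavelength samples
basis = dct_basis(n, 4)     # assumed: 4 low-frequency basis functions
# Hypothetical bounds: a positive mean term, smaller higher-frequency terms.
bounds = [(1.0, 4.0), (-1.0, 1.0), (-0.5, 0.5), (-0.25, 0.25)]
mip_reflectance = sample_mip(n, rng)
dcmi_reflectance = sample_dcmi(basis, bounds, rng)
```

Because only a few low-frequency cosine terms are combined, the DCMI-style sample varies smoothly across wavelengths, whereas the MIP sample is uncorrelated from one wavelength to the next.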
|
Razieh Rastgoo, Kourosh Kiani, Sergio Escalera, & Mohammad Sabokrou. (2021). Sign Language Production: A Review. In Conference on Computer Vision and Pattern Recognition Workshops (pp. 3472–3481).
Abstract: Sign Language is the dominant yet non-primary form of communication used in the deaf and hearing-impaired community. To enable easy, mutual communication between the hearing-impaired and the hearing communities, building a robust system capable of translating the spoken language into sign language and vice versa is fundamental. To this end, sign language recognition and production are two necessary parts for making such a two-way system. Sign language recognition and production need to cope with some critical challenges. In this survey, we review recent advances in Sign Language Production (SLP) and related areas using deep learning. This survey aims to briefly summarize recent achievements in SLP, discussing their advantages, limitations, and future directions of research.
|
Yaxing Wang, Hector Laria Mantecon, Joost Van de Weijer, Laura Lopez-Fuentes, & Bogdan Raducanu. (2021). TransferI2I: Transfer Learning for Image-to-Image Translation from Small Datasets. In 19th IEEE International Conference on Computer Vision (pp. 13990–13999).
Abstract: Image-to-image (I2I) translation has matured in recent years and is able to generate high-quality realistic images. However, despite current success, it still faces important challenges when applied to small domains. Existing methods use transfer learning for I2I translation, but they still require the learning of millions of parameters from scratch. This drawback severely limits their application on small domains. In this paper, we propose a new transfer learning method for I2I translation (TransferI2I). We decouple our learning process into the image generation step and the I2I translation step. In the first step we propose two novel techniques: source-target initialization and self-initialization of the adaptor layer. The former finetunes the pretrained generative model (e.g., StyleGAN) on source and target data. The latter allows all non-pretrained network parameters to be initialized without the need for any data. These techniques provide a better initialization for the I2I translation step. In addition, we introduce an auxiliary GAN that further facilitates the training of deep I2I systems even from small datasets. In extensive experiments on three datasets (Animal faces, Birds, and Foods), we show that we outperform existing methods and that mFID improves on several datasets by over 25 points.
|
Shiqi Yang, Yaxing Wang, Joost Van de Weijer, Luis Herranz, & Shangling Jui. (2021). Generalized Source-free Domain Adaptation. In 19th IEEE International Conference on Computer Vision (pp. 8958–8967).
Abstract: Domain adaptation (DA) aims to transfer the knowledge learned from a source domain to an unlabeled target domain. Some recent works tackle source-free domain adaptation (SFDA) where only a source pre-trained model is available for adaptation to the target domain. However, those methods do not consider keeping source performance, which is of high practical value in real-world applications. In this paper, we propose a new domain adaptation paradigm called Generalized Source-free Domain Adaptation (G-SFDA), where the learned model needs to perform well on both the target and source domains, with only access to current unlabeled target data during adaptation. First, we propose local structure clustering (LSC), aiming to cluster the target features with their semantically similar neighbors, which successfully adapts the model to the target domain in the absence of source data. Second, we propose sparse domain attention (SDA), which produces a binary domain-specific attention to activate different feature channels for different domains; the domain attention is also used to regularize the gradient during adaptation to keep source information. In the experiments, for target performance our method is on par with or better than existing DA and SFDA methods; specifically, it achieves state-of-the-art performance (85.4%) on VisDA, and our method works well for all domains after adapting to single or multiple target domains.
|
Hugo Bertiche, Meysam Madadi, Emilio Tylson, & Sergio Escalera. (2021). DeePSD: Automatic Deep Skinning And Pose Space Deformation For 3D Garment Animation. In 19th IEEE International Conference on Computer Vision (pp. 5471–5480).
Abstract: We present a novel solution to the garment animation problem through deep learning. Our contribution allows animating any template outfit with arbitrary topology and geometric complexity. Recent works develop models for garment editing, resizing and animation at the same time by leveraging the support body model (encoding garments as body homotopies). This leads to complex engineering solutions that suffer from limited scalability, applicability and compatibility. By limiting our scope to garment animation only, we are able to propose a simple model that can animate any outfit, independently of its topology, vertex order or connectivity. Our proposed architecture maps outfits to animated 3D models in the standard format for 3D animation (blend weights and blend shapes matrices), automatically providing compatibility with any graphics engine. We also propose a methodology to complement supervised learning with an unsupervised physically based learning that implicitly solves collisions and enhances cloth quality.
|
Edgar Riba. (2021). Geometric Computer Vision Techniques for Scene Reconstruction (Daniel Ponsa, Ed.). Ph.D. thesis.
Abstract: From the early stages of Computer Vision, scene reconstruction has been one of the most studied topics, leading to a wide variety of new discoveries and applications. Object grasping and manipulation, localization and mapping, or even visual effect generation are different examples of applications in which scene reconstruction has taken an important role for industries such as robotics, factory automation, or audio visual production. However, scene reconstruction is an extensive topic that can be approached in many different ways, with already existing solutions that effectively work in controlled environments. Formally, the problem of scene reconstruction can be formulated as a sequence of independent processes which compose a pipeline. In this thesis, we analyse some parts of the reconstruction pipeline, to which we contribute novel methods using Convolutional Neural Networks (CNN), proposing innovative solutions that consider the optimisation of the methods in an end-to-end fashion. First, we review the state of the art of classical local feature detectors and descriptors and contribute two novel methods that improve pre-existing solutions in the scene reconstruction pipeline.
Computer science and software engineering are two fields that usually go hand in hand and evolve according to mutual needs, making the design of complex and efficient algorithms easier. For this reason, we contribute Kornia, a library specifically designed to work with classical computer vision techniques along with deep neural networks. In essence, we created a framework that eases the design of complex pipelines for computer vision algorithms so that they can be included within neural networks and be used to backpropagate gradients through a common optimisation framework. Finally, in the last chapter of this thesis we develop the aforementioned concept of designing end-to-end systems with classical projective geometry. Thus, we contribute a solution to the problem of synthetic view generation by hallucinating novel views of highly deformable cloth objects using a geometry-aware end-to-end system. To summarize, in this thesis we demonstrate that a proper design that combines classical geometric computer vision methods with deep learning techniques can improve pre-existing solutions for the problem of scene reconstruction.
|