Records |
Author |
Mariona Caros; Maite Garolera; Petia Radeva; Xavier Giro |
Title |
Automatic Reminiscence Therapy for Dementia |
Type |
Conference Article |
Year |
2020 |
Publication |
10th ACM International Conference on Multimedia Retrieval |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
383-387 |
Keywords |
|
Abstract |
With people living longer than ever, the number of dementia cases, such as Alzheimer's disease, increases steadily. Dementia affects more than 46 million people worldwide, and it is estimated that by 2050 more than 100 million will be affected. While there are no effective treatments for these terminal diseases, therapies such as reminiscence, which stimulate memories from the past, are recommended. Currently, reminiscence therapy takes place in care homes and is guided by a therapist or a carer. In this work, we present an AI-based solution to automate reminiscence therapy, consisting of a dialogue system that uses photos as input to generate questions. We ran a usability case study with patients diagnosed with mild cognitive impairment, which showed that they found the system very entertaining and challenging. Overall, this paper presents how reminiscence therapy can be automated using machine learning and deployed to smartphones and laptops, making the therapy more accessible to every person affected by dementia. |
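The core of the system the abstract describes is a model that takes a photo and generates a question about it. Below is a minimal encoder-decoder sketch of that idea; the backbone, vocabulary size, and decoding setup are illustrative assumptions, not the authors' implementation.

```python
# A minimal photo-to-question generator sketch (assumed architecture).
import torch
import torch.nn as nn
import torchvision.models as models

class QuestionGenerator(nn.Module):
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        backbone = models.resnet18(weights=None)   # image encoder
        backbone.fc = nn.Identity()                # keep the 512-d features
        self.encoder = backbone
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.init_h = nn.Linear(512, hidden_dim)   # image feature -> LSTM state
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, images, tokens):
        feats = self.encoder(images)               # (B, 512)
        h0 = torch.tanh(self.init_h(feats)).unsqueeze(0)
        c0 = torch.zeros_like(h0)
        emb = self.embed(tokens)                   # (B, T, E)
        hidden, _ = self.lstm(emb, (h0, c0))
        return self.out(hidden)                    # (B, T, vocab) logits

model = QuestionGenerator(vocab_size=1000)
logits = model(torch.randn(2, 3, 224, 224), torch.randint(0, 1000, (2, 12)))
print(logits.shape)  # torch.Size([2, 12, 1000])
```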
Address |
Virtual; October 2020 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICMR |
Notes |
|
Approved |
no |
Call Number |
Admin @ si @ CGR2020 |
Serial |
3529 |
Permanent link to this record |
|
|
|
Author |
Kai Wang; Luis Herranz; Anjan Dutta; Joost Van de Weijer |
Title |
Bookworm continual learning: beyond zero-shot learning and continual learning |
Type |
Conference Article |
Year |
2020 |
Publication |
Workshop TASK-CV 2020 |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
We propose bookworm continual learning (BCL), a flexible setting where unseen classes can be inferred via a semantic model and the visual model can be updated continually. BCL thus generalizes both continual learning (CL) and zero-shot learning (ZSL). We also propose the bidirectional imagination (BImag) framework to address BCL, in which features of both past and future classes are generated. We observe that conditioning the feature generator on attributes can actually harm continual learning ability, and propose two variants (joint class-attribute conditioning and asymmetric generation) to alleviate this problem. |
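To make the conditioning question concrete, here is a minimal sketch of a feature generator with the joint class-attribute conditioning variant named in the abstract: noise, a learned class code, and attributes are fused before generation. All dimensions and module choices are assumptions for illustration.

```python
# Sketch of joint class-attribute conditioned feature generation (assumed sizes).
import torch
import torch.nn as nn

class JointConditionedGenerator(nn.Module):
    """Generates visual features from noise + class embedding + attributes."""
    def __init__(self, n_classes, attr_dim, noise_dim=64, feat_dim=2048):
        super().__init__()
        self.class_emb = nn.Embedding(n_classes, 128)  # learned class code
        self.net = nn.Sequential(
            nn.Linear(noise_dim + 128 + attr_dim, 1024),
            nn.LeakyReLU(0.2),
            nn.Linear(1024, feat_dim),
            nn.ReLU(),                                 # features are non-negative
        )

    def forward(self, z, labels, attributes):
        cond = torch.cat([z, self.class_emb(labels), attributes], dim=1)
        return self.net(cond)

g = JointConditionedGenerator(n_classes=50, attr_dim=85)
feats = g(torch.randn(4, 64), torch.randint(0, 50, (4,)), torch.rand(4, 85))
print(feats.shape)  # torch.Size([4, 2048])
```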
Address |
Virtual; August 2020 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ECCVW |
Notes |
LAMP; 600.141; 600.120 |
Approved |
no |
Call Number |
Admin @ si @ WHD2020 |
Serial |
3466 |
Permanent link to this record |
|
|
|
Author |
Zhengying Liu; Zhen Xu; Sergio Escalera; Isabelle Guyon; Julio C. S. Jacques Junior; Meysam Madadi; Adrien Pavao; Sebastien Treguer; Wei-Wei Tu |
Title |
Towards automated computer vision: analysis of the AutoCV challenges 2019 |
Type |
Journal Article |
Year |
2020 |
Publication |
Pattern Recognition Letters |
Abbreviated Journal |
PRL |
Volume |
135 |
Issue |
|
Pages |
196-203 |
Keywords |
Computer vision; AutoML; Deep learning |
Abstract |
We present the results of recent challenges in Automated Computer Vision (AutoCV, renamed here for clarity AutoCV1 and AutoCV2, 2019), which are part of a series of challenges on Automated Deep Learning (AutoDL). These two competitions aim at searching for fully automated solutions for classification tasks in computer vision, with an emphasis on any-time performance. The first competition was limited to image classification, while the second one included both images and videos. Our design required participants to submit their code to a challenge platform for blind testing on five datasets, both for training and testing, without any human intervention whatsoever. Winning solutions adopted deep learning techniques based on already published architectures, such as AutoAugment, MobileNet and ResNet, to reach state-of-the-art performance within the time budget of the challenge (only 20 minutes of GPU time). The novel contributions include strategies to deliver good preliminary results at any time during the learning process, such that a method can be stopped early and still deliver good performance. This feature is key for the adoption of such techniques by data analysts who wish to obtain preliminary results on large datasets rapidly and to speed up the development process. The soundness of our design was verified in several aspects: (1) little overfitting to the on-line leaderboard providing feedback on 5 development datasets was observed, compared to the final blind testing on the 5 (separate) final test datasets, suggesting that winning solutions might generalize to other computer vision classification tasks; (2) error bars on the winners’ performance allow us to say with confidence that they performed significantly better than the baseline solutions we provided; (3) the ranking of participants according to the any-time metric we designed, namely the Area under the Learning Curve, differed from that of the fixed-time metric, i.e. AUC at the end of the fixed time budget. We released all winning solutions under open-source licenses. At the end of the AutoDL challenge series, all challenge data will be made publicly available, thus providing a collection of uniformly formatted datasets that can serve to conduct further research, particularly on meta-learning. |
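The contrast between the fixed-time metric (the score at the end of the budget) and the any-time Area under the Learning Curve can be made concrete in a few lines of NumPy. This is a plain trapezoidal version for illustration, not the exact challenge normalization.

```python
# Sketch of the any-time metric vs. the fixed-time metric (illustrative math).
import numpy as np

def area_under_learning_curve(timestamps, scores, budget):
    """Integrate score(t) over [0, budget] and normalize by the budget."""
    t = np.clip(np.asarray(timestamps, dtype=float), 0.0, budget)
    s = np.asarray(scores, dtype=float)
    t = np.concatenate([[0.0], t, [budget]])  # score is 0 before the first
    s = np.concatenate([[0.0], s, [s[-1]]])   # prediction, then held constant
    return float(np.sum((s[1:] + s[:-1]) * np.diff(t)) / 2.0) / budget

t = [60, 300, 600, 1100]        # seconds at which predictions were made
s = [0.40, 0.62, 0.70, 0.72]    # evaluation score after each prediction
print(area_under_learning_curve(t, s, budget=1200))  # any-time metric
print(s[-1])                                         # fixed-time metric
```

A method that reaches a decent score early scores well on the any-time metric even if its final score is unchanged, which is exactly why the two rankings can differ.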
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
HuPBA; no proj |
Approved |
no |
Call Number |
Admin @ si @ LXE2020 |
Serial |
3427 |
Permanent link to this record |
|
|
|
Author |
Zhengying Liu; Zhen Xu; Shangeth Rajaa; Meysam Madadi; Julio C. S. Jacques Junior; Sergio Escalera; Adrien Pavao; Sebastien Treguer; Wei-Wei Tu; Isabelle Guyon |
Title |
Towards Automated Deep Learning: Analysis of the AutoDL challenge series 2019 |
Type |
Conference Article |
Year |
2020 |
Publication |
Proceedings of Machine Learning Research |
Abbreviated Journal |
|
Volume |
123 |
Issue |
|
Pages |
242-252 |
Keywords |
|
Abstract |
We present the design and results of recent competitions in Automated Deep Learning (AutoDL). In the AutoDL challenge series 2019, we organized 5 machine learning challenges: AutoCV, AutoCV2, AutoNLP, AutoSpeech and AutoDL. The first four challenges each concern a specific application domain, such as computer vision, natural language processing and speech recognition. As of March 2020, the last challenge, AutoDL, was still ongoing, and we only present its design. Some highlights of this work include: (1) a benchmark suite of baseline AutoML solutions, with emphasis on domains for which Deep Learning methods have had prior success (image, video, text, speech, etc.); (2) a novel any-time learning framework, which opens doors for further theoretical consideration; (3) a repository of around 100 datasets (from all the above domains), over half of which are released as public datasets to enable research on meta-learning; (4) analyses revealing that winning solutions generalize to new unseen datasets, validating progress towards a universal AutoML solution; (5) open-sourcing of the challenge platform, the starting kit, the dataset formatting toolkit, and all winning solutions (all information available at autodl.chalearn.org). |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
NEURIPS |
Notes |
HUPBA |
Approved |
no |
Call Number |
Admin @ si @ LXR2020 |
Serial |
3500 |
Permanent link to this record |
|
|
|
Author |
Hassan Ahmed Sial; Ramon Baldrich; Maria Vanrell; Dimitris Samaras |
Title |
Light Direction and Color Estimation from Single Image with Deep Regression |
Type |
Conference Article |
Year |
2020 |
Publication |
London Imaging Conference |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
We present a method to estimate the direction and color of the scene light source from a single image. Our method is based on two main ideas: (a) we use a new synthetic dataset with strong shadow effects and constraints similar to those of the SID dataset; (b) we define a deep architecture trained on this dataset to estimate the direction and color of the scene light source. Apart from showing good performance on synthetic images, we additionally propose a preliminary procedure to obtain light positions for the Multi-Illumination dataset, and in this way we also show that our trained model performs well when applied to real scenes. |
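A regression head for this task has two outputs with different geometry: a light direction constrained to the unit sphere and an RGB color. Below is a minimal sketch of such a head and a plausible loss; the stand-in backbone, loss weighting, and output parametrization are assumptions, not the paper's architecture.

```python
# Sketch of a deep regression head for light direction + color (assumed design).
import torch
import torch.nn as nn
import torch.nn.functional as F

class LightRegressor(nn.Module):
    def __init__(self, feat_dim=512):
        super().__init__()
        self.backbone = nn.Sequential(               # stand-in encoder
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim), nn.ReLU())
        self.dir_head = nn.Linear(feat_dim, 3)       # light direction
        self.col_head = nn.Linear(feat_dim, 3)       # light RGB color

    def forward(self, x):
        f = self.backbone(x)
        direction = F.normalize(self.dir_head(f), dim=1)  # force unit norm
        color = torch.sigmoid(self.col_head(f))           # RGB in [0, 1]
        return direction, color

def light_loss(pred_dir, pred_col, gt_dir, gt_col):
    angular = (1 - (pred_dir * gt_dir).sum(dim=1)).mean()  # cosine error
    return angular + F.mse_loss(pred_col, gt_col)

model = LightRegressor()
d, c = model(torch.randn(2, 3, 128, 128))
print(d.shape, c.shape)
```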
Address |
Virtual; September 2020 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
LIM |
Notes |
CIC; 600.118; 600.140; |
Approved |
no |
Call Number |
Admin @ si @ SBV2020 |
Serial |
3460 |
Permanent link to this record |
|
|
|
Author |
Klara Janousckova; Jiri Matas; Lluis Gomez; Dimosthenis Karatzas |
Title |
Text Recognition – Real World Data and Where to Find Them |
Type |
Conference Article |
Year |
2020 |
Publication |
25th International Conference on Pattern Recognition |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
4489-4496 |
Keywords |
|
Abstract |
We present a method for exploiting weakly annotated images to improve text extraction pipelines. The approach uses an arbitrary end-to-end text recognition system to obtain text region proposals and their, possibly erroneous, transcriptions. The method includes matching of imprecise transcriptions to weak annotations and an edit distance guided neighbourhood search. It produces nearly error-free, localised instances of scene text, which we treat as “pseudo ground truth” (PGT). The method is applied to two weakly-annotated datasets. Training with the extracted PGT consistently improves the accuracy of a state-of-the-art recognition model, by 3.7% on average across different benchmark datasets (image domains), and by 24.5% on one of the weakly annotated datasets. Acknowledgements: the authors were supported by Czech Technical University student grant SGS20/171/0HK3/3TJ13, the MEYS VVV project CZ.02.1.01/0.010.0J16 019/0000765 Research Center for Informatics, the Spanish Research project TIN2017-89779-P and the CERCA Programme / Generalitat de Catalunya. |
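The matching step the abstract mentions pairs a possibly erroneous transcription with the closest weak annotation by edit distance. Here is a minimal sketch of that idea with a classic Levenshtein implementation; the relative-distance acceptance threshold is an illustrative assumption.

```python
# Sketch of matching noisy transcriptions to weak annotations by edit distance.
def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (ca != cb))) # substitution
        prev = cur
    return prev[-1]

def match_to_annotations(transcription, annotations, max_relative_dist=0.2):
    """Return the closest weak annotation, or None if nothing is close enough."""
    best = min(annotations, key=lambda w: edit_distance(transcription, w))
    if edit_distance(transcription, best) <= max_relative_dist * len(best):
        return best
    return None

print(match_to_annotations("STARBUKS", ["STARBUCKS", "COFFEE", "EXIT"]))
# -> STARBUCKS (one substitution away, within the threshold)
```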
Address |
Virtual; January 2021 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICPR |
Notes |
DAG; 600.121; 600.129 |
Approved |
no |
Call Number |
Admin @ si @ JMG2020 |
Serial |
3557 |
Permanent link to this record |
|
|
|
Author |
David Berga; Marc Masana; Joost Van de Weijer |
Title |
Disentanglement of Color and Shape Representations for Continual Learning |
Type |
Conference Article |
Year |
2020 |
Publication |
ICML Workshop on Continual Learning |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
We hypothesize that disentangled feature representations suffer less from catastrophic forgetting. As a case study, we perform explicit disentanglement of color and shape by adjusting the network architecture. We test classification accuracy and forgetting in a task-incremental setting with the Oxford-102 Flowers dataset. We combine our method with Elastic Weight Consolidation, Learning without Forgetting, Synaptic Intelligence and Memory Aware Synapses, and show that feature disentanglement positively impacts continual learning performance. |
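One simple way to disentangle color and shape by architecture is to give each branch an input from which the other factor has been removed: a grayscale copy for shape, a heavily downsampled color copy for color. The sketch below illustrates that idea; branch sizes and the exact input transforms are assumptions, not the paper's design.

```python
# Sketch of architectural color/shape disentanglement (assumed branches).
import torch
import torch.nn as nn
import torch.nn.functional as F

class DisentangledNet(nn.Module):
    def __init__(self, n_classes):
        super().__init__()
        self.shape = nn.Sequential(                 # grayscale -> shape code
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.color = nn.Sequential(                 # low-res color -> color code
            nn.Flatten(), nn.Linear(3 * 8 * 8, 64), nn.ReLU())
        self.cls = nn.Linear(64 + 64, n_classes)

    def forward(self, x):
        gray = x.mean(dim=1, keepdim=True)          # discard color for shape
        tiny = F.interpolate(x, size=(8, 8))        # discard shape for color
        return self.cls(torch.cat([self.shape(gray), self.color(tiny)], dim=1))

net = DisentangledNet(n_classes=102)                # Oxford-102 Flowers
print(net(torch.randn(2, 3, 224, 224)).shape)       # torch.Size([2, 102])
```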
Address |
Virtual; July 2020 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICMLW |
Notes |
LAMP; 600.120 |
Approved |
no |
Call Number |
Admin @ si @ BMW2020 |
Serial |
3506 |
Permanent link to this record |
|
|
|
Author |
Fei Yang; Luis Herranz; Joost Van de Weijer; Jose Antonio Iglesias; Antonio Lopez; Mikhail Mozerov |
Title |
Variable Rate Deep Image Compression with Modulated Autoencoder |
Type |
Journal Article |
Year |
2020 |
Publication |
IEEE Signal Processing Letters |
Abbreviated Journal |
SPL |
Volume |
27 |
Issue |
|
Pages |
331-335 |
Keywords |
|
Abstract |
Variable rate is a requirement for flexible and adaptable image and video compression. However, deep image compression (DIC) methods are optimized for a single fixed rate-distortion (R-D) tradeoff. While this can be addressed by training multiple models for different tradeoffs, the memory requirements increase proportionally to the number of models. Scaling the bottleneck representation of a shared autoencoder can provide variable rate compression with a single shared autoencoder. However, the R-D performance of this simple mechanism degrades at low bitrates, and the effective range of bitrates shrinks. To address these limitations, we formulate the problem of variable R-D optimization for DIC and propose modulated autoencoders (MAEs), in which the representations of a shared autoencoder are adapted to the specific R-D tradeoff via a modulation network. Jointly training this modulated autoencoder and the modulation network provides an effective way to navigate the R-D operational curve. Our experiments show that the proposed method can achieve almost the same R-D performance as independent models, with significantly fewer parameters. |
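The modulation idea can be sketched as a tiny network that maps the target tradeoff λ to per-channel scaling factors applied to a shared encoder's bottleneck, so one set of autoencoder weights serves many bitrates. Layer sizes and the scaling parametrization below are illustrative assumptions.

```python
# Sketch of a modulated encoder: one shared autoencoder, many R-D tradeoffs.
import torch
import torch.nn as nn

class ModulatedEncoder(nn.Module):
    def __init__(self, channels=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, channels, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(channels, channels, 5, stride=2, padding=2))
        self.modulation = nn.Sequential(            # lambda -> channel scales
            nn.Linear(1, 64), nn.ReLU(),
            nn.Linear(64, channels), nn.Softplus()) # keep scales positive

    def forward(self, x, lam):
        y = self.encoder(x)                         # shared representation
        m = self.modulation(lam).unsqueeze(-1).unsqueeze(-1)
        return y * m                                # adapted to the tradeoff

enc = ModulatedEncoder()
y = enc(torch.randn(1, 3, 64, 64), torch.tensor([[0.01]]))
print(y.shape)  # torch.Size([1, 128, 16, 16])
```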
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
LAMP; ADAS; 600.141; 600.120; 600.118 |
Approved |
no |
Call Number |
Admin @ si @ YHW2020 |
Serial |
3346 |
Permanent link to this record |
|
|
|
Author |
Aymen Azaza; Joost Van de Weijer; Ali Douik; Javad Zolfaghari Bengar; Marc Masana |
Title |
Saliency from High-Level Semantic Image Features |
Type |
Journal Article |
Year |
2020 |
Publication |
SN Computer Science |
Abbreviated Journal |
SN |
Volume |
1 |
Issue |
4 |
Pages |
1-12 |
Keywords |
|
Abstract |
Top-down semantic information is known to play an important role in assigning saliency. Recently, large strides have been made in improving state-of-the-art semantic image understanding in the fields of object detection and semantic segmentation. Since these methods have now reached a high level of maturity, it is feasible to evaluate the impact of high-level image understanding on saliency estimation. We propose several saliency features computed from object detection and semantic segmentation results. We combine these features with a standard baseline method for saliency detection to evaluate their importance. Experiments demonstrate that the proposed features derived from object detection and semantic segmentation improve saliency estimation significantly. Moreover, our method obtains state-of-the-art results on three datasets (FT, ImgSal, and SOD) and competitive results on four others (ECSSD, PASCAL-S, MSRA-B, and HKU-IS). |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
LAMP; 600.120; 600.109; 600.106 |
Approved |
no |
Call Number |
Admin @ si @ AWD2020 |
Serial |
3503 |
Permanent link to this record |
|
|
|
Author |
Khalid El Asnaoui; Petia Radeva |
Title |
Automatically Assess Day Similarity Using Visual Lifelogs |
Type |
Journal Article |
Year |
2020 |
Publication |
International Journal of Intelligent Systems |
Abbreviated Journal |
IJIS |
Volume |
29 |
Issue |
|
Pages |
298–310 |
Keywords |
|
Abstract |
Today, we witness the appearance of many lifelogging cameras that are able to capture the life of the person wearing them and produce a large number of images every day. Automatically characterizing the experience and extracting patterns of behavior of individuals from this huge collection of unlabeled and unstructured egocentric data presents major challenges and requires novel and efficient algorithmic solutions. The main goal of this work is to propose a new method to automatically assess day similarity from the lifelogging images of a person. We propose a technique to measure the similarity between images based on Swain's distance and generalize it to detect the similarity between daily visual data. To this purpose, we apply dynamic time warping (DTW) combined with Swain's distance for the final day similarity estimation. For validation, we apply our technique to the Egocentric Dataset of the University of Barcelona (EDUB), consisting of 4912 daily images acquired by four persons, with preliminary encouraging results. |
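Both ingredients named in the abstract are standard and easy to sketch: Swain and Ballard's histogram intersection as the image-level distance, wrapped in dynamic time warping to align two days of unequal length. The bin count and random data below are illustrative assumptions.

```python
# Sketch of day similarity: histogram intersection (Swain) inside DTW.
import numpy as np

def swain_distance(h1, h2):
    """1 - histogram intersection of two L1-normalized histograms."""
    return 1.0 - np.minimum(h1, h2).sum()

def dtw(day_a, day_b):
    """DTW cost between two sequences of per-image histograms."""
    n, m = len(day_a), len(day_b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = swain_distance(day_a[i - 1], day_b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1],
                                 cost[i - 1, j - 1])
    return cost[n, m]

rng = np.random.default_rng(0)
def fake_day(k):                        # k images, 64-bin color histograms
    h = rng.random((k, 64))
    return h / h.sum(axis=1, keepdims=True)

print(dtw(fake_day(30), fake_day(42)))  # lower cost = more similar days
```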
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
MILAB; no proj |
Approved |
no |
Call Number |
AsR2020 |
Serial |
3409 |
Permanent link to this record |
|
|
|
Author |
Tomas Sixta; Julio C. S. Jacques Junior; Pau Buch Cardona; Eduard Vazquez; Sergio Escalera |
Title |
FairFace Challenge at ECCV 2020: Analyzing Bias in Face Recognition |
Type |
Conference Article |
Year |
2020 |
Publication |
ECCV Workshops |
Abbreviated Journal |
|
Volume |
12540 |
Issue |
|
Pages |
463-481 |
Keywords |
|
Abstract |
This work summarizes the 2020 ChaLearn Looking at People Fair Face Recognition and Analysis Challenge and provides a description of the top-winning solutions and an analysis of the results. The aim of the challenge was to evaluate the accuracy and the gender and skin-colour bias of submitted algorithms on the task of 1:1 face verification in the presence of other confounding attributes. Participants were evaluated using an in-the-wild dataset based on a reannotated version of IJB-C, further enriched with 12.5K new images and additional labels. The dataset is not balanced, which simulates a real-world scenario where AI-based models supposed to present fair outcomes are trained and evaluated on imbalanced data. The challenge attracted 151 participants, who made more than 1.8K submissions in total. The final phase of the challenge attracted 36 active teams, 10 of which exceeded 0.999 AUC-ROC while achieving very low scores on the proposed bias metrics. Common strategies among the participants were face pre-processing, homogenization of data distributions, the use of bias-aware loss functions, and ensemble models. The analysis of the top-10 teams shows higher false positive rates (and lower false negative rates) for females with dark skin tone, as well as the potential of eyeglasses and young age to increase the false positive rates. |
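The subgroup analysis the abstract reports boils down to computing false positive and false negative rates of the verifier per demographic group at a fixed decision threshold. A minimal sketch follows; the threshold and toy data are illustrative assumptions, not the challenge's exact bias metrics.

```python
# Sketch of per-subgroup FPR/FNR for 1:1 face verification scores.
import numpy as np

def subgroup_error_rates(scores, same_identity, groups, threshold=0.5):
    scores = np.asarray(scores, dtype=float)
    same = np.asarray(same_identity, dtype=bool)
    groups = np.asarray(groups)
    accept = scores >= threshold                     # verifier says "same person"
    rates = {}
    for g in np.unique(groups):
        in_g = groups == g
        fpr = float(np.mean(accept[in_g & ~same]))   # impostors wrongly accepted
        fnr = float(np.mean(~accept[in_g & same]))   # genuine pairs rejected
        rates[str(g)] = {"FPR": fpr, "FNR": fnr}
    return rates

scores = [0.9, 0.2, 0.7, 0.4, 0.8, 0.3]   # verification scores per pair
same   = [1,   0,   0,   1,   1,   0]     # 1 = genuine pair, 0 = impostor
group  = ["A", "A", "A", "B", "B", "B"]   # demographic group of each pair
print(subgroup_error_rates(scores, same, group))
```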
Address |
Virtual; August 2020 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ECCVW |
Notes |
HUPBA |
Approved |
no |
Call Number |
Admin @ si @ SJB2020 |
Serial |
3499 |
Permanent link to this record |
|
|
|
Author |
Hugo Bertiche; Meysam Madadi; Sergio Escalera |
Title |
CLOTH3D: Clothed 3D Humans |
Type |
Conference Article |
Year |
2020 |
Publication |
16th European Conference on Computer Vision |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
This work presents CLOTH3D, the first large-scale synthetic dataset of 3D clothed human sequences. CLOTH3D contains large variability in garment type, topology, shape, size, tightness and fabric. Clothes are simulated on top of thousands of different pose sequences and body shapes, generating realistic cloth dynamics. We provide the dataset with a generative model for cloth generation. We propose a Conditional Variational Auto-Encoder (CVAE) based on graph convolutions (GCVAE) to learn garment latent spaces. This allows for realistic generation of 3D garments on top of the SMPL model for any pose and shape. |
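The generative component is a CVAE over garment geometry conditioned on body pose and shape. Below is a minimal sketch of that structure with plain MLPs standing in for the paper's graph convolutions; vertex counts, conditioning dimension, and loss weighting are all illustrative assumptions.

```python
# Sketch of a conditional VAE for garment vertices (MLPs stand in for GCVAE).
import torch
import torch.nn as nn

class GarmentCVAE(nn.Module):
    def __init__(self, n_verts=1000, cond_dim=82, z_dim=32):
        super().__init__()
        d = n_verts * 3
        self.enc = nn.Sequential(nn.Linear(d + cond_dim, 512), nn.ReLU())
        self.mu = nn.Linear(512, z_dim)
        self.logvar = nn.Linear(512, z_dim)
        self.dec = nn.Sequential(
            nn.Linear(z_dim + cond_dim, 512), nn.ReLU(), nn.Linear(512, d))

    def forward(self, verts, cond):
        h = self.enc(torch.cat([verts.flatten(1), cond], dim=1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparametrize
        recon = self.dec(torch.cat([z, cond], dim=1))
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1).mean()
        loss = (recon - verts.flatten(1)).pow(2).mean() + 1e-3 * kl
        return recon.view_as(verts), loss

model = GarmentCVAE()
verts, cond = torch.randn(2, 1000, 3), torch.randn(2, 82)  # toy pose+shape cond
recon, loss = model(verts, cond)
print(recon.shape, loss.item())
```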
Address |
Virtual; August 2020 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ECCV |
Notes |
HUPBA |
Approved |
no |
Call Number |
Admin @ si @ BME2020 |
Serial |
3519 |
Permanent link to this record |
|
|
|
Author |
Angel Morera; Angel Sanchez; A. Belen Moreno; Angel Sappa; Jose F. Velez |
Title |
SSD vs. YOLO for Detection of Outdoor Urban Advertising Panels under Multiple Variabilities |
Type |
Journal Article |
Year |
2020 |
Publication |
Sensors |
Abbreviated Journal |
SENS |
Volume |
20 |
Issue |
16 |
Pages |
4587 |
Keywords |
|
Abstract |
This work compares the Single Shot MultiBox Detector (SSD) and You Only Look Once (YOLO) deep neural networks on the outdoor advertisement panel detection problem, handling multiple and combined variabilities in the scenes. Detecting publicity panels in images offers important advantages both in the real world and in the virtual one. For example, applications like Google Street View can be used for Internet publicity, and when these ad panels are detected in images, the publicity appearing inside the panels could be replaced with that of another funding company. In our experiments, both the SSD and YOLO detectors produced acceptable results under variable panel sizes, illumination conditions, viewing perspectives, partial occlusion of panels, complex backgrounds, and multiple panels in scenes. Due to the difficulty of finding annotated images for the considered problem, we created our own dataset for conducting the experiments. The major strength of the SSD model was the near elimination of False Positive (FP) cases, a situation that is preferable when the publicity contained inside the panels is analyzed after detection. On the other hand, YOLO produced better panel localization results, detecting a higher number of True Positive (TP) panels with higher accuracy. Finally, a comparison of the two analyzed object detection models with different types of semantic segmentation networks, using the same evaluation metrics, is also included. |
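The TP/FP bookkeeping behind such a comparison follows the usual detection convention: a detection counts as a true positive when it overlaps an as-yet-unmatched ground-truth panel with IoU at or above a threshold (0.5 below, the common default; the toy boxes are illustrative).

```python
# Sketch of TP/FP counting for detector evaluation via IoU matching.
def iou(a, b):
    """Intersection over union of two boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def count_tp_fp(detections, ground_truth, thr=0.5):
    matched, tp, fp = set(), 0, 0
    for det in detections:                  # assume sorted by confidence
        best = max(range(len(ground_truth)),
                   key=lambda i: iou(det, ground_truth[i]), default=None)
        if best is not None and best not in matched and \
                iou(det, ground_truth[best]) >= thr:
            matched.add(best); tp += 1
        else:
            fp += 1
    return tp, fp

gt  = [(10, 10, 50, 50), (70, 10, 120, 40)]
det = [(12, 11, 49, 52), (200, 200, 240, 240)]
print(count_tp_fp(det, gt))   # (1, 1): one matched panel, one false alarm
```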
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
MSIAU; 600.130; 601.349; 600.122 |
Approved |
no |
Call Number |
Admin @ si @ MSM2020 |
Serial |
3452 |
Permanent link to this record |
|
|
|
Author |
Sergio Escalera; Ralf Herbrich |
Title |
The NeurIPS’18 Competition: From Machine Learning to Intelligent Conversations |
Type |
Book Whole |
Year |
2020 |
Publication |
The Springer Series on Challenges in Machine Learning |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
This volume presents the results of the Neural Information Processing Systems Competition track at the 2018 NeurIPS conference. The competition followed the same format as the 2017 competition track for NIPS. Out of 21 submitted proposals, eight were selected, spanning the areas of Robotics, Health, Computer Vision, Natural Language Processing, Systems and Physics. Competitions have become an integral part of advancing the state of the art in artificial intelligence (AI). They exhibit one important difference from benchmarks: competitions test a system end-to-end rather than evaluating only a single component, and they assess the practicability of an algorithmic solution in addition to its feasibility. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
Sergio Escalera; Ralf Herbrich |
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
2520-1328 |
ISBN |
978-3-030-29134-1 |
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
HuPBA; no menciona |
Approved |
no |
Call Number |
Admin @ si @ HeE2020 |
Serial |
3328 |
Permanent link to this record |
|
|
|
Author |
Sounak Dey |
Title |
Mapping between Images and Conceptual Spaces: Sketch-based Image Retrieval |
Type |
Book Whole |
Year |
2020 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
This thesis presents several contributions to the literature of sketch-based image retrieval (SBIR). In SBIR, the first challenge we face is how to map two different domains to a common space for effective retrieval of images, while tackling the different levels of abstraction people use to express their notion of the objects around them while sketching. To this end, we first propose a cross-modal learning framework that maps both sketches and text into a joint embedding space invariant to depictive style, while preserving semantics. We have also investigated the different query types possible, to encompass people's dilemma in sketching certain real-world objects. For this, we propose an approach for multi-modal image retrieval in multi-labelled images. A multi-modal deep network architecture is formulated to jointly model sketches and text as input query modalities into a common embedding space, which is then further aligned with the image feature space. This permits encoding the object-based features and their alignment with the query, irrespective of the co-occurrence of different objects in the training set.
Finally, we explore the problem of zero-shot sketch-based image retrieval (ZS-SBIR), where human sketches are used as queries to retrieve photos from unseen categories. We advance prior art by proposing a novel ZS-SBIR scenario that represents a firm step forward in its practical application. The new setting uniquely recognises two important yet often neglected challenges of practical ZS-SBIR: (i) the large domain gap between amateur sketches and photos, and (ii) the necessity of moving towards large-scale retrieval. We contribute to the community a novel ZS-SBIR dataset, QuickDraw-Extended. In this dissertation, we also pave the path to future directions of research in this domain. |
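The recurring building block across these contributions is a joint embedding space in which a query from one modality lands near its matching image. A minimal sketch of that idea, trained with a triplet loss, is below; the encoder architecture and margin are illustrative assumptions, not the thesis design.

```python
# Sketch of a sketch/photo joint embedding trained with a triplet loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_encoder(out_dim=128):
    return nn.Sequential(
        nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, out_dim))

sketch_enc, photo_enc = make_encoder(), make_encoder()
triplet = nn.TripletMarginLoss(margin=0.2)

sketch = torch.randn(8, 3, 224, 224)        # anchor: query sketches
pos = torch.randn(8, 3, 224, 224)           # photos of the same object
neg = torch.randn(8, 3, 224, 224)           # photos of other objects

a = F.normalize(sketch_enc(sketch), dim=1)  # unit-norm embeddings
p = F.normalize(photo_enc(pos), dim=1)
n = F.normalize(photo_enc(neg), dim=1)
loss = triplet(a, p, n)                     # pull matches together, push rest
print(loss.item())
```

At retrieval time, the same normalized embeddings allow nearest-neighbour search of photos given a sketch query.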
Address |
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
Josep Llados;Umapada Pal |
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
978-84-121011-8-8 |
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
DAG; 600.121 |
Approved |
no |
Call Number |
Admin @ si @ Dey20 |
Serial |
3480 |
Permanent link to this record |