|   | 
Details
   web
Records
Author Raul Gomez; Jaume Gibert; Lluis Gomez; Dimosthenis Karatzas
Title Location Sensitive Image Retrieval and Tagging Type Conference Article
Year 2020 Publication 16th European Conference on Computer Vision Abbreviated Journal
Volume Issue Pages
Keywords
Abstract People from different parts of the globe describe objects and concepts in distinct manners. Visual appearance can thus vary across different geographic locations, which makes location a relevant contextual information when analysing visual data. In this work, we address the task of image retrieval related to a given tag conditioned on a certain location on Earth. We present LocSens, a model that learns to rank triplets of images, tags and coordinates by plausibility, and two training strategies to balance the location influence in the final ranking. LocSens learns to fuse textual and location information of multimodal queries to retrieve related images at different levels of location granularity, and successfully utilizes location information to improve image tagging.
Address Virtual; August 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCV
Notes DAG; 600.121; 600.129 Approved no
Call Number (down) Admin @ si @ GGG2020b Serial 3420
Permanent link to this record
 

 
Author Raul Gomez; Jaume Gibert; Lluis Gomez; Dimosthenis Karatzas
Title Exploring Hate Speech Detection in Multimodal Publications Type Conference Article
Year 2020 Publication IEEE Winter Conference on Applications of Computer Vision Abbreviated Journal
Volume Issue Pages
Keywords
Abstract In this work we target the problem of hate speech detection in multimodal publications formed by a text and an image. We gather and annotate a large scale dataset from Twitter, MMHS150K, and propose different models that jointly analyze textual and visual information for hate speech detection, comparing them with unimodal detection. We provide quantitative and qualitative results and analyze the challenges of the proposed task. We find that, even though images are useful for the hate speech detection task, current multimodal models cannot outperform models analyzing only text. We discuss why and open the field and the dataset for further research.
Address Aspen; March 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference WACV
Notes DAG; 600.121; 600.129 Approved no
Call Number (down) Admin @ si @ GGG2020a Serial 3280
Permanent link to this record
 

 
Author Raul Gomez; Lluis Gomez; Jaume Gibert; Dimosthenis Karatzas
Title Self-Supervised Learning from Web Data for Multimodal Retrieval Type Book Chapter
Year 2019 Publication Multi-Modal Scene Understanding Book Abbreviated Journal
Volume Issue Pages 279-306
Keywords self-supervised learning; webly supervised learning; text embeddings; multimodal retrieval; multimodal embedding
Abstract Self-Supervised learning from multimodal image and text data allows deep neural networks to learn powerful features with no need of human annotated data. Web and Social Media platforms provide a virtually unlimited amount of this multimodal data. In this work we propose to exploit this free available data to learn a multimodal image and text embedding, aiming to leverage the semantic knowledge learnt in the text domain and transfer it to a visual model for semantic image retrieval. We demonstrate that the proposed pipeline can learn from images with associated text without supervision and analyze the semantic structure of the learnt joint image and text embeddingspace. Weperformathoroughanalysisandperformancecomparisonoffivedifferentstateof the art text embeddings in three different benchmarks. We show that the embeddings learnt with Web and Social Media data have competitive performances over supervised methods in the text basedimageretrievaltask,andweclearlyoutperformstateoftheartintheMIRFlickrdatasetwhen training in the target data. Further, we demonstrate how semantic multimodal image retrieval can be performed using the learnt embeddings, going beyond classical instance-level retrieval problems. Finally, we present a new dataset, InstaCities1M, composed by Instagram images and their associated texts that can be used for fair comparison of image-text embeddings.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.129; 601.338; 601.310 Approved no
Call Number (down) Admin @ si @ GGG2019 Serial 3266
Permanent link to this record
 

 
Author Raul Gomez; Lluis Gomez; Jaume Gibert; Dimosthenis Karatzas
Title Learning from# Barcelona Instagram data what Locals and Tourists post about its Neighbourhoods Type Conference Article
Year 2018 Publication 15th European Conference on Computer Vision Workshops Abbreviated Journal
Volume 11134 Issue Pages 530-544
Keywords
Abstract Massive tourism is becoming a big problem for some cities, such as Barcelona, due to its concentration in some neighborhoods. In this work we gather Instagram data related to Barcelona consisting on images-captions pairs and, using the text as a supervisory signal, we learn relations between images, words and neighborhoods. Our goal is to learn which visual elements appear in photos when people is posting about each neighborhood. We perform a language separate treatment of the data and show that it can be extrapolated to a tourists and locals separate analysis, and that tourism is reflected in Social Media at a neighborhood level. The presented pipeline allows analyzing the differences between the images that tourists and locals associate to the different neighborhoods. The proposed method, which can be extended to other cities or subjects, proves that Instagram data can be used to train multi-modal (image and text) machine learning models that are useful to analyze publications about a city at a neighborhood level. We publish the collected dataset, InstaBarcelona and the code used in the analysis.
Address Munich; Alemanya; September 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCVW
Notes DAG; 600.129; 601.338; 600.121 Approved no
Call Number (down) Admin @ si @ GGG2018b Serial 3176
Permanent link to this record
 

 
Author Raul Gomez; Lluis Gomez; Jaume Gibert; Dimosthenis Karatzas
Title Learning to Learn from Web Data through Deep Semantic Embeddings Type Conference Article
Year 2018 Publication 15th European Conference on Computer Vision Workshops Abbreviated Journal
Volume 11134 Issue Pages 514-529
Keywords
Abstract In this paper we propose to learn a multimodal image and text embedding from Web and Social Media data, aiming to leverage the semantic knowledge learnt in the text domain and transfer it to a visual model for semantic image retrieval. We demonstrate that the pipeline can learn from images with associated text without supervision and perform a thourough analysis of five different text embeddings in three different benchmarks. We show that the embeddings learnt with Web and Social Media data have competitive performances over supervised methods in the text based image retrieval task, and we clearly outperform state of the art in the MIRFlickr dataset when training in the target data. Further we demonstrate how semantic multimodal image retrieval can be performed using the learnt embeddings, going beyond classical instance-level retrieval problems. Finally, we present a new dataset, InstaCities1M, composed by Instagram images and their associated texts that can be used for fair comparison of image-text embeddings.
Address Munich; Alemanya; September 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCVW
Notes DAG; 600.129; 601.338; 600.121 Approved no
Call Number (down) Admin @ si @ GGG2018a Serial 3175
Permanent link to this record
 

 
Author Josep M. Gonfaus; Theo Gevers; Arjan Gijsenij; Xavier Roca; Jordi Gonzalez
Title Edge Classification using Photo-Geo metric features Type Conference Article
Year 2012 Publication 21st International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 1497 - 1500
Keywords
Abstract Edges are caused by several imaging cues such as shadow, material and illumination transitions. Classification methods have been proposed which are solely based on photometric information, ignoring geometry to classify the physical nature of edges in images. In this paper, the aim is to present a novel strategy to handle both photometric and geometric information for edge classification. Photometric information is obtained through the use of quasi-invariants while geometric information is derived from the orientation and contrast of edges. Different combination frameworks are compared with a new principled approach that captures both information into the same descriptor. From large scale experiments on different datasets, it is shown that, in addition to photometric information, the geometry of edges is an important visual cue to distinguish between different edge types. It is concluded that by combining both cues the performance improves by more than 7% for shadows and highlights.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1051-4651 ISBN 978-1-4673-2216-4 Medium
Area Expedition Conference ICPR
Notes ISE Approved no
Call Number (down) Admin @ si @ GGG2012b Serial 2142
Permanent link to this record
 

 
Author Theo Gevers; Arjan Gijsenij; Joost Van de Weijer; J.M. Geusebroek
Title Color in Computer Vision: Fundamentals and Applications Type Book Whole
Year 2012 Publication Color in Computer Vision: Fundamentals and Applications Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher The Wiley-IS&T Series in Imaging Science and Technology Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-0-470-89084-4 Medium
Area Expedition Conference
Notes ALTRES;ISE Approved no
Call Number (down) Admin @ si @ GGG2012a Serial 2068
Permanent link to this record
 

 
Author Jordi Gonzalez; Josep M. Gonfaus; Carles Fernandez; Xavier Roca
Title Exploiting Natural-Language Interaction in Video Surveillance Systems Type Conference Article
Year 2011 Publication V&L Net Workshop on Vision and Language Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Brighton, UK
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference VL
Notes ISE Approved no
Call Number (down) Admin @ si @ GGF2011 Serial 1813
Permanent link to this record
 

 
Author Jose Garcia-Rodriguez; Isabelle Guyon; Sergio Escalera; Alexandra Psarrou; Andrew Lewis; Miguel Cazorla
Title Editorial: Special Issue on Computational Intelligence for Vision and Robotics Type Journal Article
Year 2017 Publication Neural Computing and Applications Abbreviated Journal Neural Computing and Applications
Volume 28 Issue 5 Pages 853–854
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA;MILAB; no menciona Approved no
Call Number (down) Admin @ si @ GGE2017 Serial 2845
Permanent link to this record
 

 
Author Yagmur Gucluturk; Umut Guclu; Xavier Baro; Hugo Jair Escalante; Isabelle Guyon; Sergio Escalera; Marcel A. J. van Gerven; Rob van Lier
Title Multimodal First Impression Analysis with Deep Residual Networks Type Journal Article
Year 2018 Publication IEEE Transactions on Affective Computing Abbreviated Journal TAC
Volume 8 Issue 3 Pages 316-329
Keywords
Abstract People form first impressions about the personalities of unfamiliar individuals even after very brief interactions with them. In this study we present and evaluate several models that mimic this automatic social behavior. Specifically, we present several models trained on a large dataset of short YouTube video blog posts for predicting apparent Big Five personality traits of people and whether they seem suitable to be recommended to a job interview. Along with presenting our audiovisual approach and results that won the third place in the ChaLearn First Impressions Challenge, we investigate modeling in different modalities including audio only, visual only, language only, audiovisual, and combination of audiovisual and language. Our results demonstrate that the best performance could be obtained using a fusion of all data modalities. Finally, in order to promote explainability in machine learning and to provide an example for the upcoming ChaLearn challenges, we present a simple approach for explaining the predictions for job interview recommendations
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA; no proj Approved no
Call Number (down) Admin @ si @ GGB2018 Serial 3210
Permanent link to this record
 

 
Author Albert Gordo; Alicia Fornes; Ernest Valveny
Title Writer identification in handwritten musical scores with bags of notes Type Journal Article
Year 2013 Publication Pattern Recognition Abbreviated Journal PR
Volume 46 Issue 5 Pages 1337-1345
Keywords
Abstract Writer Identification is an important task for the automatic processing of documents. However, the identification of the writer in graphical documents is still challenging. In this work, we adapt the Bag of Visual Words framework to the task of writer identification in handwritten musical scores. A vanilla implementation of this method already performs comparably to the state-of-the-art. Furthermore, we analyze the effect of two improvements of the representation: a Bhattacharyya embedding, which improves the results at virtually no extra cost, and a Fisher Vector representation that very significantly improves the results at the cost of a more complex and costly representation. Experimental evaluation shows results more than 20 points above the state-of-the-art in a new, challenging dataset.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0031-3203 ISBN Medium
Area Expedition Conference
Notes DAG Approved no
Call Number (down) Admin @ si @ GFV2013 Serial 2307
Permanent link to this record
 

 
Author Debora Gil; Antonio Esteban Lansaque; Sebastian Stefaniga; Mihail Gaianu; Carles Sanchez
Title Data Augmentation from Sketch Type Conference Article
Year 2019 Publication International Workshop on Uncertainty for Safe Utilization of Machine Learning in Medical Imaging Abbreviated Journal
Volume 11840 Issue Pages 155-162
Keywords Data augmentation; cycleGANs; Multi-objective optimization
Abstract State of the art machine learning methods need huge amounts of data with unambiguous annotations for their training. In the context of medical imaging this is, in general, a very difficult task due to limited access to clinical data, the time required for manual annotations and variability across experts. Simulated data could serve for data augmentation provided that its appearance was comparable to the actual appearance of intra-operative acquisitions. Generative Adversarial Networks (GANs) are a powerful tool for artistic style transfer, but lack a criteria for selecting epochs ensuring also preservation of intra-operative content.

We propose a multi-objective optimization strategy for a selection of cycleGAN epochs ensuring a mapping between virtual images and the intra-operative domain preserving anatomical content. Our approach has been applied to simulate intra-operative bronchoscopic videos and chest CT scans from virtual sketches generated using simple graphical primitives.
Address Shenzhen; China; October 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CLIP
Notes IAM; 600.145; 601.337; 600.139; 600.145 Approved no
Call Number (down) Admin @ si @ GES2019 Serial 3359
Permanent link to this record
 

 
Author Debora Gil; Antonio Esteban Lansaque; Agnes Borras; Esmitt Ramirez; Carles Sanchez
Title Intraoperative Extraction of Airways Anatomy in VideoBronchoscopy Type Journal Article
Year 2020 Publication IEEE Access Abbreviated Journal ACCESS
Volume 8 Issue Pages 159696 - 159704
Keywords
Abstract A main bottleneck in bronchoscopic biopsy sampling is to efficiently reach the lesion navigating across bronchial levels. Any guidance system should be able to localize the scope position during the intervention with minimal costs and alteration of clinical protocols. With the final goal of an affordable image-based guidance, this work presents a novel strategy to extract and codify the anatomical structure of bronchi, as well as, the scope navigation path from videobronchoscopy. Experiments using interventional data show that our method accurately identifies the bronchial structure. Meanwhile, experiments using simulated data verify that the extracted navigation path matches the 3D route.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes IAM; 600.139; 600.145 Approved no
Call Number (down) Admin @ si @ GEB2020 Serial 3467
Permanent link to this record
 

 
Author Debora Gil; Antonio Esteban Lansaque; Agnes Borras; Carles Sanchez
Title Enhancing virtual bronchoscopy with intra-operative data using a multi-objective GAN Type Journal Article
Year 2019 Publication International Journal of Computer Assisted Radiology and Surgery Abbreviated Journal IJCAR
Volume 7 Issue 1 Pages
Keywords
Abstract This manuscript has been withdrawn by bioRxiv due to upload of an incorrect version of the manuscript by the authors. Therefore, this manuscript should not be cited as reference for this project.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes IAM; 600.139; 600.145 Approved no
Call Number (down) Admin @ si @ GEB2019 Serial 3307
Permanent link to this record
 

 
Author Debora Gil; Katerine Diaz; Carles Sanchez; Aura Hernandez-Sabate
Title Early Screening of SARS-CoV-2 by Intelligent Analysis of X-Ray Images Type Miscellaneous
Year 2020 Publication Arxiv Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Future SARS-CoV-2 virus outbreak COVID-XX might possibly occur during the next years. However the pathology in humans is so recent that many clinical aspects, like early detection of complications, side effects after recovery or early screening, are currently unknown. In spite of the number of cases of COVID-19, its rapid spread putting many sanitary systems in the edge of collapse has hindered proper collection and analysis of the data related to COVID-19 clinical aspects. We describe an interdisciplinary initiative that integrates clinical research, with image diagnostics and the use of new technologies such as artificial intelligence and radiomics with the aim of clarifying some of SARS-CoV-2 open questions. The whole initiative addresses 3 main points: 1) collection of standardize data including images, clinical data and analytics; 2) COVID-19 screening for its early diagnosis at primary care centers; 3) define radiomic signatures of COVID-19 evolution and associated pathologies for the early treatment of complications. In particular, in this paper we present a general overview of the project, the experimental design and first results of X-ray COVID-19 detection using a classic approach based on HoG and feature selection. Our experiments include a comparison to some recent methods for COVID-19 screening in X-Ray and an exploratory analysis of the feasibility of X-Ray COVID-19 screening. Results show that classic approaches can outperform deep-learning methods in this experimental setting, indicate the feasibility of early COVID-19 screening and that non-COVID infiltration is the group of patients most similar to COVID-19 in terms of radiological description of X-ray. Therefore, an efficient COVID-19 screening should be complemented with other clinical data to better discriminate these cases.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes IAM; 600.139; 600.145; 601.337 Approved no
Call Number (down) Admin @ si @ GDS2020 Serial 3474
Permanent link to this record