|
Records |
Links |
|
Author |
Vacit Oguz Yazici; Joost Van de Weijer; Arnau Ramisa |
|
|
Title |
Color Naming for Multi-Color Fashion Items |
Type |
Conference Article |
|
Year |
2018 |
Publication |
6th World Conference on Information Systems and Technologies |
Abbreviated Journal |
|
|
|
Volume |
747 |
Issue |
|
Pages |
64-73 |
|
|
Keywords |
Deep learning; Color; Multi-label |
|
|
Abstract |
There exists a significant amount of research on color naming of single colored objects. However in reality many fashion objects consist of multiple colors. Currently, searching in fashion datasets for multi-colored objects can be a laborious task. Therefore, in this paper we focus on color naming for images with multi-color fashion items. We collect a dataset, which consists of images which may have from one up to four colors. We annotate the images with the 11 basic colors of the English language. We experiment with several designs for deep neural networks with different losses. We show that explicitly estimating the number of colors in the fashion item leads to improved results. |
|
|
Address |
Naples; March 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
WORLDCIST |
|
|
Notes |
LAMP; 600.109; 601.309; 600.120 |
Approved |
no |
|
|
Call Number |
Admin @ si @ YWR2018 |
Serial |
3161 |
|
Permanent link to this record |
|
|
|
|
Author |
Felipe Codevilla; Antonio Lopez; Vladlen Koltun; Alexey Dosovitskiy |
|
|
Title |
On Offline Evaluation of Vision-based Driving Models |
Type |
Conference Article |
|
Year |
2018 |
Publication |
15th European Conference on Computer Vision |
Abbreviated Journal |
|
|
|
Volume |
11219 |
Issue |
|
Pages |
246-262 |
|
|
Keywords |
Autonomous driving; deep learning |
|
|
Abstract |
Autonomous driving models should ideally be evaluated by deploying
them on a fleet of physical vehicles in the real world. Unfortunately, this approach is not practical for the vast majority of researchers. An attractive alternative is to evaluate models offline, on a pre-collected validation dataset with ground truth annotation. In this paper, we investigate the relation between various online and offline metrics for evaluation of autonomous driving models. We find that offline prediction error is not necessarily correlated with driving quality, and two models with identical prediction error can differ dramatically in their driving performance. We show that the correlation of offline evaluation with driving quality can be significantly improved by selecting an appropriate validation dataset and
suitable offline metrics. |
|
|
Address |
Munich; September 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ECCV |
|
|
Notes |
ADAS; 600.124; 600.118 |
Approved |
no |
|
|
Call Number |
Admin @ si @ CLK2018 |
Serial |
3162 |
|
Permanent link to this record |
|
|
|
|
Author |
Jorge Bernal; Aymeric Histace; Marc Masana; Quentin Angermann; Cristina Sanchez Montes; Cristina Rodriguez de Miguel; Maroua Hammami; Ana Garcia Rodriguez; Henry Cordova; Olivier Romain; Gloria Fernandez Esparrach; Xavier Dray; F. Javier Sanchez |
|
|
Title |
GTCreator: a flexible annotation tool for image-based datasets |
Type |
Journal Article |
|
Year |
2019 |
Publication |
International Journal of Computer Assisted Radiology and Surgery |
Abbreviated Journal |
IJCAR |
|
|
Volume |
14 |
Issue |
2 |
Pages |
191–201 |
|
|
Keywords |
Annotation tool; Validation Framework; Benchmark; Colonoscopy; Evaluation |
|
|
Abstract |
Abstract Purpose: Methodology evaluation for decision support systems for health is a time consuming-task. To assess performance of polyp detection
methods in colonoscopy videos, clinicians have to deal with the annotation
of thousands of images. Current existing tools could be improved in terms of
exibility and ease of use. Methods:We introduce GTCreator, a exible annotation tool for providing image and text annotations to image-based datasets.
It keeps the main basic functionalities of other similar tools while extending
other capabilities such as allowing multiple annotators to work simultaneously
on the same task or enhanced dataset browsing and easy annotation transfer aiming to speed up annotation processes in large datasets. Results: The
comparison with other similar tools shows that GTCreator allows to obtain
fast and precise annotation of image datasets, being the only one which offers
full annotation editing and browsing capabilites. Conclusions: Our proposed
annotation tool has been proven to be efficient for large image dataset annota-
tion, as well as showing potential of use in other stages of method evaluation
such as experimental setup or results analysis. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MV; 600.096; 600.109; 600.119; 601.305 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BHM2019 |
Serial |
3163 |
|
Permanent link to this record |
|
|
|
|
Author |
Cristina Sanchez Montes; F. Javier Sanchez; Jorge Bernal; Henry Cordova; Maria Lopez Ceron; Miriam Cuatrecasas; Cristina Rodriguez de Miguel; Ana Garcia Rodriguez; Rodrigo Garces Duran; Maria Pellise; Josep Llach; Gloria Fernandez Esparrach |
|
|
Title |
Computer-aided Prediction of Polyp Histology on White-Light Colonoscopy using Surface Pattern Analysis |
Type |
Journal Article |
|
Year |
2019 |
Publication |
Endoscopy |
Abbreviated Journal |
END |
|
|
Volume |
51 |
Issue |
3 |
Pages |
261-265 |
|
|
Keywords |
|
|
|
Abstract |
Background and study aims: To evaluate a new computational histology prediction system based on colorectal polyp textural surface patterns using high definition white light images.
Patients and methods: Textural elements (textons) were characterized according to their contrast with respect to the surface, shape and number of bifurcations, assuming that dysplastic polyps are associated with highly contrasted, large tubular patterns with some degree of bifurcation. Computer-aided diagnosis (CAD) was compared with pathological diagnosis and the diagnosis by the endoscopists using Kudo and NICE classification.
Results: Images of 225 polyps were evaluated (142 dysplastic and 83 non-dysplastic). CAD system correctly classified 205 (91.1%) polyps, 131/142 (92.3%) dysplastic and 74/83 (89.2%) non-dysplastic. For the subgroup of 100 diminutive (<5 mm) polyps, CAD correctly classified 87 (87%) polyps, 43/50 (86%) dysplastic and 44/50 (88%) non-dysplastic. There were not statistically significant differences in polyp histology prediction based on CAD system and on endoscopist assessment.
Conclusion: A computer vision system based on the characterization of the polyp surface in the white light accurately predicts colorectal polyp histology. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MV; 600.096; 600.119; 600.075 |
Approved |
no |
|
|
Call Number |
Admin @ si @ SSB2019 |
Serial |
3164 |
|
Permanent link to this record |
|
|
|
|
Author |
F. Javier Sanchez; Jorge Bernal |
|
|
Title |
Use of Software Tools for Real-time Monitoring of Learning Processes: Application to Compilers subject |
Type |
Conference Article |
|
Year |
2018 |
Publication |
4th International Conference of Higher Education Advances |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1359-1366 |
|
|
Keywords |
Monitoring; Evaluation tool; Gamification; Student motivation |
|
|
Abstract |
The effective implementation of the Higher European Education Area has meant a change regarding the focus of the learning process, being now the student at its very center. This shift of focus requires a strong involvement and fluent communication between teachers and students to succeed. Considering the difficulties associated to motivate students to take a more active role in the learning process, we explore how the use of a software tool can help both actors to improve the learning experience. We present a tool that can help students to obtain instantaneous feedback with respect to their progress in the subject as well as providing teachers with useful information about the evolution of knowledge acquisition with respect to each of the subject areas. We compare the performance achieved by students in two academic years: results show an improvement in overall performance which, after observing graphs provided by our tool, can be associated to an increase in students interest in the subject. |
|
|
Address |
Valencia; June 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
HEAD |
|
|
Notes |
MV; no proj |
Approved |
no |
|
|
Call Number |
Admin @ si @ SaB2018 |
Serial |
3165 |
|
Permanent link to this record |
|
|
|
|
Author |
Juan Ignacio Toledo; Manuel Carbonell; Alicia Fornes; Josep Llados |
|
|
Title |
Information Extraction from Historical Handwritten Document Images with a Context-aware Neural Model |
Type |
Journal Article |
|
Year |
2019 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
86 |
Issue |
|
Pages |
27-36 |
|
|
Keywords |
Document image analysis; Handwritten documents; Named entity recognition; Deep neural networks |
|
|
Abstract |
Many historical manuscripts that hold trustworthy memories of the past societies contain information organized in a structured layout (e.g. census, birth or marriage records). The precious information stored in these documents cannot be effectively used nor accessed without costly annotation efforts. The transcription driven by the semantic categories of words is crucial for the subsequent access. In this paper we describe an approach to extract information from structured historical handwritten text images and build a knowledge representation for the extraction of meaning out of historical data. The method extracts information, such as named entities, without the need of an intermediate transcription step, thanks to the incorporation of context information through language models. Our system has two variants, the first one is based on bigrams, whereas the second one is based on recurrent neural networks. Concretely, our second architecture integrates a Convolutional Neural Network to model visual information from word images together with a Bidirecitonal Long Short Term Memory network to model the relation among the words. This integrated sequential approach is able to extract more information than just the semantic category (e.g. a semantic category can be associated to a person in a record). Our system is generic, it deals with out-of-vocabulary words by design, and it can be applied to structured handwritten texts from different domains. The method has been validated with the ICDAR IEHHR competition protocol, outperforming the existing approaches. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.097; 601.311; 603.057; 600.084; 600.140; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ TCF2019 |
Serial |
3166 |
|
Permanent link to this record |
|
|
|
|
Author |
Lei Kang; Juan Ignacio Toledo; Pau Riba; Mauricio Villegas; Alicia Fornes; Marçal Rusiñol |
|
|
Title |
Convolve, Attend and Spell: An Attention-based Sequence-to-Sequence Model for Handwritten Word Recognition |
Type |
Conference Article |
|
Year |
2018 |
Publication |
40th German Conference on Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
459-472 |
|
|
Keywords |
|
|
|
Abstract |
This paper proposes Convolve, Attend and Spell, an attention based sequence-to-sequence model for handwritten word recognition. The proposed architecture has three main parts: an encoder, consisting of a CNN and a bi-directional GRU, an attention mechanism devoted to focus on the pertinent features and a decoder formed by a one-directional GRU, able to spell the corresponding word, character by character. Compared with the recent state-of-the-art, our model achieves competitive results on the IAM dataset without needing any pre-processing step, predefined lexicon nor language model. Code and additional results are available in https://github.com/omni-us/research-seq2seq-HTR. |
|
|
Address |
Stuttgart; Germany; October 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
GCPR |
|
|
Notes |
DAG; 600.097; 603.057; 302.065; 601.302; 600.084; 600.121; 600.129 |
Approved |
no |
|
|
Call Number |
Admin @ si @ KTR2018 |
Serial |
3167 |
|
Permanent link to this record |
|
|
|
|
Author |
Pau Riba; Andreas Fischer; Josep Llados; Alicia Fornes |
|
|
Title |
Learning Graph Distances with Message Passing Neural Networks |
Type |
Conference Article |
|
Year |
2018 |
Publication |
24th International Conference on Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
2239-2244 |
|
|
Keywords |
★Best Paper Award★ |
|
|
Abstract |
Graph representations have been widely used in pattern recognition thanks to their powerful representation formalism and rich theoretical background. A number of error-tolerant graph matching algorithms such as graph edit distance have been proposed for computing a distance between two labelled graphs. However, they typically suffer from a high
computational complexity, which makes it difficult to apply
these matching algorithms in a real scenario. In this paper, we propose an efficient graph distance based on the emerging field of geometric deep learning. Our method employs a message passing neural network to capture the graph structure and learns a metric with a siamese network approach. The performance of the proposed graph distance is validated in two application cases, graph classification and graph retrieval of handwritten words, and shows a promising performance when compared with
(approximate) graph edit distance benchmarks. |
|
|
Address |
Beijing; China; August 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICPR |
|
|
Notes |
DAG; 600.097; 603.057; 601.302; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RFL2018 |
Serial |
3168 |
|
Permanent link to this record |
|
|
|
|
Author |
Jialuo Chen; Pau Riba; Alicia Fornes; Juan Mas; Josep Llados; Joana Maria Pujadas-Mora |
|
|
Title |
Word-Hunter: A Gamesourcing Experience to Validate the Transcription of Historical Manuscripts |
Type |
Conference Article |
|
Year |
2018 |
Publication |
16th International Conference on Frontiers in Handwriting Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
528-533 |
|
|
Keywords |
Crowdsourcing; Gamification; Handwritten documents; Performance evaluation |
|
|
Abstract |
Nowadays, there are still many handwritten historical documents in archives waiting to be transcribed and indexed. Since manual transcription is tedious and time consuming, the automatic transcription seems the path to follow. However, the performance of current handwriting recognition techniques is not perfect, so a manual validation is mandatory. Crowdsourcing is a good strategy for manual validation, however it is a tedious task. In this paper we analyze experiences based in gamification
in order to propose and design a gamesourcing framework that increases the interest of users. Then, we describe and analyze our experience when validating the automatic transcription using the gamesourcing application. Moreover, thanks to the combination of clustering and handwriting recognition techniques, we can speed up the validation while maintaining the performance. |
|
|
Address |
Niagara Falls, USA; August 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICFHR |
|
|
Notes |
DAG; 600.097; 603.057; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ CRF2018 |
Serial |
3169 |
|
Permanent link to this record |
|
|
|
|
Author |
Manuel Carbonell; Mauricio Villegas; Alicia Fornes; Josep Llados |
|
|
Title |
Joint Recognition of Handwritten Text and Named Entities with a Neural End-to-end Model |
Type |
Conference Article |
|
Year |
2018 |
Publication |
13th IAPR International Workshop on Document Analysis Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
399-404 |
|
|
Keywords |
Named entity recognition; Handwritten Text Recognition; neural networks |
|
|
Abstract |
When extracting information from handwritten documents, text transcription and named entity recognition are usually faced as separate subsequent tasks. This has the disadvantage that errors in the first module affect heavily the
performance of the second module. In this work we propose to do both tasks jointly, using a single neural network with a common architecture used for plain text recognition. Experimentally, the work has been tested on a collection of historical marriage records. Results of experiments are presented to show the effect on the performance for different
configurations: different ways of encoding the information, doing or not transfer learning and processing at text line or multi-line region level. The results are comparable to state of the art reported in the ICDAR 2017 Information Extraction competition, even though the proposed technique does not use any dictionaries, language modeling or post processing. |
|
|
Address |
Vienna; Austria; April 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 600.097; 603.057; 601.311; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ CVF2018 |
Serial |
3170 |
|
Permanent link to this record |
|
|
|
|
Author |
Katerine Diaz; Jesus Martinez del Rincon; Marçal Rusiñol; Aura Hernandez-Sabate |
|
|
Title |
Feature Extraction by Using Dual-Generalized Discriminative Common Vectors |
Type |
Journal Article |
|
Year |
2019 |
Publication |
Journal of Mathematical Imaging and Vision |
Abbreviated Journal |
JMIV |
|
|
Volume |
61 |
Issue |
3 |
Pages |
331-351 |
|
|
Keywords |
Online feature extraction; Generalized discriminative common vectors; Dual learning; Incremental learning; Decremental learning |
|
|
Abstract |
In this paper, a dual online subspace-based learning method called dual-generalized discriminative common vectors (Dual-GDCV) is presented. The method extends incremental GDCV by exploiting simultaneously both the concepts of incremental and decremental learning for supervised feature extraction and classification. Our methodology is able to update the feature representation space without recalculating the full projection or accessing the previously processed training data. It allows both adding information and removing unnecessary data from a knowledge base in an efficient way, while retaining the previously acquired knowledge. The proposed method has been theoretically proved and empirically validated in six standard face recognition and classification datasets, under two scenarios: (1) removing and adding samples of existent classes, and (2) removing and adding new classes to a classification problem. Results show a considerable computational gain without compromising the accuracy of the model in comparison with both batch methodologies and other state-of-art adaptive methods. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; ADAS; 600.084; 600.118; 600.121; 600.129 |
Approved |
no |
|
|
Call Number |
Admin @ si @ DRR2019 |
Serial |
3172 |
|
Permanent link to this record |
|
|
|
|
Author |
Raul Gomez; Lluis Gomez; Jaume Gibert; Dimosthenis Karatzas |
|
|
Title |
Learning from# Barcelona Instagram data what Locals and Tourists post about its Neighbourhoods |
Type |
Conference Article |
|
Year |
2018 |
Publication |
15th European Conference on Computer Vision Workshops |
Abbreviated Journal |
|
|
|
Volume |
11134 |
Issue |
|
Pages |
530-544 |
|
|
Keywords |
|
|
|
Abstract |
Massive tourism is becoming a big problem for some cities, such as Barcelona, due to its concentration in some neighborhoods. In this work we gather Instagram data related to Barcelona consisting on images-captions pairs and, using the text as a supervisory signal, we learn relations between images, words and neighborhoods. Our goal is to learn which visual elements appear in photos when people is posting about each neighborhood. We perform a language separate treatment of the data and show that it can be extrapolated to a tourists and locals separate analysis, and that tourism is reflected in Social Media at a neighborhood level. The presented pipeline allows analyzing the differences between the images that tourists and locals associate to the different neighborhoods. The proposed method, which can be extended to other cities or subjects, proves that Instagram data can be used to train multi-modal (image and text) machine learning models that are useful to analyze publications about a city at a neighborhood level. We publish the collected dataset, InstaBarcelona and the code used in the analysis. |
|
|
Address |
Munich; Alemanya; September 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ECCVW |
|
|
Notes |
DAG; 600.129; 601.338; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GGG2018b |
Serial |
3176 |
|
Permanent link to this record |
|
|
|
|
Author |
Y. Patel; Lluis Gomez; Raul Gomez; Marçal Rusiñol; Dimosthenis Karatzas; C.V. Jawahar |
|
|
Title |
TextTopicNet-Self-Supervised Learning of Visual Features Through Embedding Images on Semantic Text Spaces |
Type |
Miscellaneous |
|
Year |
2018 |
Publication |
Arxiv |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
The immense success of deep learning based methods in computer vision heavily relies on large scale training datasets. These richly annotated datasets help the network learn discriminative visual features. Collecting and annotating such datasets requires a tremendous amount of human effort and annotations are limited to popular set of classes. As an alternative, learning visual features by designing auxiliary tasks which make use of freely available self-supervision has become increasingly popular in the computer vision community.
In this paper, we put forward an idea to take advantage of multi-modal context to provide self-supervision for the training of computer vision algorithms. We show that adequate visual features can be learned efficiently by training a CNN to predict the semantic textual context in which a particular image is more probable to appear as an illustration. More specifically we use popular text embedding techniques to provide the self-supervision for the training of deep CNN. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.084; 601.338; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ PGG2018 |
Serial |
3177 |
|
Permanent link to this record |
|
|
|
|
Author |
Anguelos Nicolaou; Sounak Dey; V.Christlein; A.Maier; Dimosthenis Karatzas |
|
|
Title |
Non-deterministic Behavior of Ranking-based Metrics when Evaluating Embeddings |
Type |
Conference Article |
|
Year |
2018 |
Publication |
International Workshop on Reproducible Research in Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
11455 |
Issue |
|
Pages |
71-82 |
|
|
Keywords |
|
|
|
Abstract |
Embedding data into vector spaces is a very popular strategy of pattern recognition methods. When distances between embeddings are quantized, performance metrics become ambiguous. In this paper, we present an analysis of the ambiguity quantized distances introduce and provide bounds on the effect. We demonstrate that it can have a measurable effect in empirical data in state-of-the-art systems. We also approach the phenomenon from a computer security perspective and demonstrate how someone being evaluated by a third party can exploit this ambiguity and greatly outperform a random predictor without even access to the input data. We also suggest a simple solution making the performance metrics, which rely on ranking, totally deterministic and impervious to such exploits. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.121; 600.129 |
Approved |
no |
|
|
Call Number |
Admin @ si @ NDC2018 |
Serial |
3178 |
|
Permanent link to this record |
|
|
|
|
Author |
Dena Bazazian; Dimosthenis Karatzas; Andrew Bagdanov |
|
|
Title |
Word Spotting in Scene Images based on Character Recognition |
Type |
Conference Article |
|
Year |
2018 |
Publication |
IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1872-1874 |
|
|
Keywords |
|
|
|
Abstract |
In this paper we address the problem of unconstrained Word Spotting in scene images. We train a Fully Convolutional Network to produce heatmaps of all the character classes. Then, we employ the Text Proposals approach and, via a rectangle classifier, detect the most likely rectangle for each query word based on the character attribute maps. We evaluate the proposed method on ICDAR2015 and show that it is capable of identifying and recognizing query words in natural scene images. |
|
|
Address |
Salt Lake City; USA; June 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CVPRW |
|
|
Notes |
DAG; 600.129; 600.121 |
Approved |
no |
|
|
Call Number |
BKB2018a |
Serial |
3179 |
|
Permanent link to this record |