Publicacions CVC -- Query Results

Records
Author	Juan Ignacio Toledo; Manuel Carbonell; Alicia Fornes; Josep Llados
Title	Information Extraction from Historical Handwritten Document Images with a Context-aware Neural Model			Type	Journal Article
Year	2019	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	86	Issue		Pages	27-36
Keywords	Document image analysis; Handwritten documents; Named entity recognition; Deep neural networks
Abstract	Many historical manuscripts that hold trustworthy memories of past societies contain information organized in a structured layout (e.g. census, birth or marriage records). The precious information stored in these documents cannot be effectively used or accessed without costly annotation efforts. The transcription driven by the semantic categories of words is crucial for the subsequent access. In this paper we describe an approach to extract information from structured historical handwritten text images and build a knowledge representation for the extraction of meaning out of historical data. The method extracts information, such as named entities, without the need of an intermediate transcription step, thanks to the incorporation of context information through language models. Our system has two variants: the first is based on bigrams, whereas the second is based on recurrent neural networks. Concretely, our second architecture integrates a Convolutional Neural Network to model visual information from word images together with a Bidirectional Long Short Term Memory network to model the relation among the words. This integrated sequential approach is able to extract more information than just the semantic category (e.g. a semantic category can be associated to a person in a record). Our system is generic: it deals with out-of-vocabulary words by design, and it can be applied to structured handwritten texts from different domains. The method has been validated with the ICDAR IEHHR competition protocol, outperforming the existing approaches.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.097; 601.311; 603.057; 600.084; 600.140; 600.121			Approved	no
Call Number	Admin @ si @ TCF2019			Serial	3166



Author	Lei Kang; Juan Ignacio Toledo; Pau Riba; Mauricio Villegas; Alicia Fornes; Marçal Rusiñol
Title	Convolve, Attend and Spell: An Attention-based Sequence-to-Sequence Model for Handwritten Word Recognition			Type	Conference Article
Year	2018	Publication	40th German Conference on Pattern Recognition	Abbreviated Journal
Volume		Issue		Pages	459-472
Keywords
Abstract	This paper proposes Convolve, Attend and Spell, an attention-based sequence-to-sequence model for handwritten word recognition. The proposed architecture has three main parts: an encoder, consisting of a CNN and a bi-directional GRU; an attention mechanism devoted to focusing on the pertinent features; and a decoder formed by a one-directional GRU, able to spell the corresponding word, character by character. Compared with the recent state-of-the-art, our model achieves competitive results on the IAM dataset without needing any pre-processing step, predefined lexicon or language model. Code and additional results are available at https://github.com/omni-us/research-seq2seq-HTR.
Address	Stuttgart; Germany; October 2018
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	GCPR
Notes	DAG; 600.097; 603.057; 302.065; 601.302; 600.084; 600.121; 600.129			Approved	no
Call Number	Admin @ si @ KTR2018			Serial	3167



Author	Pau Riba; Andreas Fischer; Josep Llados; Alicia Fornes
Title	Learning Graph Distances with Message Passing Neural Networks			Type	Conference Article
Year	2018	Publication	24th International Conference on Pattern Recognition	Abbreviated Journal
Volume		Issue		Pages	2239-2244
Keywords	★Best Paper Award★
Abstract	Graph representations have been widely used in pattern recognition thanks to their powerful representation formalism and rich theoretical background. A number of error-tolerant graph matching algorithms such as graph edit distance have been proposed for computing a distance between two labelled graphs. However, they typically suffer from a high computational complexity, which makes it difficult to apply these matching algorithms in a real scenario. In this paper, we propose an efficient graph distance based on the emerging field of geometric deep learning. Our method employs a message passing neural network to capture the graph structure and learns a metric with a siamese network approach. The performance of the proposed graph distance is validated in two application cases, graph classification and graph retrieval of handwritten words, and shows a promising performance when compared with (approximate) graph edit distance benchmarks.
Address	Beijing; China; August 2018
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICPR
Notes	DAG; 600.097; 603.057; 601.302; 600.121			Approved	no
Call Number	Admin @ si @ RFL2018			Serial	3168



Author	Jialuo Chen; Pau Riba; Alicia Fornes; Juan Mas; Josep Llados; Joana Maria Pujadas-Mora
Title	Word-Hunter: A Gamesourcing Experience to Validate the Transcription of Historical Manuscripts			Type	Conference Article
Year	2018	Publication	16th International Conference on Frontiers in Handwriting Recognition	Abbreviated Journal
Volume		Issue		Pages	528-533
Keywords	Crowdsourcing; Gamification; Handwritten documents; Performance evaluation
Abstract	Nowadays, there are still many handwritten historical documents in archives waiting to be transcribed and indexed. Since manual transcription is tedious and time consuming, automatic transcription seems the path to follow. However, the performance of current handwriting recognition techniques is not perfect, so manual validation is mandatory. Crowdsourcing is a good strategy for manual validation; however, it is a tedious task. In this paper we analyze experiences based on gamification in order to propose and design a gamesourcing framework that increases the interest of users. Then, we describe and analyze our experience when validating the automatic transcription using the gamesourcing application. Moreover, thanks to the combination of clustering and handwriting recognition techniques, we can speed up the validation while maintaining the performance.
Address	Niagara Falls, USA; August 2018
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICFHR
Notes	DAG; 600.097; 603.057; 600.121			Approved	no
Call Number	Admin @ si @ CRF2018			Serial	3169



Author	Manuel Carbonell; Mauricio Villegas; Alicia Fornes; Josep Llados
Title	Joint Recognition of Handwritten Text and Named Entities with a Neural End-to-end Model			Type	Conference Article
Year	2018	Publication	13th IAPR International Workshop on Document Analysis Systems	Abbreviated Journal
Volume		Issue		Pages	399-404
Keywords	Named entity recognition; Handwritten Text Recognition; neural networks
Abstract	When extracting information from handwritten documents, text transcription and named entity recognition are usually treated as separate, subsequent tasks. This has the disadvantage that errors in the first module heavily affect the performance of the second module. In this work we propose to do both tasks jointly, using a single neural network with a common architecture used for plain text recognition. Experimentally, the work has been tested on a collection of historical marriage records. Results of experiments are presented to show the effect on the performance for different configurations: different ways of encoding the information, with or without transfer learning, and processing at text line or multi-line region level. The results are comparable to the state of the art reported in the ICDAR 2017 Information Extraction competition, even though the proposed technique does not use any dictionaries, language modeling or post processing.
Address	Vienna; Austria; April 2018
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	DAS
Notes	DAG; 600.097; 603.057; 601.311; 600.121			Approved	no
Call Number	Admin @ si @ CVF2018			Serial	3170



Author	Alicia Fornes; Bart Lamiroy
Title	Graphics Recognition, Current Trends and Evolutions			Type	Book Whole
Year	2018	Publication	Graphics Recognition, Current Trends and Evolutions	Abbreviated Journal
Volume	11009	Issue		Pages
Keywords
Abstract	This book constitutes the thoroughly refereed post-conference proceedings of the 12th International Workshop on Graphics Recognition, GREC 2017, held in Kyoto, Japan, in November 2017. The 10 revised full papers presented were carefully reviewed and selected from 14 initial submissions. They contain both classical and emerging topics of graphics recognition, namely analysis and detection of diagrams, search and classification, optical music recognition, interpretation of engineering drawings and maps.
Address
Corporate Author				Thesis
Publisher	Springer International Publishing	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN	978-3-030-02283-9	Medium
Area		Expedition		Conference
Notes	DAG; 600.121			Approved	no
Call Number	Admin @ si @ FoL2018			Serial	3171



Author	Katerine Diaz; Jesus Martinez del Rincon; Marçal Rusiñol; Aura Hernandez-Sabate
Title	Feature Extraction by Using Dual-Generalized Discriminative Common Vectors			Type	Journal Article
Year	2019	Publication	Journal of Mathematical Imaging and Vision	Abbreviated Journal	JMIV
Volume	61	Issue	3	Pages	331-351
Keywords	Online feature extraction; Generalized discriminative common vectors; Dual learning; Incremental learning; Decremental learning
Abstract	In this paper, a dual online subspace-based learning method called dual-generalized discriminative common vectors (Dual-GDCV) is presented. The method extends incremental GDCV by exploiting simultaneously both the concepts of incremental and decremental learning for supervised feature extraction and classification. Our methodology is able to update the feature representation space without recalculating the full projection or accessing the previously processed training data. It allows both adding information and removing unnecessary data from a knowledge base in an efficient way, while retaining the previously acquired knowledge. The proposed method has been theoretically proved and empirically validated in six standard face recognition and classification datasets, under two scenarios: (1) removing and adding samples of existent classes, and (2) removing and adding new classes to a classification problem. Results show a considerable computational gain without compromising the accuracy of the model in comparison with both batch methodologies and other state-of-art adaptive methods.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; ADAS; 600.084; 600.118; 600.121; 600.129			Approved	no
Call Number	Admin @ si @ DRR2019			Serial	3172



Author	Raul Gomez; Lluis Gomez; Jaume Gibert; Dimosthenis Karatzas
Title	Learning from #Barcelona Instagram data what Locals and Tourists post about its Neighbourhoods			Type	Conference Article
Year	2018	Publication	15th European Conference on Computer Vision Workshops	Abbreviated Journal
Volume	11134	Issue		Pages	530-544
Keywords
Abstract	Mass tourism is becoming a big problem for some cities, such as Barcelona, due to its concentration in some neighborhoods. In this work we gather Instagram data related to Barcelona consisting of image-caption pairs and, using the text as a supervisory signal, we learn relations between images, words and neighborhoods. Our goal is to learn which visual elements appear in photos when people post about each neighborhood. We perform a language-separate treatment of the data and show that it can be extrapolated to a separate analysis of tourists and locals, and that tourism is reflected in Social Media at a neighborhood level. The presented pipeline allows analyzing the differences between the images that tourists and locals associate with the different neighborhoods. The proposed method, which can be extended to other cities or subjects, proves that Instagram data can be used to train multi-modal (image and text) machine learning models that are useful to analyze publications about a city at a neighborhood level. We publish the collected dataset, InstaBarcelona, and the code used in the analysis.
Address	Munich; Germany; September 2018
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ECCVW
Notes	DAG; 600.129; 601.338; 600.121			Approved	no
Call Number	Admin @ si @ GGG2018b			Serial	3176



Author	Y. Patel; Lluis Gomez; Raul Gomez; Marçal Rusiñol; Dimosthenis Karatzas; C.V. Jawahar
Title	TextTopicNet - Self-Supervised Learning of Visual Features Through Embedding Images on Semantic Text Spaces			Type	Miscellaneous
Year	2018	Publication	arXiv	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	The immense success of deep learning based methods in computer vision heavily relies on large scale training datasets. These richly annotated datasets help the network learn discriminative visual features. Collecting and annotating such datasets requires a tremendous amount of human effort, and annotations are limited to a popular set of classes. As an alternative, learning visual features by designing auxiliary tasks which make use of freely available self-supervision has become increasingly popular in the computer vision community. In this paper, we put forward an idea to take advantage of multi-modal context to provide self-supervision for the training of computer vision algorithms. We show that adequate visual features can be learned efficiently by training a CNN to predict the semantic textual context in which a particular image is more probable to appear as an illustration. More specifically, we use popular text embedding techniques to provide the self-supervision for the training of deep CNNs.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.084; 601.338; 600.121			Approved	no
Call Number	Admin @ si @ PGG2018			Serial	3177



Author	Anguelos Nicolaou; Sounak Dey; V. Christlein; A. Maier; Dimosthenis Karatzas
Title	Non-deterministic Behavior of Ranking-based Metrics when Evaluating Embeddings			Type	Conference Article
Year	2018	Publication	International Workshop on Reproducible Research in Pattern Recognition	Abbreviated Journal
Volume	11455	Issue		Pages	71-82
Keywords
Abstract	Embedding data into vector spaces is a very popular strategy of pattern recognition methods. When distances between embeddings are quantized, performance metrics become ambiguous. In this paper, we present an analysis of the ambiguity quantized distances introduce and provide bounds on the effect. We demonstrate that it can have a measurable effect in empirical data in state-of-the-art systems. We also approach the phenomenon from a computer security perspective and demonstrate how someone being evaluated by a third party can exploit this ambiguity and greatly outperform a random predictor without even access to the input data. We also suggest a simple solution making the performance metrics, which rely on ranking, totally deterministic and impervious to such exploits.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.121; 600.129			Approved	no
Call Number	Admin @ si @ NDC2018			Serial	3178



Author	Dena Bazazian; Dimosthenis Karatzas; Andrew Bagdanov
Title	Word Spotting in Scene Images based on Character Recognition			Type	Conference Article
Year	2018	Publication	IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops	Abbreviated Journal
Volume		Issue		Pages	1872-1874
Keywords
Abstract	In this paper we address the problem of unconstrained Word Spotting in scene images. We train a Fully Convolutional Network to produce heatmaps of all the character classes. Then, we employ the Text Proposals approach and, via a rectangle classifier, detect the most likely rectangle for each query word based on the character attribute maps. We evaluate the proposed method on ICDAR2015 and show that it is capable of identifying and recognizing query words in natural scene images.
Address	Salt Lake City; USA; June 2018
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	CVPRW
Notes	DAG; 600.129; 600.121			Approved	no
Call Number	BKB2018a			Serial	3179



Author	Adrien Gaidon; Antonio Lopez; Florent Perronnin
Title	The Reasonable Effectiveness of Synthetic Visual Data			Type	Journal Article
Year	2018	Publication	International Journal of Computer Vision	Abbreviated Journal	IJCV
Volume	126	Issue	9	Pages	899-901
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; 600.118			Approved	no
Call Number	Admin @ si @ GLP2018			Serial	3180



Author	Zhijie Fang; Antonio Lopez
Title	Is the Pedestrian going to Cross? Answering by 2D Pose Estimation			Type	Conference Article
Year	2018	Publication	IEEE Intelligent Vehicles Symposium	Abbreviated Journal
Volume		Issue		Pages	1271-1276
Keywords
Abstract	Our recent work suggests that, thanks to today's powerful CNNs, image-based 2D pose estimation is a promising cue for determining pedestrian intentions such as crossing the road in the path of the ego-vehicle, stopping before entering the road, and starting to walk or bending towards the road. This statement is based on the results obtained on non-naturalistic sequences (Daimler dataset), i.e. in sequences choreographed specifically for performing the study. Fortunately, a new publicly available dataset (JAAD) has appeared recently to allow developing methods for detecting pedestrian intentions in naturalistic driving conditions; more specifically, for addressing the relevant question "is the pedestrian going to cross?" Accordingly, in this paper we use JAAD to assess the usefulness of 2D pose estimation for answering such a question. We combine CNN-based pedestrian detection, tracking and pose estimation to predict the crossing action from monocular images. Overall, the proposed pipeline provides new state-of-the-art results.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	IV
Notes	ADAS; 600.124; 600.116; 600.118			Approved	no
Call Number	Admin @ si @ FaL2018			Serial	3181



Author	Jiaolong Xu; Peng Wang; Heng Yang; Antonio Lopez
Title	Training a Binary Weight Object Detector by Knowledge Transfer for Autonomous Driving			Type	Conference Article
Year	2019	Publication	IEEE International Conference on Robotics and Automation	Abbreviated Journal
Volume		Issue		Pages	2379-2384
Keywords
Abstract	Autonomous driving has harsh requirements of small model size and energy efficiency, in order to enable the embedded system to achieve real-time on-board object detection. Recent deep convolutional neural network based object detectors have achieved state-of-the-art accuracy. However, such models are trained with numerous parameters, and their high computational costs and large storage prohibit the deployment to memory and computation resource limited systems. Low-precision neural networks are popular techniques for reducing the computation requirements and memory footprint. Among them, the binary weight neural network (BWN) is the extreme case, which quantizes the floating-point weights into just 1 bit. BWNs are difficult to train and suffer from accuracy degradation due to the extreme low-bit representation. To address this problem, we propose a knowledge transfer (KT) method to aid the training of BWN using a full-precision teacher network. We built DarkNet- and MobileNet-based binary weight YOLO-v2 detectors and conducted experiments on the KITTI benchmark for car, pedestrian and cyclist detection. The experimental results show that the proposed method maintains high detection accuracy while reducing the model size of DarkNet-YOLO from 257 MB to 8.8 MB and MobileNet-YOLO from 193 MB to 7.9 MB.
Address	Montreal; Canada; May 2019
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICRA
Notes	ADAS; 600.124; 600.116; 600.118			Approved	no
Call Number	Admin @ si @ XWY2018			Serial	3182



Author	Akhil Gurram; Onay Urfalioglu; Ibrahim Halfaoui; Fahd Bouzaraa; Antonio Lopez
Title	Monocular Depth Estimation by Learning from Heterogeneous Datasets			Type	Conference Article
Year	2018	Publication	IEEE Intelligent Vehicles Symposium	Abbreviated Journal
Volume		Issue		Pages	2176-2181
Keywords
Abstract	Depth estimation provides essential information to perform autonomous driving and driver assistance. In particular, Monocular Depth Estimation is interesting from a practical point of view, since using a single camera is cheaper than many other options and avoids the need for continuous calibration strategies as required by stereo-vision approaches. State-of-the-art methods for Monocular Depth Estimation are based on Convolutional Neural Networks (CNNs). A promising line of work consists of introducing additional semantic information about the traffic scene when training CNNs for depth estimation. In practice, this means that the depth data used for CNN training is complemented with images having pixel-wise semantic labels, which usually are difficult to annotate (e.g. crowded urban images). Moreover, so far it is common practice to assume that the same raw training data is associated with both types of ground truth, i.e., depth and semantic labels. The main contribution of this paper is to show that this hard constraint can be circumvented, i.e., that we can train CNNs for depth estimation by leveraging the depth and semantic information coming from heterogeneous datasets. In order to illustrate the benefits of our approach, we combine KITTI depth and Cityscapes semantic segmentation datasets, outperforming state-of-the-art results on Monocular Depth Estimation.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	IV
Notes	ADAS; 600.124; 600.116; 600.118			Approved	no
Call Number	Admin @ si @ GUH2018			Serial	3183