Records | |||||
---|---|---|---|---|---|
Author | Ayan Banerjee; Palaiahnakote Shivakumara; Parikshit Acharya; Umapada Pal; Josep Llados | ||||
Title | TWD: A New Deep E2E Model for Text Watermark Detection in Video Images | Type | Conference Article | ||
Year | 2022 | Publication | 26th International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | Deep learning; U-Net; FCENet; Scene text detection; Video text detection; Watermark text detection | ||||
Abstract | Text watermark detection in video images is challenging because text watermark characteristics are different from caption and scene texts in the video images. Developing a successful model for detecting text watermark, caption, and scene texts is an open challenge. This study aims at developing a new Deep End-to-End model for Text Watermark Detection (TWD), caption and scene text in video images. To standardize non-uniform contrast, quality, and resolution, we explore the U-Net3+ model for enhancing poor quality text without affecting high-quality text. Similarly, to address the challenges of arbitrary orientation, text shapes and complex background, we explore Stacked Hourglass Encoded Fourier Contour Embedding Network (SFCENet) by feeding the output of the U-Net3+ model as input. Furthermore, the proposed work integrates enhancement and detection models as an end-to-end model for detecting multi-type text in video images. To validate the proposed model, we create our own dataset (named TW-866), which provides video images containing text watermark, caption (subtitles), as well as scene text. The proposed model is also evaluated on standard natural scene text detection datasets, namely, ICDAR 2019 MLT, CTW1500, Total-Text, and DAST1500. The results show that the proposed method outperforms the existing methods. To the best of our knowledge, this is the first work on text watermark detection in video images. | ||||
Address | Montreal; Quebec; Canada; August 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICPR | ||
Notes | DAG; | Approved | no | ||
Call Number | Admin @ si @ BSA2022 | Serial | 3788 | ||
Permanent link to this record | |||||
Author | David Vazquez; Jorge Bernal; F. Javier Sanchez; Gloria Fernandez Esparrach; Antonio Lopez; Adriana Romero; Michal Drozdzal; Aaron Courville | ||||
Title | A Benchmark for Endoluminal Scene Segmentation of Colonoscopy Images | Type | Conference Article | ||
Year | 2017 | Publication | 31st International Congress and Exhibition on Computer Assisted Radiology and Surgery | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | Deep Learning; Medical Imaging | ||||
Abstract | Colorectal cancer (CRC) is the third leading cause of cancer death worldwide. Currently, the standard approach to reduce CRC-related mortality is to perform regular screening in search of polyps, and colonoscopy is the screening tool of choice. The main limitations of this screening procedure are polyp miss-rate and inability to perform visual assessment of polyp malignancy. These drawbacks can be reduced by designing Decision Support Systems (DSS) aiming to help clinicians in the different stages of the procedure by providing endoluminal scene segmentation. Thus, in this paper, we introduce an extended benchmark of colonoscopy images, with the hope of establishing a new strong benchmark for colonoscopy image analysis research. We provide new baselines on this dataset by training standard fully convolutional networks (FCN) for semantic segmentation and significantly outperforming, without any further post-processing, prior results in endoluminal scene segmentation. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CARS | ||
Notes | ADAS; MV; 600.075; 600.085; 600.076; 601.281; 600.118 | Approved | no | ||
Call Number | ADAS @ adas @ VBS2017a | Serial | 2880 | ||
Permanent link to this record | |||||
Author | Parichehr Behjati Ardakani; Pau Rodriguez; Carles Fernandez; Armin Mehri; Xavier Roca; Seiichi Ozawa; Jordi Gonzalez | ||||
Title | Frequency-based Enhancement Network for Efficient Super-Resolution | Type | Journal Article | ||
Year | 2022 | Publication | IEEE Access | Abbreviated Journal | ACCESS |
Volume | 10 | Issue | Pages | 57383-57397 | |
Keywords | Deep learning; Frequency-based methods; Lightweight architectures; Single image super-resolution | ||||
Abstract | Recently, deep convolutional neural networks (CNNs) have provided outstanding performance in single image super-resolution (SISR). Despite their remarkable performance, the lack of high-frequency information in the recovered images remains a core problem. Moreover, as the networks increase in depth and width, deep CNN-based SR methods are faced with the challenge of computational complexity in practice. A promising and under-explored solution is to adapt the amount of compute based on the different frequency bands of the input. To this end, we present a novel Frequency-based Enhancement Block (FEB) which explicitly enhances the information of high frequencies while forwarding low-frequencies to the output. In particular, this block efficiently decomposes features into low- and high-frequency and assigns more computation to high-frequency ones. Thus, it can help the network generate more discriminative representations by explicitly recovering finer details. Our FEB design is simple and generic and can be used as a direct replacement of commonly used SR blocks with no need to change network architectures. We experimentally show that when replacing SR blocks with FEB we consistently improve the reconstruction error, while reducing the number of parameters in the model. Moreover, we propose a lightweight SR model — Frequency-based Enhancement Network (FENet) — based on FEB that matches the performance of larger models. Extensive experiments demonstrate that our proposal performs favorably against the state-of-the-art SR algorithms in terms of visual quality, memory footprint, and inference time. The code is available at https://github.com/pbehjatii/FENet | ||||
Address | 18 May 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | IEEE | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE | Approved | no | ||
Call Number | Admin @ si @ BRF2022a | Serial | 3747 | ||
Permanent link to this record | |||||
Author | Pau Rodriguez; Josep M. Gonfaus; Guillem Cucurull; Xavier Roca; Jordi Gonzalez | ||||
Title | Attend and Rectify: A Gated Attention Mechanism for Fine-Grained Recovery | Type | Conference Article | ||
Year | 2018 | Publication | 15th European Conference on Computer Vision | Abbreviated Journal | |
Volume | 11212 | Issue | Pages | 357-372 | |
Keywords | Deep Learning; Convolutional Neural Networks; Attention | ||||
Abstract | We propose a novel attention mechanism to enhance Convolutional Neural Networks for fine-grained recognition. It learns to attend to lower-level feature activations without requiring part annotations and uses these activations to update and rectify the output likelihood distribution. In contrast to other approaches, the proposed mechanism is modular, architecture-independent and efficient both in terms of parameters and computation required. Experiments show that networks augmented with our approach systematically improve their classification accuracy and become more robust to clutter. As a result, Wide Residual Networks augmented with our proposal surpass the state-of-the-art classification accuracies on CIFAR-10, the Adience gender recognition task, Stanford dogs, and UEC Food-100. | ||||
Address | Munich; September 2018 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ECCV | ||
Notes | ISE; 600.098; 602.121; 600.119 | Approved | no | ||
Call Number | Admin @ si @ RGC2018 | Serial | 3139 | ||
Permanent link to this record | |||||
Author | Vacit Oguz Yazici; Joost Van de Weijer; Arnau Ramisa | ||||
Title | Color Naming for Multi-Color Fashion Items | Type | Conference Article | ||
Year | 2018 | Publication | 6th World Conference on Information Systems and Technologies | Abbreviated Journal | |
Volume | 747 | Issue | Pages | 64-73 | |
Keywords | Deep learning; Color; Multi-label | ||||
Abstract | There exists a significant amount of research on color naming of single-colored objects. However, in reality, many fashion objects consist of multiple colors. Currently, searching in fashion datasets for multi-colored objects can be a laborious task. Therefore, in this paper we focus on color naming for images with multi-color fashion items. We collect a dataset consisting of images that may have from one up to four colors. We annotate the images with the 11 basic colors of the English language. We experiment with several designs for deep neural networks with different losses. We show that explicitly estimating the number of colors in the fashion item leads to improved results. | ||||
Address | Naples; March 2018 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WORLDCIST | ||
Notes | LAMP; 600.109; 601.309; 600.120 | Approved | no | ||
Call Number | Admin @ si @ YWR2018 | Serial | 3161 | ||
Permanent link to this record | |||||
Author | Jorge Charco; Boris X. Vintimilla; Angel Sappa | ||||
Title | Deep learning based camera pose estimation in multi-view environment | Type | Conference Article | ||
Year | 2018 | Publication | 14th IEEE International Conference on Signal Image Technology & Internet Based System | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | Deep learning; Camera pose estimation; Multiview environment; Siamese architecture | ||||
Abstract | This paper proposes to use a deep learning network architecture for relative camera pose estimation in a multi-view environment. The proposed network is a variant of the AlexNet architecture, used as a regressor to predict the relative translation and rotation as output. The proposed approach is trained from scratch on a large data set and takes as input a pair of images from the same scene. This new architecture is compared with a previous approach using standard metrics, obtaining better results on the relative camera pose. | ||||
Address | Las Palmas de Gran Canaria; November 2018 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | SITIS | ||
Notes | MSIAU; 600.086; 600.130; 600.122 | Approved | no | ||
Call Number | Admin @ si @ CVS2018 | Serial | 3194 | ||
Permanent link to this record | |||||
Author | Meysam Madadi; Hugo Bertiche; Sergio Escalera | ||||
Title | SMPLR: Deep learning based SMPL reverse for 3D human pose and shape recovery | Type | Journal Article | ||
Year | 2020 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 106 | Issue | Pages | 107472 | |
Keywords | Deep learning; 3D Human pose; Body shape; SMPL; Denoising autoencoder; Volumetric stack hourglass | ||||
Abstract | In this paper we propose to embed SMPL within a deep-based model to accurately estimate 3D pose and shape from a still RGB image. We use CNN-based 3D joint predictions as an intermediate representation to regress SMPL pose and shape parameters. Later, 3D joints are reconstructed again in the SMPL output. This module can be seen as an autoencoder where the encoder is a deep neural network and the decoder is the SMPL model. We refer to this as SMPL reverse (SMPLR). By implementing SMPLR as an encoder-decoder we avoid the need for complex constraints on pose and shape. Furthermore, given that in-the-wild datasets usually lack accurate 3D annotations, it is desirable to lift 2D joints to 3D without pairing 3D annotations with RGB images. Therefore, we also propose a denoising autoencoder (DAE) module between CNN and SMPLR, able to lift 2D joints to 3D and partially recover from structured error. We evaluate our method on SURREAL and Human3.6M datasets, showing improvement over SMPL-based state-of-the-art alternatives by about 4 and 12 mm, respectively. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ MBE2020 | Serial | 3439 | ||
Permanent link to this record | |||||
Author | Katerine Diaz; Jesus Martinez del Rincon; Aura Hernandez-Sabate | ||||
Title | Decremental generalized discriminative common vectors applied to images classification | Type | Journal Article | ||
Year | 2017 | Publication | Knowledge-Based Systems | Abbreviated Journal | KBS |
Volume | 131 | Issue | Pages | 46-57 | |
Keywords | Decremental learning; Generalized Discriminative Common Vectors; Feature extraction; Linear subspace methods; Classification | ||||
Abstract | In this paper, a novel decremental subspace-based learning method called Decremental Generalized Discriminative Common Vectors method (DGDCV) is presented. The method makes use of the concept of decremental learning, which we introduce in the field of supervised feature extraction and classification. By efficiently removing unnecessary data and/or classes from a knowledge base, our methodology is able to update the model without recalculating the full projection or accessing the previously processed training data, while retaining the previously acquired knowledge. The proposed method has been validated on 6 standard face recognition datasets, showing a considerable computational gain without compromising the accuracy of the model. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS; 600.118; 600.121 | Approved | no | ||
Call Number | Admin @ si @ DMH2017a | Serial | 3003 | ||
Permanent link to this record | |||||
Author | Joost Van de Weijer; Shida Beigpour | ||||
Title | The Dichromatic Reflection Model: Future Research Directions and Applications | Type | Conference Article | ||
Year | 2011 | Publication | International Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | dblp | ||||
Abstract | The dichromatic reflection model (DRM) predicts that color distributions form a parallelogram in color space, whose shape is defined by the body reflectance and the illuminant color. In this paper we summarize the assumptions which led to the DRM and briefly recall two of its main application domains: color image segmentation and photometric invariant feature computation. After having introduced the model we discuss several limitations of the theory, especially those which arise when working on real-world uncalibrated images. In addition, we summarize recent extensions of the model which allow handling more complicated light interactions. Finally, we suggest some future research directions which would further extend its applicability. | ||||
Address | Algarve, Portugal | ||||
Corporate Author | Thesis | ||||
Publisher | SciTePress | Place of Publication | Editor | Mestetskiy, Leonid and Braz, José | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-989-8425-47-8 | Medium | ||
Area | Expedition | Conference | VISIGRAPP | ||
Notes | CIC | Approved | no | ||
Call Number | Admin @ si @ WeB2011 | Serial | 1778 | ||
Permanent link to this record | |||||
Author | Adria Molina; Lluis Gomez; Oriol Ramos Terrades; Josep Llados | ||||
Title | A Generic Image Retrieval Method for Date Estimation of Historical Document Collections | Type | Conference Article | ||
Year | 2022 | Publication | Document Analysis Systems. 15th IAPR International Workshop (DAS 2022) | Abbreviated Journal | |
Volume | 13237 | Issue | Pages | 583–597 | |
Keywords | Date estimation; Document retrieval; Image retrieval; Ranking loss; Smooth-nDCG | ||||
Abstract | Date estimation of historical document images is a challenging problem, with several contributions in the literature that lack the ability to generalize from one dataset to others. This paper presents a robust date estimation system based on a retrieval approach that generalizes well across heterogeneous collections. We use a ranking loss function named smooth-nDCG to train a Convolutional Neural Network that learns an ordination of documents for each problem. One of the main uses of the presented approach is as a tool for historical contextual retrieval. This means that scholars could perform comparative analysis of historical images from big datasets in terms of the period in which they were produced. We provide experimental evaluation on different types of documents from real datasets of manuscript and newspaper images. | ||||
Address | La Rochelle, France; May 22–25, 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | DAS | ||
Notes | DAG; 600.140; 600.121 | Approved | no | ||
Call Number | Admin @ si @ MGR2022 | Serial | 3694 | ||
Permanent link to this record | |||||
Author | Debora Gil; Antonio Esteban Lansaque; Sebastian Stefaniga; Mihail Gaianu; Carles Sanchez | ||||
Title | Data Augmentation from Sketch | Type | Conference Article | ||
Year | 2019 | Publication | International Workshop on Uncertainty for Safe Utilization of Machine Learning in Medical Imaging | Abbreviated Journal | |
Volume | 11840 | Issue | Pages | 155-162 | |
Keywords | Data augmentation; cycleGANs; Multi-objective optimization | ||||
Abstract | State-of-the-art machine learning methods need huge amounts of data with unambiguous annotations for their training. In the context of medical imaging this is, in general, a very difficult task due to limited access to clinical data, the time required for manual annotations and variability across experts. Simulated data could serve for data augmentation provided that its appearance is comparable to the actual appearance of intra-operative acquisitions. Generative Adversarial Networks (GANs) are a powerful tool for artistic style transfer, but lack a criterion for selecting epochs that also ensure preservation of intra-operative content. We propose a multi-objective optimization strategy for a selection of cycleGAN epochs ensuring a mapping between virtual images and the intra-operative domain preserving anatomical content. Our approach has been applied to simulate intra-operative bronchoscopic videos and chest CT scans from virtual sketches generated using simple graphical primitives. | ||||
Address | Shenzhen; China; October 2019 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CLIP | ||
Notes | IAM; 600.145; 601.337; 600.139; 600.145 | Approved | no | ||
Call Number | Admin @ si @ GES2019 | Serial | 3359 | ||
Permanent link to this record | |||||
Author | Daniel Hernandez; Antonio Espinosa; David Vazquez; Antonio Lopez; Juan C. Moure | ||||
Title | 3D Perception With Slanted Stixels on GPU | Type | Journal Article | ||
Year | 2021 | Publication | IEEE Transactions on Parallel and Distributed Systems | Abbreviated Journal | TPDS |
Volume | 32 | Issue | 10 | Pages | 2434-2447 |
Keywords | Daniel Hernandez-Juarez; Antonio Espinosa; David Vazquez; Antonio M. Lopez; Juan C. Moure | ||||
Abstract | This article presents a GPU-accelerated software design of the recently proposed model of Slanted Stixels, which represents the geometric and semantic information of a scene in a compact and accurate way. We reformulate the measurement depth model to reduce the computational complexity of the algorithm, relying on the confidence of the depth estimation and the identification of invalid values to handle outliers. The proposed massively parallel scheme and data layout for the irregular computation pattern that corresponds to a Dynamic Programming paradigm is described and carefully analyzed in performance terms. Performance is shown to scale gracefully on current generation embedded GPUs. We assess the proposed methods in terms of semantic and geometric accuracy as well as run-time performance on three publicly available benchmark datasets. Our approach achieves real-time performance with high accuracy for 2048 × 1024 image sizes and 4 × 4 Stixel resolution on the low-power embedded GPU of an NVIDIA Tegra Xavier. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS; 600.124; 600.118 | Approved | no | ||
Call Number | Admin @ si @ HEV2021 | Serial | 3561 | ||
Permanent link to this record | |||||
Author | Jorge Bernal; David Vazquez (eds) | ||||
Title | Computer vision Trends and Challenges | Type | Book Whole | ||
Year | 2013 | Publication | Computer vision Trends and Challenges | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | CVCRD; Computer Vision | ||||
Abstract | This book contains the papers presented at the Eighth CVC Workshop on Computer Vision Trends and Challenges (CVCR&D'2013). The workshop was held at the Computer Vision Center (Universitat Autònoma de Barcelona) on October 25th, 2013. The CVC workshops provide an excellent opportunity for young researchers and project engineers to share new ideas and knowledge about the progress of their work, and also to discuss challenges and future perspectives. In addition, the workshop is the welcome event for new people who have recently joined the institute. The program of CVCR&D is organized in a single-track single-day workshop. It comprises several sessions dedicated to specific topics. For each session, a doctor working on the topic introduces the general research lines. The PhD students present their specific research. A poster session will be held for open questions. Session topics cover the current research lines and development projects of the CVC: Medical Imaging, Color & Texture Analysis, Object Recognition, Image Sequence Evaluation, Advanced Driver Assistance Systems, Machine Vision, Document Analysis, Pattern Recognition and Applications. We want to thank all paper authors and Program Committee members. Their contribution shows that the CVC has a dynamic, active, and promising scientific community. We hope you all enjoy this Eighth workshop and we are looking forward to meeting you and new people next year in the Ninth CVCR&D. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | Jorge Bernal; David Vazquez | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-940902-2-6 | Medium | ||
Area | Expedition | Conference | |||
Notes | Approved | no | |||
Call Number | ADAS @ adas @ BeV2013 | Serial | 2339 | ||
Permanent link to this record | |||||
Author | Daniel Hernandez; Alejandro Chacon; Antonio Espinosa; David Vazquez; Juan Carlos Moure; Antonio Lopez | ||||
Title | Stereo Matching using SGM on the GPU | Type | Report | ||
Year | 2016 | Publication | Programming and Tuning Massively Parallel Systems | Abbreviated Journal | PUMPS |
Volume | Issue | Pages | |||
Keywords | CUDA; Stereo; Autonomous Vehicle | ||||
Abstract | Dense, robust and real-time computation of depth information from stereo-camera systems is a computationally demanding requirement for robotics, advanced driver assistance systems (ADAS) and autonomous vehicles. Semi-Global Matching (SGM) is a widely used algorithm that propagates consistency constraints along several paths across the image. This work presents a real-time system producing reliable disparity estimation results on the new embedded energy efficient GPU devices. Our design runs on a Tegra X1 at 42 frames per second (fps) for an image size of 640x480, 128 disparity levels, and using 4 path directions for the SGM method. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | PUMPS | ||
Notes | ADAS; 600.085; 600.087; 600.076 | Approved | no | ||
Call Number | ADAS @ adas @ HCE2016b | Serial | 2776 | ||
Permanent link to this record | |||||
Author | Jialuo Chen; Pau Riba; Alicia Fornes; Juan Mas; Josep Llados; Joana Maria Pujadas-Mora | ||||
Title | Word-Hunter: A Gamesourcing Experience to Validate the Transcription of Historical Manuscripts | Type | Conference Article | ||
Year | 2018 | Publication | 16th International Conference on Frontiers in Handwriting Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 528-533 | ||
Keywords | Crowdsourcing; Gamification; Handwritten documents; Performance evaluation | ||||
Abstract | Nowadays, there are still many handwritten historical documents in archives waiting to be transcribed and indexed. Since manual transcription is tedious and time consuming, automatic transcription seems the path to follow. However, the performance of current handwriting recognition techniques is not perfect, so a manual validation is mandatory. Crowdsourcing is a good strategy for manual validation; however, it is a tedious task. In this paper we analyze experiences based on gamification in order to propose and design a gamesourcing framework that increases the interest of users. Then, we describe and analyze our experience when validating the automatic transcription using the gamesourcing application. Moreover, thanks to the combination of clustering and handwriting recognition techniques, we can speed up the validation while maintaining the performance. | ||||
Address | Niagara Falls, USA; August 2018 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICFHR | ||
Notes | DAG; 600.097; 603.057; 600.121 | Approved | no | ||
Call Number | Admin @ si @ CRF2018 | Serial | 3169 | ||
Permanent link to this record |