|   | 
Details
   web
Records
Author Vacit Oguz Yazici; Abel Gonzalez-Garcia; Arnau Ramisa; Bartlomiej Twardowski; Joost Van de Weijer
Title Orderless Recurrent Models for Multi-label Classification Type Conference Article
Year 2020 Publication 33rd IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal
Volume Issue Pages
Keywords (up)
Abstract Recurrent neural networks (RNN) are popular for many computer vision tasks, including multi-label classification. Since RNNs produce sequential outputs, labels need to be ordered for the multi-label classification task. Current approaches sort labels according to their frequency, typically ordering them in either rare-first or frequent-first. These imposed orderings do not take into account that the natural order to generate the labels can change for each image, e.g.\ first the dominant object before summing up the smaller objects in the image. Therefore, in this paper, we propose ways to dynamically order the ground truth labels with the predicted label sequence. This allows for the faster training of more optimal LSTM models for multi-label classification. Analysis evidences that our method does not suffer from duplicate generation, something which is common for other models. Furthermore, it outperforms other CNN-RNN models, and we show that a standard architecture of an image encoder and language decoder trained with our proposed loss obtains the state-of-the-art results on the challenging MS-COCO, WIDER Attribute and PA-100K and competitive results on NUS-WIDE.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPR
Notes LAMP; 600.109; 601.309; 600.141; 600.120 Approved no
Call Number Admin @ si @ YGR2020 Serial 3408
Permanent link to this record
 

 
Author Khalid El Asnaoui; Petia Radeva
Title Automatically Assess Day Similarity Using Visual Lifelogs Type Journal Article
Year 2020 Publication International Journal of Intelligent Systems Abbreviated Journal IJIS
Volume 29 Issue Pages 298–310
Keywords (up)
Abstract Today, we witness the appearance of many lifelogging cameras that are able to capture the life of a person wearing the camera and which produce a large number of images everyday. Automatically characterizing the experience and extracting patterns of behavior of individuals from this huge collection of unlabeled and unstructured egocentric data present major challenges and require novel and efficient algorithmic solutions. The main goal of this work is to propose a new method to automatically assess day similarity from the lifelogging images of a person. We propose a technique to measure the similarity between images based on the Swain’s distance and generalize it to detect the similarity between daily visual data. To this purpose, we apply the dynamic time warping (DTW) combined with the Swain’s distance for final day similarity estimation. For validation, we apply our technique on the Egocentric Dataset of University of Barcelona (EDUB) of 4912 daily images acquired by four persons with preliminary encouraging results.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB; no proj Approved no
Call Number AsR2020 Serial 3409
Permanent link to this record
 

 
Author Margarita Torre; Beatriz Remeseiro; Petia Radeva; Fernando Martinez
Title DeepNEM: Deep Network Energy-Minimization for Agricultural Field Segmentation Type Journal Article
Year 2020 Publication IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing Abbreviated Journal JSTAEOR
Volume 13 Issue Pages 726-737
Keywords (up)
Abstract One of the main characteristics of agricultural fields is that the appearance of different crops and their growth status, in an aerial image, is varied, and has a wide range of radiometric values and high level of variability. The extraction of these fields and their monitoring are activities that require a high level of human intervention. In this article, we propose a novel automatic algorithm, named deep network energy-minimization (DeepNEM), to extract agricultural fields in aerial images. The model-guided process selects the most relevant image clues extracted by a deep network, completes them and finally generates regions that represent the agricultural fields under a minimization scheme. DeepNEM has been tested over a broad range of fields in terms of size, shape, and content. Different measures were used to compare the DeepNEM with other methods, and to prove that it represents an improved approach to achieve a high-quality segmentation of agricultural fields. Furthermore, this article also presents a new public dataset composed of 1200 images with their parcels boundaries annotations.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB Approved no
Call Number Admin @ si @ TRR2020 Serial 3410
Permanent link to this record
 

 
Author Shifeng Zhang; Ajian Liu; Jun Wan; Yanyan Liang; Guogong Guo; Sergio Escalera; Hugo Jair Escalante; Stan Z. Li
Title CASIA-SURF: A Dataset and Benchmark for Large-scale Multi-modal Face Anti-spoofing Type Journal
Year 2020 Publication IEEE Transactions on Biometrics, Behavior, and Identity Science Abbreviated Journal TTBIS
Volume 2 Issue 2 Pages 182 - 193
Keywords (up)
Abstract Face anti-spoofing is essential to prevent face recognition systems from a security breach. Much of the progresses have been made by the availability of face anti-spoofing benchmark datasets in recent years. However, existing face anti-spoofing benchmarks have limited number of subjects (≤170) and modalities (≤2), which hinder the further development of the academic community. To facilitate face anti-spoofing research, we introduce a large-scale multi-modal dataset, namely CASIA-SURF, which is the largest publicly available dataset for face anti-spoofing in terms of both subjects and modalities. Specifically, it consists of 1,000 subjects with 21,000 videos and each sample has 3 modalities ( i.e. , RGB, Depth and IR). We also provide comprehensive evaluation metrics, diverse evaluation protocols, training/validation/testing subsets and a measurement tool, developing a new benchmark for face anti-spoofing. Moreover, we present a novel multi-modal multi-scale fusion method as a strong baseline, which performs feature re-weighting to select the more informative channel features while suppressing the less useful ones for each modality across different scales. Extensive experiments have been conducted on the proposed dataset to verify its significance and generalization capability. The dataset is available at https://sites.google.com/qq.com/face-anti-spoofing/welcome/challengecvpr2019?authuser=0
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no proj Approved no
Call Number Admin @ si @ ZLW2020 Serial 3412
Permanent link to this record
 

 
Author Yunan Li; Jun Wan; Qiguang Miao; Sergio Escalera; Huijuan Fang; Huizhou Chen; Xiangda Qi; Guodong Guo
Title CR-Net: A Deep Classification-Regression Network for Multimodal Apparent Personality Analysis Type Journal Article
Year 2020 Publication International Journal of Computer Vision Abbreviated Journal IJCV
Volume 128 Issue Pages 2763–2780
Keywords (up)
Abstract First impressions strongly influence social interactions, having a high impact in the personal and professional life. In this paper, we present a deep Classification-Regression Network (CR-Net) for analyzing the Big Five personality problem and further assisting on job interview recommendation in a first impressions setup. The setup is based on the ChaLearn First Impressions dataset, including multimodal data with video, audio, and text converted from the corresponding audio data, where each person is talking in front of a camera. In order to give a comprehensive prediction, we analyze the videos from both the entire scene (including the person’s motions and background) and the face of the person. Our CR-Net first performs personality trait classification and applies a regression later, which can obtain accurate predictions for both personality traits and interview recommendation. Furthermore, we present a new loss function called Bell Loss to address inaccurate predictions caused by the regression-to-the-mean problem. Extensive experiments on the First Impressions dataset show the effectiveness of our proposed network, outperforming the state-of-the-art.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no menciona Approved no
Call Number Admin @ si @ LWM2020 Serial 3413
Permanent link to this record
 

 
Author Pau Rodriguez; Jordi Gonzalez; Josep M. Gonfaus; Xavier Roca
Title Integrating Vision and Language in Social Networks for Identifying Visual Patterns of Personality Traits Type Journal
Year 2019 Publication International Journal of Social Science and Humanity Abbreviated Journal IJSSH
Volume 9 Issue 1 Pages 6-12
Keywords (up)
Abstract Social media, as a major platform for communication and information exchange, is a rich repository of the opinions and sentiments of 2.3 billion users about a vast spectrum of topics. In this sense, user text interactions are widely used to sense the whys of certain social user’s demands and cultural- driven interests. However, the knowledge embedded in the 1.8 billion pictures which are uploaded daily in public profiles has just started to be exploited. Following this trend on visual-based social analysis, we present a novel methodology based on neural networks to build a combined image-and-text based personality trait model, trained with images posted together with words found highly correlated to specific personality traits. So, the key contribution in this work is to explore whether OCEAN personality trait modeling can be addressed based on images, here called MindPics, appearing with certain tags with psychological insights. We found that there is a correlation between posted images and the personality estimated from their accompanying texts. Thus, the experimental results are consistent with previous cyber-psychology results based on texts, suggesting that images could also be used for personality estimation: classification results on some personality traits show that specific and characteristic visual patterns emerge, in essence representing abstract concepts. These results open new avenues of research for further refining the proposed personality model under the supervision of psychology experts, and to further substitute current textual personality questionnaires by image-based ones.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ISE; 600.119 Approved no
Call Number Admin @ si @ RGG2019 Serial 3414
Permanent link to this record
 

 
Author M. Ivasic-Kos; M. Pobar; Jordi Gonzalez
Title Active Player Detection in Handball Videos Using Optical Flow and STIPs Based Measures Type Conference Article
Year 2019 Publication 13th International Conference on Signal Processing and Communication Systems Abbreviated Journal
Volume Issue Pages
Keywords (up)
Abstract In handball videos recorded during the training, multiple players are present in the scene at the same time. Although they all might move and interact, not all players contribute to the currently relevant exercise nor practice the given handball techniques. The goal of this experiment is to automatically determine players on training footage that perform given handball techniques and are therefore considered active. It is a very challenging task for which a precise object detector is needed that can handle cluttered scenes with poor illumination, with many players present in different sizes and distances from the camera, partially occluded, moving fast. To determine which of the detected players are active, additional information is needed about the level of player activity. Since many handball actions are characterized by considerable changes in speed, position, and variations in the player's appearance, we propose using spatio-temporal interest points (STIPs) and optical flow (OF). Therefore, we propose an active player detection method combining the YOLO object detector and two activity measures based on STIPs and OF. The performance of the proposed method and activity measures are evaluated on a custom handball video dataset acquired during handball training lessons.
Address Gold Coast; Australia; December 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICSPCS2
Notes ISE; 600.098; 600.119 Approved no
Call Number Admin @ si @ IPG2019 Serial 3415
Permanent link to this record
 

 
Author Pau Rodriguez; Diego Velazquez; Guillem Cucurull; Josep M. Gonfaus; Xavier Roca; Jordi Gonzalez
Title Pay attention to the activations: a modular attention mechanism for fine-grained image recognition Type Journal Article
Year 2020 Publication IEEE Transactions on Multimedia Abbreviated Journal TMM
Volume 22 Issue 2 Pages 502-514
Keywords (up)
Abstract Fine-grained image recognition is central to many multimedia tasks such as search, retrieval, and captioning. Unfortunately, these tasks are still challenging since the appearance of samples of the same class can be more different than those from different classes. This issue is mainly due to changes in deformation, pose, and the presence of clutter. In the literature, attention has been one of the most successful strategies to handle the aforementioned problems. Attention has been typically implemented in neural networks by selecting the most informative regions of the image that improve classification. In contrast, in this paper, attention is not applied at the image level but to the convolutional feature activations. In essence, with our approach, the neural model learns to attend to lower-level feature activations without requiring part annotations and uses those activations to update and rectify the output likelihood distribution. The proposed mechanism is modular, architecture-independent, and efficient in terms of both parameters and computation required. Experiments demonstrate that well-known networks such as wide residual networks and ResNeXt, when augmented with our approach, systematically improve their classification accuracy and become more robust to changes in deformation and pose and to the presence of clutter. As a result, our proposal reaches state-of-the-art classification accuracies in CIFAR-10, the Adience gender recognition task, Stanford Dogs, and UEC-Food100 while obtaining competitive performance in ImageNet, CIFAR-100, CUB200 Birds, and Stanford Cars. In addition, we analyze the different components of our model, showing that the proposed attention modules succeed in finding the most discriminative regions of the image. Finally, as a proof of concept, we demonstrate that with only local predictions, an augmented neural network can successfully classify an image before reaching any fully connected layer, thus reducing the computational amount up to 10%.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ISE; 600.119; 600.098 Approved no
Call Number Admin @ si @ RVC2020a Serial 3417
Permanent link to this record
 

 
Author Xialei Liu; Chenshen Wu; Mikel Menta; Luis Herranz; Bogdan Raducanu; Andrew Bagdanov; Shangling Jui; Joost Van de Weijer
Title Generative Feature Replay for Class-Incremental Learning Type Conference Article
Year 2020 Publication CLVISION – Workshop on Continual Learning in Computer Vision Abbreviated Journal
Volume Issue Pages
Keywords (up)
Abstract Humans are capable of learning new tasks without forgetting previous ones, while neural networks fail due to catastrophic forgetting between new and previously-learned tasks. We consider a class-incremental setting which means that the task-ID is unknown at inference time. The imbalance between old and new classes typically results in a bias of the network towards the newest ones. This imbalance problem can either be addressed by storing exemplars from previous tasks, or by using image replay methods. However, the latter can only be applied to toy datasets since image generation for complex datasets is a hard problem.
We propose a solution to the imbalance problem based on generative feature replay which does not require any exemplars. To do this, we split the network into two parts: a feature extractor and a classifier. To prevent forgetting, we combine generative feature replay in the classifier with feature distillation in the feature extractor. Through feature generation, our method reduces the complexity of generative replay and prevents the imbalance problem. Our approach is computationally efficient and scalable to large datasets. Experiments confirm that our approach achieves state-of-the-art results on CIFAR-100 and ImageNet, while requiring only a fraction of the storage needed for exemplar-based continual learning
Address Virtual CVPR
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPRW
Notes LAMP; 601.309; 602.200; 600.141; 600.120 Approved no
Call Number Admin @ si @ LWM2020 Serial 3419
Permanent link to this record
 

 
Author Raul Gomez; Jaume Gibert; Lluis Gomez; Dimosthenis Karatzas
Title Location Sensitive Image Retrieval and Tagging Type Conference Article
Year 2020 Publication 16th European Conference on Computer Vision Abbreviated Journal
Volume Issue Pages
Keywords (up)
Abstract People from different parts of the globe describe objects and concepts in distinct manners. Visual appearance can thus vary across different geographic locations, which makes location a relevant contextual information when analysing visual data. In this work, we address the task of image retrieval related to a given tag conditioned on a certain location on Earth. We present LocSens, a model that learns to rank triplets of images, tags and coordinates by plausibility, and two training strategies to balance the location influence in the final ranking. LocSens learns to fuse textual and location information of multimodal queries to retrieve related images at different levels of location granularity, and successfully utilizes location information to improve image tagging.
Address Virtual; August 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCV
Notes DAG; 600.121; 600.129 Approved no
Call Number Admin @ si @ GGG2020b Serial 3420
Permanent link to this record
 

 
Author Yaxing Wang; Abel Gonzalez-Garcia; David Berga; Luis Herranz; Fahad Shahbaz Khan; Joost Van de Weijer
Title MineGAN: effective knowledge transfer from GANs to target domains with few images Type Conference Article
Year 2020 Publication 33rd IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal
Volume Issue Pages
Keywords (up)
Abstract One of the attractive characteristics of deep neural networks is their ability to transfer knowledge obtained in one domain to other related domains. As a result, high-quality networks can be trained in domains with relatively little training data. This property has been extensively studied for discriminative networks but has received significantly less attention for generative models. Given the often enormous effort required to train GANs, both computationally as well as in the dataset collection, the re-use of pretrained GANs is a desirable objective. We propose a novel knowledge transfer method for generative models based on mining the knowledge that is most beneficial to a specific target domain, either from a single or multiple pretrained GANs. This is done using a miner network that identifies which part of the generative distribution of each pretrained GAN outputs samples closest to the target domain. Mining effectively steers GAN sampling towards suitable regions of the latent space, which facilitates the posterior finetuning and avoids pathologies of other methods such as mode collapse and lack of flexibility. We perform experiments on several complex datasets using various GAN architectures (BigGAN, Progressive GAN) and show that the proposed method, called MineGAN, effectively transfers knowledge to domains with few target images, outperforming existing methods. In addition, MineGAN can successfully transfer knowledge from multiple pretrained GANs.
Address Virtual CVPR
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPR
Notes LAMP; 600.109; 600.141; 600.120 Approved no
Call Number Admin @ si @ WGB2020 Serial 3421
Permanent link to this record
 

 
Author Lu Yu; Bartlomiej Twardowski; Xialei Liu; Luis Herranz; Kai Wang; Yongmai Cheng; Shangling Jui; Joost Van de Weijer
Title Semantic Drift Compensation for Class-Incremental Learning of Embeddings Type Conference Article
Year 2020 Publication 33rd IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal
Volume Issue Pages
Keywords (up)
Abstract Class-incremental learning of deep networks sequentially increases the number of classes to be classified. During training, the network has only access to data of one task at a time, where each task contains several classes. In this setting, networks suffer from catastrophic forgetting which refers to the drastic drop in performance on previous tasks. The vast majority of methods have studied this scenario for classification networks, where for each new task the classification layer of the network must be augmented with additional weights to make room for the newly added classes. Embedding networks have the advantage that new classes can be naturally included into the network without adding new weights. Therefore, we study incremental learning for embedding networks. In addition, we propose a new method to estimate the drift, called semantic drift, of features and compensate for it without the need of any exemplars. We approximate the drift of previous tasks based on the drift that is experienced by current task data. We perform experiments on fine-grained datasets, CIFAR100 and ImageNet-Subset. We demonstrate that embedding networks suffer significantly less from catastrophic forgetting. We outperform existing methods which do not require exemplars and obtain competitive results compared to methods which store exemplars. Furthermore, we show that our proposed SDC when combined with existing methods to prevent forgetting consistently improves results.
Address Virtual CVPR
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPR
Notes LAMP; 600.141; 601.309; 602.200; 600.120 Approved no
Call Number Admin @ si @ YTL2020 Serial 3422
Permanent link to this record
 

 
Author Xiangyang Li; Luis Herranz; Shuqiang Jiang
Title Multifaceted Analysis of Fine-Tuning in Deep Model for Visual Recognition Type Journal
Year 2020 Publication ACM Transactions on Data Science Abbreviated Journal ACM
Volume Issue Pages
Keywords (up)
Abstract In recent years, convolutional neural networks (CNNs) have achieved impressive performance for various visual recognition scenarios. CNNs trained on large labeled datasets can not only obtain significant performance on most challenging benchmarks but also provide powerful representations, which can be used to a wide range of other tasks. However, the requirement of massive amounts of data to train deep neural networks is a major drawback of these models, as the data available is usually limited or imbalanced. Fine-tuning (FT) is an effective way to transfer knowledge learned in a source dataset to a target task. In this paper, we introduce and systematically investigate several factors that influence the performance of fine-tuning for visual recognition. These factors include parameters for the retraining procedure (e.g., the initial learning rate of fine-tuning), the distribution of the source and target data (e.g., the number of categories in the source dataset, the distance between the source and target datasets) and so on. We quantitatively and qualitatively analyze these factors, evaluate their influence, and present many empirical observations. The results reveal insights into what fine-tuning changes CNN parameters and provide useful and evidence-backed intuitions about how to implement fine-tuning for computer vision tasks.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.141; 600.120 Approved no
Call Number Admin @ si @ LHJ2020 Serial 3423
Permanent link to this record
 

 
Author Yaxing Wang; Luis Herranz; Joost Van de Weijer
Title Mix and match networks: multi-domain alignment for unpaired image-to-image translation Type Journal Article
Year 2020 Publication International Journal of Computer Vision Abbreviated Journal IJCV
Volume 128 Issue Pages 2849–2872
Keywords (up)
Abstract This paper addresses the problem of inferring unseen cross-modal image-to-image translations between multiple modalities. We assume that only some of the pairwise translations have been seen (i.e. trained) and infer the remaining unseen translations (where training pairs are not available). We propose mix and match networks, an approach where multiple encoders and decoders are aligned in such a way that the desired translation can be obtained by simply cascading the source encoder and the target decoder, even when they have not interacted during the training stage (i.e. unseen). The main challenge lies in the alignment of the latent representations at the bottlenecks of encoder-decoder pairs. We propose an architecture with several tools to encourage alignment, including autoencoders and robust side information and latent consistency losses. We show the benefits of our approach in terms of effectiveness and scalability compared with other pairwise image-to-image translation approaches. We also propose zero-pair cross-modal image translation, a challenging setting where the objective is inferring semantic segmentation from depth (and vice-versa) without explicit segmentation-depth pairs, and only from two (disjoint) segmentation-RGB and depth-RGB training sets. We observe that a certain part of the shared information between unseen modalities might not be reachable, so we further propose a variant that leverages pseudo-pairs which allows us to exploit this shared information between the unseen modalities
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.109; 600.106; 600.141; 600.120 Approved no
Call Number Admin @ si @ WHW2020 Serial 3424
Permanent link to this record
 

 
Author Lei Kang; Pau Riba; Marçal Rusiñol; Alicia Fornes; Mauricio Villegas
Title Distilling Content from Style for Handwritten Word Recognition Type Conference Article
Year 2020 Publication 17th International Conference on Frontiers in Handwriting Recognition Abbreviated Journal
Volume Issue Pages
Keywords (up)
Abstract Despite the latest transcription accuracies reached using deep neural network architectures, handwritten text recognition still remains a challenging problem, mainly because of the large inter-writer style variability. Both augmenting the training set with artificial samples using synthetic fonts, and writer adaptation techniques have been proposed to yield more generic approaches aimed at dodging style unevenness. In this work, we take a step closer to learn style independent features from handwritten word images. We propose a novel method that is able to disentangle the content and style aspects of input images by jointly optimizing a generative process and a handwritten
word recognizer. The generator is aimed at transferring writing style features from one sample to another in an image-to-image translation approach, thus leading to a learned content-centric features that shall be independent to writing style attributes.
Our proposed recognition model is able then to leverage such writer-agnostic features to reach better recognition performances. We advance over prior training strategies and demonstrate with qualitative and quantitative evaluations the performance of both
the generative process and the recognition efficiency in the IAM dataset.
Address Virtual ICFHR; September 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICFHR
Notes DAG; 600.129; 600.140; 600.121 Approved no
Call Number Admin @ si @ KRR2020 Serial 3425
Permanent link to this record