|   | 
Details
   web
Records
Author Gabriel Villalonga; Antonio Lopez
Title Co-Training for On-Board Deep Object Detection Type Journal Article
Year 2020 Publication IEEE Access Abbreviated Journal ACCESS
Volume Issue Pages (down) 194441 - 194456
Keywords
Abstract Providing ground truth supervision to train visual models has been a bottleneck over the years, exacerbated by domain shifts which degenerate the performance of such models. This was the case when visual tasks relied on handcrafted features and shallow machine learning and, despite its unprecedented performance gains, the problem remains open within the deep learning paradigm due to its data-hungry nature. Best performing deep vision-based object detectors are trained in a supervised manner by relying on human-labeled bounding boxes which localize class instances (i.e. objects) within the training images. Thus, object detection is one of such tasks for which human labeling is a major bottleneck. In this article, we assess co-training as a semi-supervised learning method for self-labeling objects in unlabeled images, so reducing the human-labeling effort for developing deep object detectors. Our study pays special attention to a scenario involving domain shift; in particular, when we have automatically generated virtual-world images with object bounding boxes and we have real-world images which are unlabeled. Moreover, we are particularly interested in using co-training for deep object detection in the context of driver assistance systems and/or self-driving vehicles. Thus, using well-established datasets and protocols for object detection in these application contexts, we will show how co-training is a paradigm worth to pursue for alleviating object labeling, working both alone and together with task-agnostic domain adaptation.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.118 Approved no
Call Number Admin @ si @ ViL2020 Serial 3488
Permanent link to this record
 

 
Author Debora Gil; Antonio Esteban Lansaque; Agnes Borras; Esmitt Ramirez; Carles Sanchez
Title Intraoperative Extraction of Airways Anatomy in VideoBronchoscopy Type Journal Article
Year 2020 Publication IEEE Access Abbreviated Journal ACCESS
Volume 8 Issue Pages (down) 159696 - 159704
Keywords
Abstract A main bottleneck in bronchoscopic biopsy sampling is to efficiently reach the lesion navigating across bronchial levels. Any guidance system should be able to localize the scope position during the intervention with minimal costs and alteration of clinical protocols. With the final goal of an affordable image-based guidance, this work presents a novel strategy to extract and codify the anatomical structure of bronchi, as well as, the scope navigation path from videobronchoscopy. Experiments using interventional data show that our method accurately identifies the bronchial structure. Meanwhile, experiments using simulated data verify that the extracted navigation path matches the 3D route.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes IAM; 600.139; 600.145 Approved no
Call Number Admin @ si @ GEB2020 Serial 3467
Permanent link to this record
 

 
Author Razieh Rastgoo; Kourosh Kiani; Sergio Escalera
Title Hand sign language recognition using multi-view hand skeleton Type Journal Article
Year 2020 Publication Expert Systems With Applications Abbreviated Journal ESWA
Volume 150 Issue Pages (down) 113336
Keywords Multi-view hand skeleton; Hand sign language recognition; 3DCNN; Hand pose estimation; RGB video; Hand action recognition
Abstract Hand sign language recognition from video is a challenging research area in computer vision, which performance is affected by hand occlusion, fast hand movement, illumination changes, or background complexity, just to mention a few. In recent years, deep learning approaches have achieved state-of-the-art results in the field, though previous challenges are not completely solved. In this work, we propose a novel deep learning-based pipeline architecture for efficient automatic hand sign language recognition using Single Shot Detector (SSD), 2D Convolutional Neural Network (2DCNN), 3D Convolutional Neural Network (3DCNN), and Long Short-Term Memory (LSTM) from RGB input videos. We use a CNN-based model which estimates the 3D hand keypoints from 2D input frames. After that, we connect these estimated keypoints to build the hand skeleton by using midpoint algorithm. In order to obtain a more discriminative representation of hands, we project 3D hand skeleton into three views surface images. We further employ the heatmap image of detected keypoints as input for refinement in a stacked fashion. We apply 3DCNNs on the stacked features of hand, including pixel level, multi-view hand skeleton, and heatmap features, to extract discriminant local spatio-temporal features from these stacked inputs. The outputs of the 3DCNNs are fused and fed to a LSTM to model long-term dynamics of hand sign gestures. Analyzing 2DCNN vs. 3DCNN using different number of stacked inputs into the network, we demonstrate that 3DCNN better capture spatio-temporal dynamics of hands. To the best of our knowledge, this is the first time that this multi-modal and multi-view set of hand skeleton features are applied for hand sign language recognition. Furthermore, we present a new large-scale hand sign language dataset, namely RKS-PERSIANSIGN, including 10′000 RGB videos of 100 Persian sign words. Evaluation results of the proposed model on three datasets, NYU, First-Person, and RKS-PERSIANSIGN, indicate that our model outperforms state-of-the-art models in hand sign language recognition, hand pose estimation, and hand action recognition.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no proj Approved no
Call Number Admin @ si @ RKE2020a Serial 3411
Permanent link to this record
 

 
Author Meysam Madadi; Hugo Bertiche; Sergio Escalera
Title SMPLR: Deep learning based SMPL reverse for 3D human pose and shape recovery Type Journal Article
Year 2020 Publication Pattern Recognition Abbreviated Journal PR
Volume 106 Issue Pages (down) 107472
Keywords Deep learning; 3D Human pose; Body shape; SMPL; Denoising autoencoder; Volumetric stack hourglass
Abstract In this paper we propose to embed SMPL within a deep-based model to accurately estimate 3D pose and shape from a still RGB image. We use CNN-based 3D joint predictions as an intermediate representation to regress SMPL pose and shape parameters. Later, 3D joints are reconstructed again in the SMPL output. This module can be seen as an autoencoder where the encoder is a deep neural network and the decoder is SMPL model. We refer to this as SMPL reverse (SMPLR). By implementing SMPLR as an encoder-decoder we avoid the need of complex constraints on pose and shape. Furthermore, given that in-the-wild datasets usually lack accurate 3D annotations, it is desirable to lift 2D joints to 3D without pairing 3D annotations with RGB images. Therefore, we also propose a denoising autoencoder (DAE) module between CNN and SMPLR, able to lift 2D joints to 3D and partially recover from structured error. We evaluate our method on SURREAL and Human3.6M datasets, showing improvement over SMPL-based state-of-the-art alternatives by about 4 and 12 mm, respectively.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no proj Approved no
Call Number Admin @ si @ MBE2020 Serial 3439
Permanent link to this record
 

 
Author Estefania Talavera; Carolin Wuerich; Nicolai Petkov; Petia Radeva
Title Topic modelling for routine discovery from egocentric photo-streams Type Journal Article
Year 2020 Publication Pattern Recognition Abbreviated Journal PR
Volume 104 Issue Pages (down) 107330
Keywords Routine; Egocentric vision; Lifestyle; Behaviour analysis; Topic modelling
Abstract Developing tools to understand and visualize lifestyle is of high interest when addressing the improvement of habits and well-being of people. Routine, defined as the usual things that a person does daily, helps describe the individuals’ lifestyle. With this paper, we are the first ones to address the development of novel tools for automatic discovery of routine days of an individual from his/her egocentric images. In the proposed model, sequences of images are firstly characterized by semantic labels detected by pre-trained CNNs. Then, these features are organized in temporal-semantic documents to later be embedded into a topic models space. Finally, Dynamic-Time-Warping and Spectral-Clustering methods are used for final day routine/non-routine discrimination. Moreover, we introduce a new EgoRoutine-dataset, a collection of 104 egocentric days with more than 100.000 images recorded by 7 users. Results show that routine can be discovered and behavioural patterns can be observed.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB; no proj Approved no
Call Number Admin @ si @ TWP2020 Serial 3435
Permanent link to this record
 

 
Author Idoia Ruiz; Bogdan Raducanu; Rakesh Mehta; Jaume Amores
Title Optimizing speed/accuracy trade-off for person re-identification via knowledge distillation Type Journal Article
Year 2020 Publication Engineering Applications of Artificial Intelligence Abbreviated Journal EAAI
Volume 87 Issue Pages (down) 103309
Keywords Person re-identification; Network distillation; Image retrieval; Model compression; Surveillance
Abstract Finding a person across a camera network plays an important role in video surveillance. For a real-world person re-identification application, in order to guarantee an optimal time response, it is crucial to find the balance between accuracy and speed. We analyse this trade-off, comparing a classical method, that comprises hand-crafted feature description and metric learning, in particular, LOMO and XQDA, to deep learning based techniques, using image classification networks, ResNet and MobileNets. Additionally, we propose and analyse network distillation as a learning strategy to reduce the computational cost of the deep learning approach at test time. We evaluate both methods on the Market-1501 and DukeMTMC-reID large-scale datasets, showing that distillation helps reducing the computational cost at inference time while even increasing the accuracy performance.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.109; 600.120 Approved no
Call Number Admin @ si @ RRM2020 Serial 3401
Permanent link to this record
 

 
Author Wenlong Deng; Yongli Mou; Takahiro Kashiwa; Sergio Escalera; Kohei Nagai; Kotaro Nakayama; Yutaka Matsuo; Helmut Prendinger
Title Vision based Pixel-level Bridge Structural Damage Detection Using a Link ASPP Network Type Journal Article
Year 2020 Publication Automation in Construction Abbreviated Journal AC
Volume 110 Issue Pages (down) 102973
Keywords Semantic image segmentation; Deep learning
Abstract Structural Health Monitoring (SHM) has greatly benefited from computer vision. Recently, deep learning approaches are widely used to accurately estimate the state of deterioration of infrastructure. In this work, we focus on the problem of bridge surface structural damage detection, such as delamination and rebar exposure. It is well known that the quality of a deep learning model is highly dependent on the quality of the training dataset. Bridge damage detection, our application domain, has the following main challenges: (i) labeling the damages requires knowledgeable civil engineering professionals, which makes it difficult to collect a large annotated dataset; (ii) the damage area could be very small, whereas the background area is large, which creates an unbalanced training environment; (iii) due to the difficulty to exactly determine the extension of the damage, there is often a variation among different labelers who perform pixel-wise labeling. In this paper, we propose a novel model for bridge structural damage detection to address the first two challenges. This paper follows the idea of an atrous spatial pyramid pooling (ASPP) module that is designed as a novel network for bridge damage detection. Further, we introduce the weight balanced Intersection over Union (IoU) loss function to achieve accurate segmentation on a highly unbalanced small dataset. The experimental results show that (i) the IoU loss function improves the overall performance of damage detection, as compared to cross entropy loss or focal loss, and (ii) the proposed model has a better ability to detect a minority class than other light segmentation networks.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no proj Approved no
Call Number Admin @ si @ DMK2020 Serial 3314
Permanent link to this record
 

 
Author Alejandro Cartas; Petia Radeva; Mariella Dimiccoli
Title Activities of Daily Living Monitoring via a Wearable Camera: Toward Real-World Applications Type Journal Article
Year 2020 Publication IEEE Access Abbreviated Journal ACCESS
Volume 8 Issue Pages (down) 77344 - 77363
Keywords
Abstract Activity recognition from wearable photo-cameras is crucial for lifestyle characterization and health monitoring. However, to enable its wide-spreading use in real-world applications, a high level of generalization needs to be ensured on unseen users. Currently, state-of-the-art methods have been tested only on relatively small datasets consisting of data collected by a few users that are partially seen during training. In this paper, we built a new egocentric dataset acquired by 15 people through a wearable photo-camera and used it to test the generalization capabilities of several state-of-the-art methods for egocentric activity recognition on unseen users and daily image sequences. In addition, we propose several variants to state-of-the-art deep learning architectures, and we show that it is possible to achieve 79.87% accuracy on users unseen during training. Furthermore, to show that the proposed dataset and approach can be useful in real-world applications, where data can be acquired by different wearable cameras and labeled data are scarcely available, we employed a domain adaptation strategy on two egocentric activity recognition benchmark datasets. These experiments show that the model learned with our dataset, can easily be transferred to other domains with a very small amount of labeled data. Taken together, those results show that activity recognition from wearable photo-cameras is mature enough to be tested in real-world applications.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB; no proj Approved no
Call Number Admin @ si @ CRD2020 Serial 3436
Permanent link to this record
 

 
Author Razieh Rastgoo; Kourosh Kiani; Sergio Escalera
Title Video-based Isolated Hand Sign Language Recognition Using a Deep Cascaded Model Type Journal Article
Year 2020 Publication Multimedia Tools and Applications Abbreviated Journal MTAP
Volume 79 Issue Pages (down) 22965–22987
Keywords
Abstract In this paper, we propose an efficient cascaded model for sign language recognition taking benefit from spatio-temporal hand-based information using deep learning approaches, especially Single Shot Detector (SSD), Convolutional Neural Network (CNN), and Long Short Term Memory (LSTM), from videos. Our simple yet efficient and accurate model includes two main parts: hand detection and sign recognition. Three types of spatial features, including hand features, Extra Spatial Hand Relation (ESHR) features, and Hand Pose (HP) features, have been fused in the model to feed to LSTM for temporal features extraction. We train SSD model for hand detection using some videos collected from five online sign dictionaries. Our model is evaluated on our proposed dataset (Rastgoo et al., Expert Syst Appl 150: 113336, 2020), including 10’000 sign videos for 100 Persian sign using 10 contributors in 10 different backgrounds, and isoGD dataset. Using the 5-fold cross-validation method, our model outperforms state-of-the-art alternatives in sign language recognition
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no menciona Approved no
Call Number Admin @ si @ RKE2020b Serial 3442
Permanent link to this record
 

 
Author Diana Ramirez Cifuentes; Ana Freire; Ricardo Baeza Yates; Joaquim Punti Vidal; Pilar Medina Bravo; Diego Velazquez; Josep M. Gonfaus; Jordi Gonzalez
Title Detection of Suicidal Ideation on Social Media: Multimodal, Relational, and Behavioral Analysis Type Journal Article
Year 2020 Publication Journal of Medical Internet Research Abbreviated Journal JMIR
Volume 22 Issue 7 Pages (down) e17758
Keywords
Abstract Background:
Suicide risk assessment usually involves an interaction between doctors and patients. However, a significant number of people with mental disorders receive no treatment for their condition due to the limited access to mental health care facilities; the reduced availability of clinicians; the lack of awareness; and stigma, neglect, and discrimination surrounding mental disorders. In contrast, internet access and social media usage have increased significantly, providing experts and patients with a means of communication that may contribute to the development of methods to detect mental health issues among social media users.

Objective:
This paper aimed to describe an approach for the suicide risk assessment of Spanish-speaking users on social media. We aimed to explore behavioral, relational, and multimodal data extracted from multiple social platforms and develop machine learning models to detect users at risk.

Methods:
We characterized users based on their writings, posting patterns, relations with other users, and images posted. We also evaluated statistical and deep learning approaches to handle multimodal data for the detection of users with signs of suicidal ideation (suicidal ideation risk group). Our methods were evaluated over a dataset of 252 users annotated by clinicians. To evaluate the performance of our models, we distinguished 2 control groups: users who make use of suicide-related vocabulary (focused control group) and generic random users (generic control group).

Results:
We identified significant statistical differences between the textual and behavioral attributes of each of the control groups compared with the suicidal ideation risk group. At a 95% CI, when comparing the suicidal ideation risk group and the focused control group, the number of friends (P=.04) and median tweet length (P=.04) were significantly different. The median number of friends for a focused control user (median 578.5) was higher than that for a user at risk (median 372.0). Similarly, the median tweet length was higher for focused control users, with 16 words against 13 words of suicidal ideation risk users. Our findings also show that the combination of textual, visual, relational, and behavioral data outperforms the accuracy of using each modality separately. We defined text-based baseline models based on bag of words and word embeddings, which were outperformed by our models, obtaining an increase in accuracy of up to 8% when distinguishing users at risk from both types of control users.

Conclusions:
The types of attributes analyzed are significant for detecting users at risk, and their combination outperforms the results provided by generic, exclusively text-based baseline models. After evaluating the contribution of image-based predictive models, we believe that our results can be improved by enhancing the models based on textual and relational features. These methods can be extended and applied to different use cases related to other mental disorders.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ISE; 600.098; 600.119 Approved no
Call Number Admin @ si @ RFB2020 Serial 3552
Permanent link to this record
 

 
Author Fei Yang; Yongmei Cheng; Joost Van de Weijer; Mikhail Mozerov
Title Improved Discrete Optical Flow Estimation With Triple Image Matching Cost Type Journal Article
Year 2020 Publication IEEE Access Abbreviated Journal ACCESS
Volume 8 Issue Pages (down) 17093 - 17102
Keywords
Abstract Approaches that use more than two consecutive video frames in the optical flow estimation have a long research history. However, almost all such methods utilize extra information for a pre-processing flow prediction or for a post-processing flow correction and filtering. In contrast, this paper differs from previously developed techniques. We propose a new algorithm for the likelihood function calculation (alternatively the matching cost volume) that is used in the maximum a posteriori estimation. We exploit the fact that in general, optical flow is locally constant in the sense of time and the likelihood function depends on both the previous and the future frame. Implementation of our idea increases the robustness of optical flow estimation. As a result, our method outperforms 9% over the DCFlow technique, which we use as prototype for our CNN based computation architecture, on the most challenging MPI-Sintel dataset for the non-occluded mask metric. Furthermore, our approach considerably increases the accuracy of the flow estimation for the matching cost processing, consequently outperforming the original DCFlow algorithm results up to 50% in occluded regions and up to 9% in non-occluded regions on the MPI-Sintel dataset. The experimental section shows that the proposed method achieves state-of-the-arts results especially on the MPI-Sintel dataset.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ YCW2020 Serial 3345
Permanent link to this record
 

 
Author Anjan Dutta; Pau Riba; Josep Llados; Alicia Fornes
Title Hierarchical Stochastic Graphlet Embedding for Graph-based Pattern Recognition Type Journal Article
Year 2020 Publication Neural Computing and Applications Abbreviated Journal NEUCOMA
Volume 32 Issue Pages (down) 11579–11596
Keywords
Abstract Despite being very successful within the pattern recognition and machine learning community, graph-based methods are often unusable because of the lack of mathematical operations defined in graph domain. Graph embedding, which maps graphs to a vectorial space, has been proposed as a way to tackle these difficulties enabling the use of standard machine learning techniques. However, it is well known that graph embedding functions usually suffer from the loss of structural information. In this paper, we consider the hierarchical structure of a graph as a way to mitigate this loss of information. The hierarchical structure is constructed by topologically clustering the graph nodes and considering each cluster as a node in the upper hierarchical level. Once this hierarchical structure is constructed, we consider several configurations to define the mapping into a vector space given a classical graph embedding, in particular, we propose to make use of the stochastic graphlet embedding (SGE). Broadly speaking, SGE produces a distribution of uniformly sampled low-to-high-order graphlets as a way to embed graphs into the vector space. In what follows, the coarse-to-fine structure of a graph hierarchy and the statistics fetched by the SGE complements each other and includes important structural information with varied contexts. Altogether, these two techniques substantially cope with the usual information loss involved in graph embedding techniques, obtaining a more robust graph representation. This fact has been corroborated through a detailed experimental evaluation on various benchmark graph datasets, where we outperform the state-of-the-art methods.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.140; 600.121; 600.141 Approved no
Call Number Admin @ si @ DRL2020 Serial 3348
Permanent link to this record
 

 
Author Rahma Kalboussi; Aymen Azaza; Joost Van de Weijer; Mehrez Abdellaoui; Ali Douik
Title Object proposals for salient object segmentation in videos Type Journal Article
Year 2020 Publication Multimedia Tools and Applications Abbreviated Journal MTAP
Volume 79 Issue 13 Pages (down) 8677-8693
Keywords
Abstract Salient object segmentation in videos is generally broken up in a video segmentation part and a saliency assignment part. Recently, object proposals, which are used to segment the image, have had significant impact on many computer vision applications, including image segmentation, object detection, and recently saliency detection in still images. However, their usage has not yet been evaluated for salient object segmentation in videos. Therefore, in this paper, we investigate the application of object proposals to salient object segmentation in videos. In addition, we propose a new motion feature derived from the optical flow structure tensor for video saliency detection. Experiments on two standard benchmark datasets for video saliency show that the proposed motion feature improves saliency estimation results, and that object proposals are an efficient method for salient object segmentation. Results on the challenging SegTrack v2 and Fukuchi benchmark data sets show that we significantly outperform the state-of-the-art.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.120 Approved no
Call Number KAW2020 Serial 3504
Permanent link to this record
 

 
Author Pau Rodriguez; Diego Velazquez; Guillem Cucurull; Josep M. Gonfaus; Xavier Roca; Seiichi Ozawa; Jordi Gonzalez
Title Personality Trait Analysis in Social Networks Based on Weakly Supervised Learning of Shared Images Type Journal Article
Year 2020 Publication Applied Sciences Abbreviated Journal APPLSCI
Volume 10 Issue 22 Pages (down) 8170
Keywords sentiment analysis, personality trait analysis; weakly-supervised learning; visual classification; OCEAN model; social networks
Abstract Social networks have attracted the attention of psychologists, as the behavior of users can be used to assess personality traits, and to detect sentiments and critical mental situations such as depression or suicidal tendencies. Recently, the increasing amount of image uploads to social networks has shifted the focus from text to image-based personality assessment. However, obtaining the ground-truth requires giving personality questionnaires to the users, making the process very costly and slow, and hindering research on large populations. In this paper, we demonstrate that it is possible to predict which images are most associated with each personality trait of the OCEAN personality model, without requiring ground-truth personality labels. Namely, we present a weakly supervised framework which shows that the personality scores obtained using specific images textually associated with particular personality traits are highly correlated with scores obtained using standard text-based personality questionnaires. We trained an OCEAN trait model based on Convolutional Neural Networks (CNNs), learned from 120K pictures posted with specific textual hashtags, to infer whether the personality scores from the images uploaded by users are consistent with those scores obtained from text. In order to validate our claims, we performed a personality test on a heterogeneous group of 280 human subjects, showing that our model successfully predicts which kind of image will match a person with a given level of a trait. Looking at the results, we obtained evidence that personality is not only correlated with text, but with image content too. Interestingly, different visual patterns emerged from those images most liked by persons with a particular personality trait: for instance, pictures most associated with high conscientiousness usually contained healthy food, while low conscientiousness pictures contained injuries, guns, and alcohol. These findings could pave the way to complement text-based personality questionnaires with image-based questions.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ISE; 600.119 Approved no
Call Number Admin @ si @ RVC2020b Serial 3553
Permanent link to this record
 

 
Author Idoia Ruiz; Joan Serrat
Title Rank-based ordinal classification Type Conference Article
Year 2020 Publication 25th International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages (down) 8069-8076
Keywords
Abstract Differently from the regular classification task, in ordinal classification there is an order in the classes. As a consequence not all classification errors matter the same: a predicted class close to the groundtruth one is better than predicting a farther away class. To account for this, most previous works employ loss functions based on the absolute difference between the predicted and groundtruth class labels. We argue that there are many cases in ordinal classification where label values are arbitrary (for instance 1. . . C, being C the number of classes) and thus such loss functions may not be the best choice. We instead propose a network architecture that produces not a single class prediction but an ordered vector, or ranking, of all the possible classes from most to least likely. This is thanks to a loss function that compares groundtruth and predicted rankings of these class labels, not the labels themselves. Another advantage of this new formulation is that we can enforce consistency in the predictions, namely, predicted rankings come from some unimodal vector of scores with mode at the groundtruth class. We compare with the state of the art ordinal classification methods, showing
that ours attains equal or better performance, as measured by common ordinal classification metrics, on three benchmark datasets. Furthermore, it is also suitable for a new task on image aesthetics assessment, i.e. most voted score prediction. Finally, we also apply it to building damage assessment from satellite images, providing an analysis of its performance depending on the degree of imbalance of the dataset.
Address Virtual; January 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPR
Notes ADAS; 600.118; 600.124 Approved no
Call Number Admin @ si @ RuS2020 Serial 3549
Permanent link to this record