Home | [151–160] << 161 162 163 164 165 166 167 168 169 170 >> [171–180] |
Records | |||||
---|---|---|---|---|---|
Author | David Geronimo; Antonio Lopez | ||||
Title | Vision-based Pedestrian Protection Systems for Intelligent Vehicles | Type | Book Whole | ||
Year | 2014 | Publication | SpringerBriefs in Computer Science | Abbreviated Journal | |
Volume | Issue | Pages | 1-114 | ||
Keywords | Computer Vision; Driver Assistance Systems; Intelligent Vehicles; Pedestrian Detection; Vulnerable Road Users | ||||
Abstract | Pedestrian Protection Systems (PPSs) are on-board systems aimed at detecting and tracking people in the surroundings of a vehicle in order to avoid potentially dangerous situations. These systems, together with other Advanced Driver Assistance Systems (ADAS) such as lane departure warning or adaptive cruise control, are one of the most promising ways to improve traffic safety. By the use of computer vision, cameras working either in the visible or infra-red spectra have been demonstrated as a reliable sensor to perform this task. Nevertheless, the variability of human’s appearance, not only in terms of clothing and sizes but also as a result of their dynamic shape, makes pedestrians one of the most complex classes even for computer vision. Moreover, the unstructured changing and unpredictable environment in which such on-board systems must work makes detection a difficult task to be carried out with the demanded robustness. In this brief, the state of the art in PPSs is introduced through the review of the most relevant papers of the last decade. A common computational architecture is presented as a framework to organize each method according to its main contribution. More than 300 papers are referenced, most of them addressing pedestrian detection and others corresponding to the descriptors (features), pedestrian models, and learning machines used. In addition, an overview of topics such as real-time aspects, systems benchmarking and future challenges of this research area are presented. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer Briefs in Computer Vision | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-1-4614-7986-4 | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS; 600.076 | Approved | no | ||
Call Number | GeL2014 | Serial | 2325 | ||
Permanent link to this record | |||||
Author | Roberto Morales; Juan Quispe; Eduardo Aguilar | ||||
Title | Exploring multi-food detection using deep learning-based algorithms | Type | Conference Article | ||
Year | 2023 | Publication | 13th International Conference on Pattern Recognition Systems | Abbreviated Journal | |
Volume | Issue | Pages | 1-7 | ||
Keywords | |||||
Abstract | People are becoming increasingly concerned about their diet, whether for disease prevention, medical treatment or other purposes. In meals served in restaurants, schools or public canteens, it is not easy to identify the ingredients and/or the nutritional information they contain. Currently, technological solutions based on deep learning models have facilitated the recording and tracking of food consumed based on the recognition of the main dish present in an image. Considering that sometimes there may be multiple foods served on the same plate, food analysis should be treated as a multi-class object detection problem. EfficientDet and YOLOv5 are object detection algorithms that have demonstrated high mAP and real-time performance on general domain data. However, these models have not been evaluated and compared on public food datasets. Unlike general domain objects, foods have more challenging features inherent in their nature that increase the complexity of detection. In this work, we performed a performance evaluation of Efficient-Det and YOLOv5 on three public food datasets: UNIMIB2016, UECFood256 and ChileanFood64. From the results obtained, it can be seen that YOLOv5 provides a significant difference in terms of both mAP and response time compared to EfficientDet in all datasets. Furthermore, YOLOv5 outperforms the state-of-the-art on UECFood256, achieving an improvement of more than 4% in terms of mAP@.50 over the best reported. | ||||
Address | Guayaquil; Ecuador; July 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICPRS | ||
Notes | MILAB | Approved | no | ||
Call Number | Admin @ si @ MQA2023 | Serial | 3843 | ||
Permanent link to this record | |||||
Author | Yagmur Gucluturk; Umut Guclu; Xavier Baro; Hugo Jair Escalante; Isabelle Guyon; Sergio Escalera; Marcel A. J. van Gerven; Rob van Lier | ||||
Title | Multimodal First Impression Analysis with Deep Residual Networks | Type | Journal Article | ||
Year | 2018 | Publication | IEEE Transactions on Affective Computing | Abbreviated Journal | TAC |
Volume | 8 | Issue | 3 | Pages | 316-329 |
Keywords | |||||
Abstract | People form first impressions about the personalities of unfamiliar individuals even after very brief interactions with them. In this study we present and evaluate several models that mimic this automatic social behavior. Specifically, we present several models trained on a large dataset of short YouTube video blog posts for predicting apparent Big Five personality traits of people and whether they seem suitable to be recommended to a job interview. Along with presenting our audiovisual approach and results that won the third place in the ChaLearn First Impressions Challenge, we investigate modeling in different modalities including audio only, visual only, language only, audiovisual, and combination of audiovisual and language. Our results demonstrate that the best performance could be obtained using a fusion of all data modalities. Finally, in order to promote explainability in machine learning and to provide an example for the upcoming ChaLearn challenges, we present a simple approach for explaining the predictions for job interview recommendations | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ GGB2018 | Serial | 3210 | ||
Permanent link to this record | |||||
Author | Raul Gomez; Jaume Gibert; Lluis Gomez; Dimosthenis Karatzas | ||||
Title | Location Sensitive Image Retrieval and Tagging | Type | Conference Article | ||
Year | 2020 | Publication | 16th European Conference on Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | People from different parts of the globe describe objects and concepts in distinct manners. Visual appearance can thus vary across different geographic locations, which makes location a relevant contextual information when analysing visual data. In this work, we address the task of image retrieval related to a given tag conditioned on a certain location on Earth. We present LocSens, a model that learns to rank triplets of images, tags and coordinates by plausibility, and two training strategies to balance the location influence in the final ranking. LocSens learns to fuse textual and location information of multimodal queries to retrieve related images at different levels of location granularity, and successfully utilizes location information to improve image tagging. | ||||
Address | Virtual; August 2020 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ECCV | ||
Notes | DAG; 600.121; 600.129 | Approved | no | ||
Call Number | Admin @ si @ GGG2020b | Serial | 3420 | ||
Permanent link to this record | |||||
Author | Joan Codina-Filba; Sergio Escalera; Joan Escudero; Coen Antens; Pau Buch-Cardona; Mireia Farrus | ||||
Title | Mobile eHealth Platform for Home Monitoring of Bipolar Disorder | Type | Conference Article | ||
Year | 2021 | Publication | 27th ACM International Conference on Multimedia Modeling | Abbreviated Journal | |
Volume | 12573 | Issue | Pages | 330-341 | |
Keywords | |||||
Abstract | People suffering Bipolar Disorder (BD) experiment changes in mood status having depressive or manic episodes with normal periods in the middle. BD is a chronic disease with a high level of non-adherence to medication that needs a continuous monitoring of patients to detect when they relapse in an episode, so that physicians can take care of them. Here we present MoodRecord, an easy-to-use, non-intrusive, multilingual, robust and scalable platform suitable for home monitoring patients with BD, that allows physicians and relatives to track the patient state and get alarms when abnormalities occur.
MoodRecord takes advantage of the capabilities of smartphones as a communication and recording device to do a continuous monitoring of patients. It automatically records user activity, and asks the user to answer some questions or to record himself in video, according to a predefined plan designed by physicians. The video is analysed, recognising the mood status from images and bipolar assessment scores are extracted from speech parameters. The data obtained from the different sources are merged periodically to observe if a relapse may start and if so, raise the corresponding alarm. The application got a positive evaluation in a pilot with users from three different countries. During the pilot, the predictions of the voice and image modules showed a coherent correlation with the diagnosis performed by clinicians. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | MMM | ||
Notes | HUPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ CEE2021 | Serial | 3659 | ||
Permanent link to this record | |||||
Author | Sangeeth Reddy; Minesh Mathew; Lluis Gomez; Marçal Rusiñol; Dimosthenis Karatzas; C.V. Jawahar | ||||
Title | RoadText-1K: Text Detection and Recognition Dataset for Driving Videos | Type | Conference Article | ||
Year | 2020 | Publication | IEEE International Conference on Robotics and Automation | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Perceiving text is crucial to understand semantics of outdoor scenes and hence is a critical requirement to build intelligent systems for driver assistance and self-driving. Most of the existing datasets for text detection and recognition comprise still images and are mostly compiled keeping text in mind. This paper introduces a new ”RoadText-1K” dataset for text in driving videos. The dataset is 20 times larger than the existing largest dataset for text in videos. Our dataset comprises 1000 video clips of driving without any bias towards text and with annotations for text bounding boxes and transcriptions in every frame. State of the art methods for text detection,
recognition and tracking are evaluated on the new dataset and the results signify the challenges in unconstrained driving videos compared to existing datasets. This suggests that RoadText-1K is suited for research and development of reading systems, robust enough to be incorporated into more complex downstream tasks like driver assistance and self-driving. The dataset can be found at http://cvit.iiit.ac.in/research/ projects/cvit-projects/roadtext-1k |
||||
Address | Paris; Francia; ??? | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICRA | ||
Notes | DAG; 600.121; 600.129 | Approved | no | ||
Call Number | Admin @ si @ RMG2020 | Serial | 3400 | ||
Permanent link to this record | |||||
Author | Xim Cerda-Company; C. Alejandro Parraga; Xavier Otazu | ||||
Title | Which tone-mapping is the best? A comparative study of tone-mapping perceived quality | Type | Abstract | ||
Year | 2014 | Publication | Perception | Abbreviated Journal | |
Volume | 43 | Issue | Pages | 106 | |
Keywords | |||||
Abstract | Perception 43 ECVP Abstract Supplement
High-dynamic-range (HDR) imaging refers to the methods designed to increase the brightness dynamic range present in standard digital imaging techniques. This increase is achieved by taking the same picture under dierent exposure values and mapping the intensity levels into a single image by way of a tone-mapping operator (TMO). Currently, there is no agreement on how to evaluate the quality of dierent TMOs. In this work we psychophysically evaluate 15 dierent TMOs obtaining rankings based on the perceived properties of the resulting tone-mapped images. We performed two dierent experiments on a CRT calibrated display using 10 subjects: (1) a study of the internal relationships between grey-levels and (2) a pairwise comparison of the resulting 15 tone-mapped images. In (1) observers internally matched the grey-levels to a reference inside the tone-mapped images and in the real scene. In (2) observers performed a pairwise comparison of the tone-mapped images alongside the real scene. We obtained two rankings of the TMOs according their performance. In (1) the best algorithm was ICAM by J.Kuang et al (2007) and in (2) the best algorithm was a TMO by Krawczyk et al (2005). Our results also show no correlation between these two rankings. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ECVP | ||
Notes | NEUROBIT; 600.074 | Approved | no | ||
Call Number | Admin @ si @ CPO2014 | Serial | 2527 | ||
Permanent link to this record | |||||
Author | Saad Minhas; Aura Hernandez-Sabate; Shoaib Ehsan; Klaus McDonald Maier | ||||
Title | Effects of Non-Driving Related Tasks during Self-Driving mode | Type | Journal Article | ||
Year | 2022 | Publication | IEEE Transactions on Intelligent Transportation Systems | Abbreviated Journal | TITS |
Volume | 23 | Issue | 2 | Pages | 1391-1399 |
Keywords | |||||
Abstract | Perception reaction time and mental workload have proven to be crucial in manual driving. Moreover, in highly automated cars, where most of the research is focusing on Level 4 Autonomous driving, take-over performance is also a key factor when taking road safety into account. This study aims to investigate how the immersion in non-driving related tasks affects the take-over performance of drivers in given scenarios. The paper also highlights the use of virtual simulators to gather efficient data that can be crucial in easing the transition between manual and autonomous driving scenarios. The use of Computer Aided Simulations is of absolute importance in this day and age since the automotive industry is rapidly moving towards Autonomous technology. An experiment comprising of 40 subjects was performed to examine the reaction times of driver and the influence of other variables in the success of take-over performance in highly automated driving under different circumstances within a highway virtual environment. The results reflect the relationship between reaction times under different scenarios that the drivers might face under the circumstances stated above as well as the importance of variables such as velocity in the success on regaining car control after automated driving. The implications of the results acquired are important for understanding the criteria needed for designing Human Machine Interfaces specifically aimed towards automated driving conditions. Understanding the need to keep drivers in the loop during automation, whilst allowing drivers to safely engage in other non-driving related tasks is an important research area which can be aided by the proposed study. | ||||
Address | Feb. 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | IAM; 600.139; 600.145 | Approved | no | ||
Call Number | Admin @ si @ MHE2022 | Serial | 3468 | ||
Permanent link to this record | |||||
Author | Fahad Shahbaz Khan; Joost Van de Weijer; Muhammad Anwer Rao; Michael Felsberg; Carlo Gatta | ||||
Title | Semantic Pyramids for Gender and Action Recognition | Type | Journal Article | ||
Year | 2014 | Publication | IEEE Transactions on Image Processing | Abbreviated Journal | TIP |
Volume | 23 | Issue | 8 | Pages | 3633-3645 |
Keywords | |||||
Abstract | Person description is a challenging problem in computer vision. We investigated two major aspects of person description: 1) gender and 2) action recognition in still images. Most state-of-the-art approaches for gender and action recognition rely on the description of a single body part, such as face or full-body. However, relying on a single body part is suboptimal due to significant variations in scale, viewpoint, and pose in real-world images. This paper proposes a semantic pyramid approach for pose normalization. Our approach is fully automatic and based on combining information from full-body, upper-body, and face regions for gender and action recognition in still images. The proposed approach does not require any annotations for upper-body and face of a person. Instead, we rely on pretrained state-of-the-art upper-body and face detectors to automatically extract semantic information of a person. Given multiple bounding boxes from each body part detector, we then propose a simple method to select the best candidate bounding box, which is used for feature extraction. Finally, the extracted features from the full-body, upper-body, and face regions are combined into a single representation for classification. To validate the proposed approach for gender recognition, experiments are performed on three large data sets namely: 1) human attribute; 2) head-shoulder; and 3) proxemics. For action recognition, we perform experiments on four data sets most used for benchmarking action recognition in still images: 1) Sports; 2) Willow; 3) PASCAL VOC 2010; and 4) Stanford-40. Our experiments clearly demonstrate that the proposed approach, despite its simplicity, outperforms state-of-the-art methods for gender and action recognition. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1057-7149 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | CIC; LAMP; 601.160; 600.074; 600.079;MILAB | Approved | no | ||
Call Number | Admin @ si @ KWR2014 | Serial | 2507 | ||
Permanent link to this record | |||||
Author | Julio C. S. Jacques Junior; Xavier Baro; Sergio Escalera | ||||
Title | Exploiting feature representations through similarity learning, post-ranking and ranking aggregation for person re-identification | Type | Journal Article | ||
Year | 2018 | Publication | Image and Vision Computing | Abbreviated Journal | IMAVIS |
Volume | 79 | Issue | Pages | 76-85 | |
Keywords | |||||
Abstract | Person re-identification has received special attention by the human analysis community in the last few years. To address the challenges in this field, many researchers have proposed different strategies, which basically exploit either cross-view invariant features or cross-view robust metrics. In this work, we propose to exploit a post-ranking approach and combine different feature representations through ranking aggregation. Spatial information, which potentially benefits the person matching, is represented using a 2D body model, from which color and texture information are extracted and combined. We also consider background/foreground information, automatically extracted via Deep Decompositional Network, and the usage of Convolutional Neural Network (CNN) features. To describe the matching between images we use the polynomial feature map, also taking into account local and global information. The Discriminant Context Information Analysis based post-ranking approach is used to improve initial ranking lists. Finally, the Stuart ranking aggregation method is employed to combine complementary ranking lists obtained from different feature representations. Experimental results demonstrated that we improve the state-of-the-art on VIPeR and PRID450s datasets, achieving 67.21% and 75.64% on top-1 rank recognition rate, respectively, as well as obtaining competitive results on CUHK01 dataset. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA; 602.143 | Approved | no | ||
Call Number | Admin @ si @ JBE2018 | Serial | 3138 | ||
Permanent link to this record | |||||
Author | Julio C. S. Jacques Junior; Xavier Baro; Sergio Escalera | ||||
Title | Exploiting feature representations through similarity learning and ranking aggregation for person re-identification | Type | Conference Article | ||
Year | 2017 | Publication | 12th IEEE International Conference on Automatic Face and Gesture Recognition | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Person re-identification has received special attentionby the human analysis community in the last few years.To address the challenges in this field, many researchers haveproposed different strategies, which basically exploit eithercross-view invariant features or cross-view robust metrics. Inthis work we propose to combine different feature representationsthrough ranking aggregation. Spatial information, whichpotentially benefits the person matching, is represented usinga 2D body model, from which color and texture informationare extracted and combined. We also consider contextualinformation (background and foreground data), automaticallyextracted via Deep Decompositional Network, and the usage ofConvolutional Neural Network (CNN) features. To describe thematching between images we use the polynomial feature map,also taking into account local and global information. Finally,the Stuart ranking aggregation method is employed to combinecomplementary ranking lists obtained from different featurerepresentations. Experimental results demonstrated that weimprove the state-of-the-art on VIPeR and PRID450s datasets,achieving 58.77% and 71.56% on top-1 rank recognitionrate, respectively, as well as obtaining competitive results onCUHK01 dataset. | ||||
Address | Washington; DC; USA; May 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | FG | ||
Notes | HUPBA; 602.143 | Approved | no | ||
Call Number | Admin @ si @ JBE2017 | Serial | 2923 | ||
Permanent link to this record | |||||
Author | Andreas Møgelmose; Chris Bahnsen; Thomas B. Moeslund; Albert Clapes; Sergio Escalera | ||||
Title | Tri-modal Person Re-identification with RGB, Depth and Thermal Features | Type | Conference Article | ||
Year | 2013 | Publication | 9th IEEE Workshop on Perception beyond the visible Spectrum, Computer Vision and Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 301-307 | ||
Keywords | |||||
Abstract | Person re-identification is about recognizing people who have passed by a sensor earlier. Previous work is mainly based on RGB data, but in this work we for the first time present a system where we combine RGB, depth, and thermal data for re-identification purposes. First, from each of the three modalities, we obtain some particular features: from RGB data, we model color information from different regions of the body, from depth data, we compute different soft body biometrics, and from thermal data, we extract local structural information. Then, the three information types are combined in a joined classifier. The tri-modal system is evaluated on a new RGB-D-T dataset, showing successful results in re-identification scenarios. | ||||
Address | Portland; oregon; June 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-0-7695-4990-3 | Medium | ||
Area | Expedition | Conference | CVPRW | ||
Notes | HUPBA;MILAB | Approved | no | ||
Call Number | Admin @ si @ MBM2013 | Serial | 2253 | ||
Permanent link to this record | |||||
Author | Julio C. S. Jacques Junior; Yagmur Gucluturk; Marc Perez; Umut Guçlu; Carlos Andujar; Xavier Baro; Hugo Jair Escalante; Isabelle Guyon; Marcel A. J. van Gerven; Rob van Lier; Sergio Escalera | ||||
Title | First Impressions: A Survey on Vision-Based Apparent Personality Trait Analysis | Type | Journal Article | ||
Year | 2022 | Publication | IEEE Transactions on Affective Computing | Abbreviated Journal | TAC |
Volume | 13 | Issue | 1 | Pages | 75-95 |
Keywords | Personality computing; first impressions; person perception; big-five; subjective bias; computer vision; machine learning; nonverbal signals; facial expression; gesture; speech analysis; multi-modal recognition | ||||
Abstract | Personality analysis has been widely studied in psychology, neuropsychology, and signal processing fields, among others. From the past few years, it also became an attractive research area in visual computing. From the computational point of view, by far speech and text have been the most considered cues of information for analyzing personality. However, recently there has been an increasing interest from the computer vision community in analyzing personality from visual data. Recent computer vision approaches are able to accurately analyze human faces, body postures and behaviors, and use these information to infer apparent personality traits. Because of the overwhelming research interest in this topic, and of the potential impact that this sort of methods could have in society, we present in this paper an up-to-date review of existing vision-based approaches for apparent personality trait recognition. We describe seminal and cutting edge works on the subject, discussing and comparing their distinctive features and limitations. Future venues of research in the field are identified and discussed. Furthermore, aspects on the subjectivity in data labeling/evaluation, as well as current datasets and challenges organized to push the research on the field are reviewed. | ||||
Address | 1 Jan.-March 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA | Approved | no | ||
Call Number | Admin @ si @ JGP2022 | Serial | 3724 | ||
Permanent link to this record | |||||
Author | David Curto; Albert Clapes; Javier Selva; Sorina Smeureanu; Julio C. S. Jacques Junior; David Gallardo-Pujol; Georgina Guilera; David Leiva; Thomas B. Moeslund; Sergio Escalera; Cristina Palmero | ||||
Title | Dyadformer: A Multi-Modal Transformer for Long-Range Modeling of Dyadic Interactions | Type | Conference Article | ||
Year | 2021 | Publication | IEEE/CVF International Conference on Computer Vision Workshops | Abbreviated Journal | |
Volume | Issue | Pages | 2177-2188 | ||
Keywords | |||||
Abstract | Personality computing has become an emerging topic in computer vision, due to the wide range of applications it can be used for. However, most works on the topic have focused on analyzing the individual, even when applied to interaction scenarios, and for short periods of time. To address these limitations, we present the Dyadformer, a novel multi-modal multi-subject Transformer architecture to model individual and interpersonal features in dyadic interactions using variable time windows, thus allowing the capture of long-term interdependencies. Our proposed cross-subject layer allows the network to explicitly model interactions among subjects through attentional operations. This proof-of-concept approach shows how multi-modality and joint modeling of both interactants for longer periods of time helps to predict individual attributes. With Dyadformer, we improve state-of-the-art self-reported personality inference results on individual subjects on the UDIVA v0.5 dataset. | ||||
Address | Virtual; October 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICCVW | ||
Notes | HUPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ CCS2021 | Serial | 3648 | ||
Permanent link to this record | |||||
Author | Ricardo Dario Perez Principi; Cristina Palmero; Julio C. S. Jacques Junior; Sergio Escalera | ||||
Title | On the Effect of Observed Subject Biases in Apparent Personality Analysis from Audio-visual Signals | Type | Journal Article | ||
Year | 2021 | Publication | IEEE Transactions on Affective Computing | Abbreviated Journal | TAC |
Volume | 12 | Issue | 3 | Pages | 607-621 |
Keywords | |||||
Abstract | Personality perception is implicitly biased due to many subjective factors, such as cultural, social, contextual, gender and appearance. Approaches developed for automatic personality perception are not expected to predict the real personality of the target, but the personality external observers attributed to it. Hence, they have to deal with human bias, inherently transferred to the training data. However, bias analysis in personality computing is an almost unexplored area. In this work, we study different possible sources of bias affecting personality perception, including emotions from facial expressions, attractiveness, age, gender, and ethnicity, as well as their influence on prediction ability for apparent personality estimation. To this end, we propose a multi-modal deep neural network that combines raw audio and visual information alongside predictions of attribute-specific models to regress apparent personality. We also analyse spatio-temporal aggregation schemes and the effect of different time intervals on first impressions. We base our study on the ChaLearn First Impressions dataset, consisting of one-person conversational videos. Our model shows state-of-the-art results regressing apparent personality based on the Big-Five model. Furthermore, given the interpretability nature of our network design, we provide an incremental analysis on the impact of each possible source of bias on final network predictions. | ||||
Address | 1 July-Sept. 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ PPJ2019 | Serial | 3312 | ||
Permanent link to this record |