Records |
Author |
Cristina Palmero; Jordi Esquirol; Vanessa Bayo; Miquel Angel Cos; Pouya Ahmadmonfared; Joan Salabert; David Sanchez; Sergio Escalera |
Title |
Automatic Sleep System Recommendation by Multi-modal RBG-Depth-Pressure Anthropometric Analysis |
Type |
Journal Article |
Year |
2017 |
Publication |
International Journal of Computer Vision |
Abbreviated Journal |
IJCV |
Volume |
122 |
Issue |
2 |
Pages |
212–227 |
Keywords |
Sleep system recommendation; RGB-Depth data Pressure imaging; Anthropometric landmark extraction; Multi-part human body segmentation |
Abstract |
This paper presents a novel system for automatic sleep system recommendation using RGB, depth and pressure information. It consists of a validated clinical knowledge-based model that, along with a set of prescription variables extracted automatically, obtains a personalized bed design recommendation. The automatic process starts by performing multi-part human body RGB-D segmentation combining GrabCut, 3D Shape Context descriptor and Thin Plate Splines, to then extract a set of anthropometric landmark points by applying orthogonal plates to the segmented human body. The extracted variables are introduced to the computerized clinical model to calculate body circumferences, weight, morphotype and Body Mass Index categorization. Furthermore, pressure image analysis is performed to extract pressure values and at-risk points, which are also introduced to the model to eventually obtain the final prescription of mattress, topper, and pillow. We validate the complete system in a set of 200 subjects, showing accurate category classification and high correlation results with respect to manual measures. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
HuPBA;MILAB; 303.100 |
Approved |
no |
Call Number |
Admin @ si @ PEB2017 |
Serial |
2765 |
Permanent link to this record |
|
|
|
Author |
Maria Elena Meza-de-Luna; Juan Ramon Terven Salinas; Bogdan Raducanu; Joaquin Salas |
Title |
A Social-Aware Assistant to support individuals with visual impairments during social interaction: A systematic requirements analysis |
Type |
Journal Article |
Year |
2019 |
Publication |
International Journal of Human-Computer Studies |
Abbreviated Journal |
IJHC |
Volume |
122 |
Issue |
|
Pages |
50-60 |
Keywords |
|
Abstract |
Visual impairment affects the normal course of activities in everyday life including mobility, education, employment, and social interaction. Most of the existing technical solutions devoted to empowering the visually impaired people are in the areas of navigation (obstacle avoidance), access to printed information and object recognition. Less effort has been dedicated so far in developing solutions to support social interactions. In this paper, we introduce a Social-Aware Assistant (SAA) that provides visually impaired people with cues to enhance their face-to-face conversations. The system consists of a perceptive component (represented by smartglasses with an embedded video camera) and a feedback component (represented by a haptic belt). When the vision system detects a head nodding, the belt vibrates, thus suggesting the user to replicate (mirror) the gesture. In our experiments, sighted persons interacted with blind people wearing the SAA. We instructed the former to mirror the noddings according to the vibratory signal, while the latter interacted naturally. After the face-to-face conversation, the participants had an interview to express their experience regarding the use of this new technological assistant. With the data collected during the experiment, we have assessed quantitatively and qualitatively the device usefulness and user satisfaction. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
LAMP; 600.109; 600.120 |
Approved |
no |
Call Number |
Admin @ si @ MTR2019 |
Serial |
3142 |
Permanent link to this record |
|
|
|
Author |
Arnau Baro; Pau Riba; Jorge Calvo-Zaragoza; Alicia Fornes |
Title |
From Optical Music Recognition to Handwritten Music Recognition: a Baseline |
Type |
Journal Article |
Year |
2019 |
Publication |
Pattern Recognition Letters |
Abbreviated Journal |
PRL |
Volume |
123 |
Issue |
|
Pages |
1-8 |
Keywords |
|
Abstract |
Optical Music Recognition (OMR) is the branch of document image analysis that aims to convert images of musical scores into a computer-readable format. Despite decades of research, the recognition of handwritten music scores, concretely the Western notation, is still an open problem, and the few existing works only focus on a specific stage of OMR. In this work, we propose a full Handwritten Music Recognition (HMR) system based on Convolutional Recurrent Neural Networks, data augmentation and transfer learning, that can serve as a baseline for the research community. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
DAG; 600.097; 601.302; 601.330; 600.140; 600.121 |
Approved |
no |
Call Number |
Admin @ si @ BRC2019 |
Serial |
3275 |
Permanent link to this record |
|
|
|
Author |
Zhengying Liu; Zhen Xu; Shangeth Rajaa; Meysam Madadi; Julio C. S. Jacques Junior; Sergio Escalera; Adrien Pavao; Sebastien Treguer; Wei-Wei Tu; Isabelle Guyon |
Title |
Towards Automated Deep Learning: Analysis of the AutoDL challenge series 2019 |
Type |
Conference Article |
Year |
2020 |
Publication |
Proceedings of Machine Learning Research |
Abbreviated Journal |
|
Volume |
123 |
Issue |
|
Pages |
242-252 |
Keywords |
|
Abstract |
We present the design and results of recent competitions in Automated Deep Learning (AutoDL). In the AutoDL challenge series 2019, we organized 5 machine learning challenges: AutoCV, AutoCV2, AutoNLP, AutoSpeech and AutoDL. The first 4 challenges concern each a specific application domain, such as computer vision, natural language processing and speech recognition. At the time of March 2020, the last challenge AutoDL is still on-going and we only present its design. Some highlights of this work include: (1) a benchmark suite of baseline AutoML solutions, with emphasis on domains for which Deep Learning methods have had prior success (image, video, text, speech, etc); (2) a novel any-time learning framework, which opens doors for further theoretical consideration; (3) a repository of around 100 datasets (from all above domains) over half of which are released as public datasets to enable research on meta-learning; (4) analyses revealing that winning solutions generalize to new unseen datasets, validating progress towards universal AutoML solution; (5) open-sourcing of the challenge platform, the starting kit, the dataset formatting toolkit, and all winning solutions (All information available at {autodl.chalearn.org}). |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
NEURIPS |
Notes |
HUPBA |
Approved |
no |
Call Number |
Admin @ si @ LXR2020 |
Serial |
3500 |
Permanent link to this record |
|
|
|
Author |
S.K. Jemni; Mohamed Ali Souibgui; Yousri Kessentini; Alicia Fornes |
Title |
Enhance to Read Better: A Multi-Task Adversarial Network for Handwritten Document Image Enhancement |
Type |
Journal Article |
Year |
2022 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
Volume |
123 |
Issue |
|
Pages |
108370 |
Keywords |
|
Abstract |
Handwritten document images can be highly affected by degradation for different reasons: Paper ageing, daily-life scenarios (wrinkles, dust, etc.), bad scanning process and so on. These artifacts raise many readability issues for current Handwritten Text Recognition (HTR) algorithms and severely devalue their efficiency. In this paper, we propose an end to end architecture based on Generative Adversarial Networks (GANs) to recover the degraded documents into a and form. Unlike the most well-known document binarization methods, which try to improve the visual quality of the degraded document, the proposed architecture integrates a handwritten text recognizer that promotes the generated document image to be more readable. To the best of our knowledge, this is the first work to use the text information while binarizing handwritten documents. Extensive experiments conducted on degraded Arabic and Latin handwritten documents demonstrate the usefulness of integrating the recognizer within the GAN architecture, which improves both the visual quality and the readability of the degraded document images. Moreover, we outperform the state of the art in H-DIBCO challenges, after fine tuning our pre-trained model with synthetically degraded Latin handwritten images, on this task. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
DAG; 600.124; 600.121; 602.230 |
Approved |
no |
Call Number |
Admin @ si @ JSK2022 |
Serial |
3613 |
Permanent link to this record |
|
|
|
Author |
Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades |
Title |
Sparse representation over learned dictionary for symbol recognition |
Type |
Journal Article |
Year |
2016 |
Publication |
Signal Processing |
Abbreviated Journal |
SP |
Volume |
125 |
Issue |
|
Pages |
36-47 |
Keywords |
Symbol Recognition; Sparse Representation; Learned Dictionary; Shape Context; Interest Points |
Abstract |
In this paper we propose an original sparse vector model for symbol retrieval task. More specically, we apply the K-SVD algorithm for learning a visual dictionary based on symbol descriptors locally computed around interest points. Results on benchmark datasets show that the obtained sparse representation is competitive related to state-of-the-art methods. Moreover, our sparse representation is invariant to rotation and scale transforms and also robust to degraded images and distorted symbols. Thereby, the learned visual dictionary is able to represent instances of unseen classes of symbols. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
DAG; 600.061; 600.077 |
Approved |
no |
Call Number |
Admin @ si @ DTR2016 |
Serial |
2946 |
Permanent link to this record |
|
|
|
Author |
Mariella Dimiccoli |
Title |
Figure-ground segregation: A fully nonlocal approach |
Type |
Journal Article |
Year |
2016 |
Publication |
Vision Research |
Abbreviated Journal |
VR |
Volume |
126 |
Issue |
|
Pages |
308-317 |
Keywords |
Figure-ground segregation; Nonlocal approach; Directional linear voting; Nonlinear diffusion |
Abstract |
We present a computational model that computes and integrates in a nonlocal fashion several configural cues for automatic figure-ground segregation. Our working hypothesis is that the figural status of each pixel is a nonlocal function of several geometric shape properties and it can be estimated without explicitly relying on object boundaries. The methodology is grounded on two elements: multi-directional linear voting and nonlinear diffusion. A first estimation of the figural status of each pixel is obtained as a result of a voting process, in which several differently oriented line-shaped neighborhoods vote to express their belief about the figural status of the pixel. A nonlinear diffusion process is then applied to enforce the coherence of figural status estimates among perceptually homogeneous regions. Computer simulations fit human perception and match the experimental evidence that several cues cooperate in defining figure-ground segregation. The results of this work suggest that figure-ground segregation involves feedback from cells with larger receptive fields in higher visual cortical areas. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
MILAB; |
Approved |
no |
Call Number |
Admin @ si @ Dim2016b |
Serial |
2623 |
Permanent link to this record |
|
|
|
Author |
Sergio Escalera; Jordi Gonzalez; Hugo Jair Escalante; Xavier Baro; Isabelle Guyon |
Title |
Looking at People Special Issue |
Type |
Journal Article |
Year |
2018 |
Publication |
International Journal of Computer Vision |
Abbreviated Journal |
IJCV |
Volume |
126 |
Issue |
2-4 |
Pages |
141-143 |
Keywords |
|
Abstract |
|
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
HUPBA; ISE; 600.119 |
Approved |
no |
Call Number |
Admin @ si @ EGJ2018 |
Serial |
3093 |
Permanent link to this record |
|
|
|
Author |
Arash Akbarinia; C. Alejandro Parraga |
Title |
Feedback and Surround Modulated Boundary Detection |
Type |
Journal Article |
Year |
2018 |
Publication |
International Journal of Computer Vision |
Abbreviated Journal |
IJCV |
Volume |
126 |
Issue |
12 |
Pages |
1367–1380 |
Keywords |
Boundary detection; Surround modulation; Biologically-inspired vision |
Abstract |
Edges are key components of any visual scene to the extent that we can recognise objects merely by their silhouettes. The human visual system captures edge information through neurons in the visual cortex that are sensitive to both intensity discontinuities and particular orientations. The “classical approach” assumes that these cells are only responsive to the stimulus present within their receptive fields, however, recent studies demonstrate that surrounding regions and inter-areal feedback connections influence their responses significantly. In this work we propose a biologically-inspired edge detection model in which orientation selective neurons are represented through the first derivative of a Gaussian function resembling double-opponent cells in the primary visual cortex (V1). In our model we account for four kinds of receptive field surround, i.e. full, far, iso- and orthogonal-orientation, whose contributions are contrast-dependant. The output signal from V1 is pooled in its perpendicular direction by larger V2 neurons employing a contrast-variant centre-surround kernel. We further introduce a feedback connection from higher-level visual areas to the lower ones. The results of our model on three benchmark datasets show a big improvement compared to the current non-learning and biologically-inspired state-of-the-art algorithms while being competitive to the learning-based methods. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
NEUROBIT; 600.068; 600.072 |
Approved |
no |
Call Number |
Admin @ si @ AkP2018b |
Serial |
2991 |
Permanent link to this record |
|
|
|
Author |
Adrien Gaidon; Antonio Lopez; Florent Perronnin |
Title |
The Reasonable Effectiveness of Synthetic Visual Data |
Type |
Journal Article |
Year |
2018 |
Publication |
International Journal of Computer Vision |
Abbreviated Journal |
IJCV |
Volume |
126 |
Issue |
9 |
Pages |
899–901 |
Keywords |
|
Abstract |
|
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
ADAS; 600.118 |
Approved |
no |
Call Number |
Admin @ si @ GLP2018 |
Serial |
3180 |
Permanent link to this record |
|
|
|
Author |
Pau Riba; Lutz Goldmann; Oriol Ramos Terrades; Diede Rusticus; Alicia Fornes; Josep Llados |
Title |
Table detection in business document images by message passing networks |
Type |
Journal Article |
Year |
2022 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
Volume |
127 |
Issue |
|
Pages |
108641 |
Keywords |
|
Abstract |
Tabular structures in business documents offer a complementary dimension to the raw textual data. For instance, there is information about the relationships among pieces of information. Nowadays, digital mailroom applications have become a key service for workflow automation. Therefore, the detection and interpretation of tables is crucial. With the recent advances in information extraction, table detection and recognition has gained interest in document image analysis, in particular, with the absence of rule lines and unknown information about rows and columns. However, business documents usually contain sensitive contents limiting the amount of public benchmarking datasets. In this paper, we propose a graph-based approach for detecting tables in document images which do not require the raw content of the document. Hence, the sensitive content can be previously removed and, instead of using the raw image or textual content, we propose a purely structural approach to keep sensitive data anonymous. Our framework uses graph neural networks (GNNs) to describe the local repetitive structures that constitute a table. In particular, our main application domain are business documents. We have carefully validated our approach in two invoice datasets and a modern document benchmark. Our experiments demonstrate that tables can be detected by purely structural approaches. |
Address |
July 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
Elsevier |
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
DAG; 600.162; 600.121 |
Approved |
no |
Call Number |
Admin @ si @ RGR2022 |
Serial |
3729 |
Permanent link to this record |
|
|
|
Author |
Daniel Hernandez; Lukas Schneider; P. Cebrian; A. Espinosa; David Vazquez; Antonio Lopez; Uwe Franke; Marc Pollefeys; Juan Carlos Moure |
Title |
Slanted Stixels: A way to represent steep streets |
Type |
Journal Article |
Year |
2019 |
Publication |
International Journal of Computer Vision |
Abbreviated Journal |
IJCV |
Volume |
127 |
Issue |
|
Pages |
1643–1658 |
Keywords |
|
Abstract |
This work presents and evaluates a novel compact scene representation based on Stixels that infers geometric and semantic information. Our approach overcomes the previous rather restrictive geometric assumptions for Stixels by introducing a novel depth model to account for non-flat roads and slanted objects. Both semantic and depth cues are used jointly to infer the scene representation in a sound global energy minimization formulation. Furthermore, a novel approximation scheme is introduced in order to significantly reduce the computational complexity of the Stixel algorithm, and then achieve real-time computation capabilities. The idea is to first perform an over-segmentation of the image, discarding the unlikely Stixel cuts, and apply the algorithm only on the remaining Stixel cuts. This work presents a novel over-segmentation strategy based on a fully convolutional network, which outperforms an approach based on using local extrema of the disparity map. We evaluate the proposed methods in terms of semantic and geometric accuracy as well as run-time on four publicly available benchmark datasets. Our approach maintains accuracy on flat road scene datasets while improving substantially on a novel non-flat road dataset. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
ADAS; 600.118; 600.124 |
Approved |
no |
Call Number |
Admin @ si @ HSC2019 |
Serial |
3304 |
Permanent link to this record |
|
|
|
Author |
Albert Gordo |
Title |
A Cyclic Page Layout Descriptor for Document Classification & Retrieval |
Type |
Report |
Year |
2009 |
Publication |
CVC Technical Report |
Abbreviated Journal |
|
Volume |
128 |
Issue |
|
Pages |
|
Keywords |
|
Abstract |
|
Address |
|
Corporate Author |
Computer Vision Center |
Thesis |
Master's thesis |
Publisher |
|
Place of Publication |
Bellaterra, Barcelona |
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
CIC;DAG |
Approved |
no |
Call Number |
Admin @ si @ Gor2009 |
Serial |
2387 |
Permanent link to this record |
|
|
|
Author |
Cesar de Souza; Adrien Gaidon; Yohann Cabon; Naila Murray; Antonio Lopez |
Title |
Generating Human Action Videos by Coupling 3D Game Engines and Probabilistic Graphical Models |
Type |
Journal Article |
Year |
2020 |
Publication |
International Journal of Computer Vision |
Abbreviated Journal |
IJCV |
Volume |
128 |
Issue |
|
Pages |
1505–1536 |
Keywords |
Procedural generation; Human action recognition; Synthetic data; Physics |
Abstract |
Deep video action recognition models have been highly successful in recent years but require large quantities of manually-annotated data, which are expensive and laborious to obtain. In this work, we investigate the generation of synthetic training data for video action recognition, as synthetic data have been successfully used to supervise models for a variety of other computer vision tasks. We propose an interpretable parametric generative model of human action videos that relies on procedural generation, physics models and other components of modern game engines. With this model we generate a diverse, realistic, and physically plausible dataset of human action videos, called PHAV for “Procedural Human Action Videos”. PHAV contains a total of 39,982 videos, with more than 1000 examples for each of 35 action categories. Our video generation approach is not limited to existing motion capture sequences: 14 of these 35 categories are procedurally-defined synthetic actions. In addition, each video is represented with 6 different data modalities, including RGB, optical flow and pixel-level semantic labels. These modalities are generated almost simultaneously using the Multiple Render Targets feature of modern GPUs. In order to leverage PHAV, we introduce a deep multi-task (i.e. that considers action classes from multiple datasets) representation learning architecture that is able to simultaneously learn from synthetic and real video datasets, even when their action categories differ. Our experiments on the UCF-101 and HMDB-51 benchmarks suggest that combining our large set of synthetic videos with small real-world datasets can boost recognition performance. Our approach also significantly outperforms video representations produced by fine-tuning state-of-the-art unsupervised generative models of videos. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
ADAS; 600.124; 600.118 |
Approved |
no |
Call Number |
Admin @ si @ SGC2019 |
Serial |
3303 |
Permanent link to this record |
|
|
|
Author |
Yunan Li; Jun Wan; Qiguang Miao; Sergio Escalera; Huijuan Fang; Huizhou Chen; Xiangda Qi; Guodong Guo |
Title |
CR-Net: A Deep Classification-Regression Network for Multimodal Apparent Personality Analysis |
Type |
Journal Article |
Year |
2020 |
Publication |
International Journal of Computer Vision |
Abbreviated Journal |
IJCV |
Volume |
128 |
Issue |
|
Pages |
2763–2780 |
Keywords |
|
Abstract |
First impressions strongly influence social interactions, having a high impact in the personal and professional life. In this paper, we present a deep Classification-Regression Network (CR-Net) for analyzing the Big Five personality problem and further assisting on job interview recommendation in a first impressions setup. The setup is based on the ChaLearn First Impressions dataset, including multimodal data with video, audio, and text converted from the corresponding audio data, where each person is talking in front of a camera. In order to give a comprehensive prediction, we analyze the videos from both the entire scene (including the person’s motions and background) and the face of the person. Our CR-Net first performs personality trait classification and applies a regression later, which can obtain accurate predictions for both personality traits and interview recommendation. Furthermore, we present a new loss function called Bell Loss to address inaccurate predictions caused by the regression-to-the-mean problem. Extensive experiments on the First Impressions dataset show the effectiveness of our proposed network, outperforming the state-of-the-art. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
HuPBA; no menciona |
Approved |
no |
Call Number |
Admin @ si @ LWM2020 |
Serial |
3413 |
Permanent link to this record |