Records | |||||
---|---|---|---|---|---|
Author | Andrei Polzounov; Artsiom Ablavatski; Sergio Escalera; Shijian Lu; Jianfei Cai | ||||
Title | WordFences: Text Localization and Recognition | Type | Conference Article | ||
Year | 2017 | Publication | 24th International Conference on Image Processing | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | Beijing; China; September 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICIP | ||
Notes | HUPBA; no menciona | Approved | no | ||
Call Number | Admin @ si @ PAE2017 | Serial | 3007 | ||
Permanent link to this record | |||||
Author | Sergio Escalera; Vassilis Athitsos; Isabelle Guyon | ||||
Title | Challenges in Multi-modal Gesture Recognition | Type | Book Chapter | ||
Year | 2017 | Publication | Abbreviated Journal | ||
Volume | Issue | Pages | 1-60 | ||
Keywords | Gesture recognition; Time series analysis; Multimodal data analysis; Computer vision; Pattern recognition; Wearable sensors; Infrared cameras; Kinect™ | ||||
Abstract | This paper surveys the state of the art on multimodal gesture recognition and introduces the JMLR special topic on gesture recognition 2011–2015. We began right at the start of the Kinect™ revolution, when inexpensive infrared cameras providing image depth recordings became available. We published papers using this technology and other more conventional methods, including regular video cameras, to record data, thus providing a good overview of uses of machine learning and computer vision using multimodal data in this area of application. Notably, we organized a series of challenges and made available several datasets we recorded for that purpose, including tens of thousands of videos, which are available to conduct further research. We also review recent state-of-the-art work on gesture recognition based on a proposed taxonomy, discussing challenges and future lines of research. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ EAG2017 | Serial | 3008 | ||
Permanent link to this record | |||||
Author | Jordi Esquirol; Cristina Palmero; Vanessa Bayo; Miquel Angel Cos; Sergio Escalera; David Sanchez; Maider Sanchez; Noelia Serrano; Mireia Relats | ||||
Title | Automatic RGB-depth-pressure anthropometric analysis and individualised sleep solution prescription | Type | Journal Article | ||
Year | 2017 | Publication | Journal of Medical Engineering & Technology | Abbreviated Journal | JMET |
Volume | 41 | Issue | 6 | Pages | 486-497 |
Keywords | |||||
Abstract | INTRODUCTION: Sleep surfaces must adapt to individual somatotypic features to maintain a comfortable, convenient and healthy sleep, preventing diseases and injuries. Individually determining the most adequate rest surface can often be a complex and subjective question. OBJECTIVES: To design and validate an automatic multimodal somatotype determination model to automatically recommend an individually designed mattress-topper-pillow combination. METHODS: Design and validation of an automated prescription model for an individualised sleep system is performed through a single-image 2D-3D analysis and body pressure distribution, to objectively determine optimal individual sleep surfaces combining five different mattress densities, three different toppers and three cervical pillows. RESULTS: A final study (n = 151) and re-analysis (n = 117) defined and validated the model, showing high correlations between calculated and real data (>85% in height and body circumferences, 89.9% in weight, 80.4% in body mass index and more than 70% in morphotype categorisation). CONCLUSIONS: The somatotype determination model can accurately prescribe an individualised sleep solution. This can be useful for healthy people and for health centres that need to adapt sleep surfaces to people with special needs. Next steps will increase the model's accuracy and analyse whether this prescribed individualised sleep solution can improve sleep quantity and quality; additionally, future studies will adapt the model to mattresses with technological improvements and tailor-made production, and will define interfaces for people with special needs. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA; no menciona | Approved | no | ||
Call Number | Admin @ si @ EPB2017 | Serial | 3010 | ||
Permanent link to this record | |||||
Author | Fatemeh Noroozi; Marina Marjanovic; Angelina Njegus; Sergio Escalera; Gholamreza Anbarjafari | ||||
Title | Audio-Visual Emotion Recognition in Video Clips | Type | Journal Article | ||
Year | 2019 | Publication | IEEE Transactions on Affective Computing | Abbreviated Journal | TAC |
Volume | 10 | Issue | 1 | Pages | 60-75 |
Keywords | |||||
Abstract | This paper presents a multimodal emotion recognition system, which is based on the analysis of audio and visual cues. From the audio channel, Mel-Frequency Cepstral Coefficients, Filter Bank Energies and prosodic features are extracted. For the visual part, two strategies are considered. First, facial landmarks’ geometric relations, i.e. distances and angles, are computed. Second, we summarize each emotional video into a reduced set of key-frames, which are taught to visually discriminate between the emotions. In order to do so, a convolutional neural network is applied to key-frames summarizing videos. Finally, confidence outputs of all the classifiers from all the modalities are used to define a new feature space to be learned for final emotion label prediction, in a late fusion/stacking fashion. The experiments conducted on the SAVEE, eNTERFACE’05, and RML databases show significant performance improvements by our proposed system in comparison to current alternatives, defining the current state-of-the-art in all three databases. | ||||
Address | 1 Jan.-March 2019 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA; 602.143; 602.133 | Approved | no | ||
Call Number | Admin @ si @ NMN2017 | Serial | 3011 | ||
Permanent link to this record | |||||
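The late fusion/stacking scheme summarized in the abstract above can be sketched as follows. The toy confidence values and the nearest-centroid meta-rule are illustrative assumptions; the paper learns its own meta-classifier on the stacked confidence features.

```python
import numpy as np

def stack_confidences(audio_conf, visual_conf):
    """Concatenate per-modality classifier confidences into a new
    feature space, one row per sample (late fusion / stacking)."""
    return np.concatenate([audio_conf, visual_conf], axis=1)

class NearestCentroidMeta:
    """Stand-in meta-classifier (an assumption for this sketch; the
    paper trains a learned model on the stacked confidences)."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0)
                                    for c in self.classes_])
        return self

    def predict(self, X):
        # Squared Euclidean distance of each sample to each centroid.
        d = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(-1)
        return self.classes_[d.argmin(axis=1)]

# Toy per-modality confidences for two emotion classes.
audio = np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]])
visual = np.array([[0.7, 0.3], [0.6, 0.4], [0.3, 0.7], [0.2, 0.8]])
labels = np.array([0, 0, 1, 1])

X = stack_confidences(audio, visual)          # shape (4, 4)
pred = NearestCentroidMeta().fit(X, labels).predict(X)
```

The key design point is that the meta-model sees only classifier confidences, not raw features, so modalities of very different dimensionality combine on equal footing.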
Author | Sergio Escalera; Xavier Baro; Hugo Jair Escalante; Isabelle Guyon | ||||
Title | ChaLearn Looking at People: A Review of Events and Resources | Type | Conference Article | ||
Year | 2017 | Publication | 30th International Joint Conference on Neural Networks | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | This paper reviews the history of ChaLearn Looking at People (LAP) events. We started in 2011 (with the release of the first Kinect device) to run challenges related to human action/activity and gesture recognition. Since then we have regularly organized events in a series of competitions covering all aspects of visual analysis of humans. So far we have organized more than 10 international challenges and events in this field. This paper reviews the associated events and introduces the ChaLearn LAP platform, where public resources (including code, data and preprints of papers) related to the organized events are available. We also provide a discussion on perspectives of ChaLearn LAP activities. | ||||
Address | Anchorage; Alaska; USA; May 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | IJCNN | ||
Notes | HuPBA; 602.143 | Approved | no | ||
Call Number | Admin @ si @ EBE2017 | Serial | 3012 | ||
Permanent link to this record | |||||
Author | Eirikur Agustsson; Radu Timofte; Sergio Escalera; Xavier Baro; Isabelle Guyon; Rasmus Rothe | ||||
Title | Apparent and real age estimation in still images with deep residual regressors on APPA-REAL database | Type | Conference Article | ||
Year | 2017 | Publication | 12th IEEE International Conference on Automatic Face and Gesture Recognition | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | After decades of research, real (biological) age estimation from a single face image has reached maturity thanks to the availability of large public face databases and the impressive accuracies achieved by recently proposed methods. The estimation of “apparent age” is a related task concerning the age perceived by human observers. Significant advances have also been made in this new research direction with the recent Looking At People challenges. In this paper we make several contributions to age estimation research. (i) We introduce APPA-REAL, a large face image database with both real and apparent age annotations. (ii) We study the relationship between real and apparent age. (iii) We develop a residual age regression method to further improve the performance. (iv) We show that real age estimation can be successfully tackled as apparent age estimation followed by an apparent-to-real age residual regression. (v) We graphically reveal the facial regions on which the CNN focuses in order to perform apparent and real age estimation tasks. | ||||
Address | Washington; USA; May 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | FG | ||
Notes | HUPBA; no menciona | Approved | no | ||
Call Number | Admin @ si @ ATE2017 | Serial | 3013 | ||
Permanent link to this record | |||||
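Contribution (iv) of the record above treats real age as an apparent-age estimate plus a learned residual. A minimal sketch of that idea, using ordinary least squares on synthetic numbers (the paper uses deep residual regressors on image features; the linear model and toy data here are assumptions for illustration):

```python
import numpy as np

def fit_residual(apparent_pred, real_age, features):
    """Fit a linear model (with bias) predicting the residual
    real_age - apparent_pred from per-image features."""
    X = np.column_stack([features, np.ones(len(features))])
    w, *_ = np.linalg.lstsq(X, real_age - apparent_pred, rcond=None)
    return w

def predict_real(apparent_pred, features, w):
    """Correct the apparent-age estimate with the learned residual."""
    X = np.column_stack([features, np.ones(len(features))])
    return apparent_pred + X @ w

# Synthetic data where the true residual is 2 * feature + 1.
apparent = np.array([20.0, 30.0, 40.0, 50.0])
feats = np.array([[1.0], [2.0], [3.0], [4.0]])
real = apparent + 2.0 * feats.ravel() + 1.0

w = fit_residual(apparent, real, feats)
corrected = predict_real(apparent, feats, w)
```

Because the residual model only has to capture the systematic gap between perceived and biological age, it can be much smaller than the apparent-age estimator it corrects.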
Author | Mohammad Ali Bagheri; Qigang Gao; Sergio Escalera; Huamin Ren; Thomas B. Moeslund; Elham Etemad | ||||
Title | Locality Regularized Group Sparse Coding for Action Recognition | Type | Journal Article | ||
Year | 2017 | Publication | Computer Vision and Image Understanding | Abbreviated Journal | CVIU |
Volume | 158 | Issue | Pages | 106-114 | |
Keywords | Bag of words; Feature encoding; Locality constrained coding; Group sparse coding; Alternating direction method of multipliers; Action recognition | ||||
Abstract | Bag of visual words (BoVW) models are widely utilized in image/video representation and recognition. The cornerstone of these models is the encoding stage, in which local features are decomposed over a codebook in order to obtain a representation of features. In this paper, we propose a new encoding algorithm that jointly encodes the set of local descriptors of each sample while considering the locality structure of descriptors. The proposed method takes advantage of locality coding, such as its stability and robustness to noise in descriptors, as well as the strengths of the group coding strategy, by taking into account the potential relations among descriptors of a sample. To efficiently implement our proposed method, we adopt the Alternating Direction Method of Multipliers (ADMM) framework, which results in quadratic complexity in the problem size. The method is employed for a challenging classification problem: action recognition by depth cameras. Experimental results demonstrate that our methodology outperforms the state of the art on the considered datasets. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ BGE2017 | Serial | 3014 | ||
Permanent link to this record | |||||
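The locality idea in the abstract above can be illustrated with a simplified, per-descriptor relative of the paper's method: locality-constrained (LLC-style) encoding over the k nearest codebook atoms. This sketch omits the joint group-sparse ADMM solver; the codebook and descriptor are toy values.

```python
import numpy as np

def llc_encode(x, codebook, k=2, eps=1e-6):
    """Locality-constrained encoding of one descriptor: reconstruct x
    from its k nearest codebook atoms with sum-to-one weights. A
    simplified stand-in for the paper's locality regularized group
    sparse coding (no joint coding across a sample's descriptors)."""
    dists = ((codebook - x) ** 2).sum(axis=1)
    idx = np.argsort(dists)[:k]           # k nearest atoms
    z = codebook[idx] - x                 # basis shifted to x
    C = z @ z.T + eps * np.eye(k)         # regularized local covariance
    w = np.linalg.solve(C, np.ones(k))
    w /= w.sum()                          # sum-to-one constraint
    code = np.zeros(codebook.shape[0])
    code[idx] = w
    return code

codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
x = np.array([0.5, 0.0])
code = llc_encode(x, codebook, k=2)
```

Only the two nearest atoms receive nonzero weight, which is the stability property (robustness of the code to distant, irrelevant atoms) the abstract refers to.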
Author | Miguel Angel Bautista; Oriol Pujol; Fernando De la Torre; Sergio Escalera | ||||
Title | Error-Correcting Factorization | Type | Journal Article | ||
Year | 2018 | Publication | IEEE Transactions on Pattern Analysis and Machine Intelligence | Abbreviated Journal | TPAMI |
Volume | 40 | Issue | Pages | 2388-2401 | |
Keywords | |||||
Abstract | Error Correcting Output Codes (ECOC) is a successful technique for multi-class classification, which is a core problem in Pattern Recognition and Machine Learning. A major advantage of ECOC over other methods is that the multi-class problem is decoupled into a set of binary problems that are solved independently. However, the literature defines a general error-correcting capability for ECOCs without analyzing how it distributes among classes, hindering a deeper analysis of pair-wise error-correction. To address these limitations, this paper proposes an Error-Correcting Factorization (ECF) method. Our contribution is fourfold: (I) we propose a novel representation of the error-correction capability, called the design matrix, that enables us to build an ECOC on the basis of allocating correction to pairs of classes; (II) we derive the optimal code length of an ECOC using rank properties of the design matrix; (III) we formulate ECF as a discrete optimization problem and find a relaxed solution using an efficient constrained block coordinate descent approach; (IV) enabled by the flexibility introduced with the design matrix, we propose to allocate the error-correction to classes that are prone to confusion. Experimental results on several databases show that when allocating the error-correction to confusable classes, ECF outperforms state-of-the-art approaches. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0162-8828 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | HuPBA; no menciona | Approved | no | ||
Call Number | Admin @ si @ BPT2018 | Serial | 3015 | ||
Permanent link to this record | |||||
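The pair-wise error-correction notion that ECF makes explicit can be sketched directly from an ECOC coding matrix: the Hamming distance between two class codewords determines how many binary-classifier errors that pair can absorb. The coding matrix below is an illustrative example, not one from the paper.

```python
import numpy as np

def pairwise_correction(M):
    """For an ECOC coding matrix M (rows = class codewords in {-1,+1}),
    return the pairwise Hamming distances D and the number of binary
    classifier errors each class pair can absorb, floor((D-1)/2)."""
    D = (M[:, None, :] != M[None, :, :]).sum(axis=2)
    return D, np.maximum(D - 1, 0) // 2

def decode(M, y):
    """Assign the class whose codeword is Hamming-closest to the
    vector y of binary classifier outputs."""
    return int(np.argmin((M != y).sum(axis=1)))

# Illustrative 4-class code: Hadamard-style rows with each column
# repeated, giving pairwise distance 4 (each pair corrects 1 error).
H = np.array([[ 1,  1,  1,  1],
              [ 1, -1,  1, -1],
              [ 1,  1, -1, -1],
              [ 1, -1, -1,  1]])
M = np.repeat(H, 2, axis=1)

D, C = pairwise_correction(M)
y = M[2].copy()
y[0] *= -1                  # one binary classifier votes wrongly
recovered = decode(M, y)    # still decodes to class 2
```

ECF's design matrix generalizes exactly this per-pair view: instead of one global correction capability, the correction budget is allocated to the class pairs that need it most.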
Author | Patricia Suarez; Angel Sappa; Boris X. Vintimilla | ||||
Title | Colorizing Infrared Images through a Triplet Conditional DCGAN Architecture | Type | Conference Article | ||
Year | 2017 | Publication | 19th international conference on image analysis and processing | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | CNN in Multispectral Imaging; Image Colorization | ||||
Abstract | This paper focuses on near infrared (NIR) image colorization using a Conditional Deep Convolutional Generative Adversarial Network (CDCGAN) architecture. The proposed architecture is based on a conditional probabilistic generative model. First, it learns to colorize the given input image using a triplet model architecture that tackles each channel independently. In the proposed model, the final layer of the red channel considers the infrared image to enhance the details, resulting in a sharp RGB image. Then, in a second stage, a discriminative model is used to estimate the probability that the generated image came from the training dataset rather than being automatically generated. Experimental results with a large set of real images are provided, showing the validity of the proposed approach. Additionally, the proposed approach is compared with a state-of-the-art approach, showing better results. | ||||
Address | Catania; Italy; September 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICIAP | ||
Notes | ADAS; MSIAU; 600.086; 600.122; 600.118 | Approved | no | ||
Call Number | Admin @ si @ SSV2017c | Serial | 3016 | ||
Permanent link to this record | |||||
Author | I. Sorodoc; S. Pezzelle; A. Herbelot; Mariella Dimiccoli; R. Bernardi | ||||
Title | Learning quantification from images: A structured neural architecture | Type | Journal Article | ||
Year | 2018 | Publication | Natural Language Engineering | Abbreviated Journal | NLE |
Volume | 24 | Issue | 3 | Pages | 363-392 |
Keywords | |||||
Abstract | Major advances have recently been made in merging language and vision representations. Most tasks considered so far have confined themselves to the processing of objects and lexicalised relations amongst objects (content words). We know, however, that humans (even pre-school children) can abstract over raw multimodal data to perform certain types of higher level reasoning, expressed in natural language by function words. A case in point is given by their ability to learn quantifiers, i.e. expressions like few, some and all. From formal semantics and cognitive linguistics, we know that quantifiers are relations over sets which, as a simplification, we can see as proportions. For instance, in most fish are red, most encodes the proportion of fish which are red fish. In this paper, we study how well current neural network strategies model such relations. We propose a task where, given an image and a query expressed by an object–property pair, the system must return a quantifier expressing which proportions of the queried object have the queried property. Our contributions are twofold. First, we show that the best performance on this task involves coupling state-of-the-art attention mechanisms with a network architecture mirroring the logical structure assigned to quantifiers by classic linguistic formalisation. Second, we introduce a new balanced dataset of image scenarios associated with quantification queries, which we hope will foster further research in this area. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; no menciona | Approved | no | ||
Call Number | Admin @ si @ SPH2018 | Serial | 3021 | ||
Permanent link to this record | |||||
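The quantification task in the record above maps a proportion (e.g. the share of fish that are red) to a quantifier word. A minimal rule-based sketch follows; the quantifier set and thresholds are hypothetical illustrations, since the paper's network learns this mapping end-to-end from images.

```python
def quantify(n_with_property, n_total):
    """Map the proportion of queried objects having the queried
    property to a quantifier label. Thresholds are hypothetical;
    the paper learns the mapping rather than using fixed cut-offs."""
    if n_total <= 0:
        raise ValueError("query set must be non-empty")
    p = n_with_property / n_total
    if p == 0.0:
        return "no"
    if p <= 0.2:
        return "few"
    if p < 0.6:
        return "some"
    if p < 1.0:
        return "most"
    return "all"

# e.g. "most fish are red": 8 of the 10 fish in the scene are red.
label = quantify(8, 10)
```

The hard part the paper addresses is that neither count is given: the system must estimate both set sizes from raw pixels before any such proportional judgment can be made.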
Author | Maedeh Aghaei; Mariella Dimiccoli; C. Canton-Ferrer; Petia Radeva | ||||
Title | Towards social pattern characterization from egocentric photo-streams | Type | Journal Article | ||
Year | 2018 | Publication | Computer Vision and Image Understanding | Abbreviated Journal | CVIU |
Volume | 171 | Issue | Pages | 104-117 | |
Keywords | Social pattern characterization; Social signal extraction; Lifelogging; Convolutional and recurrent neural networks | ||||
Abstract | Following the increasingly popular trend of social interaction analysis in egocentric vision, this article presents a comprehensive pipeline for automatic social pattern characterization of a wearable photo-camera user. The proposed framework relies solely on the visual analysis of egocentric photo-streams and consists of three major steps. The first step is to detect social interactions of the user, where the impact of several social signals on the task is explored. The detected social events are inspected in the second step for categorization into different social meetings. These two steps act at event level, where each potential social event is modeled as a multi-dimensional time series whose dimensions correspond to a set of relevant features for each task; finally, an LSTM is employed to classify the time series. The last step of the framework is to characterize the social patterns of the user. Our goal is to quantify the duration, the diversity and the frequency of the user's social relations in various social situations. This goal is achieved by discovering recurrences of the same people across the whole set of social events related to the user. Experimental evaluation over EgoSocialStyle (the dataset proposed in this work) and EGO-GROUP demonstrates promising results on the task of social pattern characterization from egocentric photo-streams. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; no proj | Approved | no | ||
Call Number | Admin @ si @ ADC2018 | Serial | 3022 | ||
Permanent link to this record | |||||
Author | Alejandro Cartas; Mariella Dimiccoli; Petia Radeva | ||||
Title | Batch-based activity recognition from egocentric photo-streams | Type | Conference Article | ||
Year | 2017 | Publication | 1st International workshop on Egocentric Perception, Interaction and Computing | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Activity recognition from long unstructured egocentric photo-streams has several applications in assistive technology, such as health monitoring and frailty detection, to name a few. However, one of its main technical challenges is dealing with the low frame rate of wearable photo-cameras, which causes abrupt appearance changes between consecutive frames. In consequence, important discriminatory low-level motion features such as optical flow cannot be estimated. In this paper, we present a batch-driven approach for training a deep learning architecture that strongly relies on long short-term memory (LSTM) units to tackle this problem. We propose two different implementations of the same approach that process a photo-stream sequence using batches of fixed size, with the goal of capturing the temporal evolution of high-level features. The main difference between these implementations is that one explicitly models consecutive batches by overlapping them. Experimental results over a public dataset acquired by three users demonstrate the validity of the proposed architectures to exploit the temporal evolution of convolutional features over time without relying on event boundaries. | ||||
Address | Venice; Italy; October 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICCV - EPIC | ||
Notes | MILAB; no menciona | Approved | no | ||
Call Number | Admin @ si @ CDR2017 | Serial | 3023 | ||
Permanent link to this record | |||||
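The two batch variants described in the abstract above (disjoint fixed-size batches vs. explicitly overlapped consecutive batches) reduce to a sliding-window cut over the photo-stream. A sketch with illustrative sizes (the paper feeds such batches of CNN features to an LSTM; the specific values here are assumptions):

```python
def make_batches(frames, batch_size, overlap=0):
    """Cut a photo-stream into fixed-size batches. overlap=0 gives
    the disjoint variant; overlap>0 gives the variant that explicitly
    models consecutive batches by sharing frames between them."""
    step = batch_size - overlap
    if not 0 < step <= batch_size:
        raise ValueError("overlap must be in [0, batch_size)")
    return [frames[s:s + batch_size]
            for s in range(0, len(frames) - batch_size + 1, step)]

stream = list(range(10))                 # stand-in for 10 frames
disjoint = make_batches(stream, 4)             # [[0..3], [4..7]]
overlapped = make_batches(stream, 4, overlap=2)
```

Overlapping trades extra computation for continuity: the LSTM state at a batch boundary is conditioned on frames it will see again, softening the abrupt appearance changes caused by the low frame rate.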
Author | Aniol Lidon; Marc Bolaños; Mariella Dimiccoli; Petia Radeva; Maite Garolera; Xavier Giro | ||||
Title | Semantic Summarization of Egocentric Photo-Stream Events | Type | Conference Article | ||
Year | 2017 | Publication | 2nd Workshop on Lifelogging Tools and Applications | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | San Francisco; USA; October 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-1-4503-5503-2 | Medium | ||
Area | Expedition | Conference | ACMW (LTA) | ||
Notes | MILAB; no proj | Approved | no | ||
Call Number | Admin @ si @ LBD2017 | Serial | 3024 | ||
Permanent link to this record | |||||
Author | Maedeh Aghaei; Mariella Dimiccoli; Petia Radeva | ||||
Title | All the people around me: face clustering in egocentric photo streams | Type | Conference Article | ||
Year | 2017 | Publication | 24th International Conference on Image Processing | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | face discovery; face clustering; deepmatching; bag-of-tracklets; egocentric photo-streams | ||||
Abstract | Given an unconstrained stream of images captured by a wearable photo-camera (2 fpm), we propose an unsupervised bottom-up approach for automatically clustering the appearing faces into the individual identities present in these data. The problem is challenging since images are acquired under real-world conditions; hence the visible appearance of the people in the images undergoes intensive variations. Our proposed pipeline consists of first arranging the photo-stream into events, then localizing the appearance of multiple people in them, and finally grouping the various appearances of the same person across different events. Experimental results on a dataset acquired by wearing a photo-camera during one month demonstrate the effectiveness of the proposed approach for the considered purpose. (arXiv:1703.01790) | ||||
Address | Beijing; China; September 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICIP | ||
Notes | MILAB; no menciona | Approved | no | ||
Call Number | Admin @ si @ EDR2017 | Serial | 3025 | ||
Permanent link to this record | |||||
Author | Laura Igual; Santiago Segui | ||||
Title | Introduction to Data Science – A Python Approach to Concepts, Techniques and Applications. Undergraduate Topics in Computer Science | Type | Book Whole | ||
Year | 2017 | Publication | Abbreviated Journal | ||
Volume | Issue | Pages | 1-215 | ||
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer | Place of Publication | Editor | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-3-319-50016-4 | Medium | ||
Area | Expedition | Conference | |||
Notes | MILAB | Approved | no | ||
Call Number | Admin @ si @ IgS2017 | Serial | 3027 | ||
Permanent link to this record |