Records | |||||
---|---|---|---|---|---|
Author | Mohammad Rouhani; Angel Sappa; E. Boyer | ||||
Title | Implicit B-Spline Surface Reconstruction | Type | Journal Article | ||
Year | 2015 | Publication | IEEE Transactions on Image Processing | Abbreviated Journal | TIP |
Volume | 24 | Issue | 1 | Pages | 22 - 32 |
Keywords | |||||
Abstract | This paper presents a fast and flexible curve, and surface reconstruction technique based on implicit B-spline. This representation does not require any parameterization and it is locally supported. This fact has been exploited in this paper to propose a reconstruction technique through solving a sparse system of equations. This method is further accelerated to reduce the dimension to the active control lattice. Moreover, the surface smoothness and user interaction are allowed for controlling the surface. Finally, a novel weighting technique has been introduced in order to blend small patches and smooth them in the overlapping regions. The whole framework is very fast and efficient and can handle large cloud of points with very low computational cost. The experimental results show the flexibility and accuracy of the proposed algorithm to describe objects with complex topologies. Comparisons with other fitting methods highlight the superiority of the proposed approach in the presence of noise and missing data. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1057-7149 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS; 600.076 | Approved | no | ||
Call Number | Admin @ si @ RSB2015 | Serial | 2541 | ||
Author | Alvaro Cepero; Albert Clapes; Sergio Escalera | ||||
Title | Automatic non-verbal communication skills analysis: a quantitative evaluation | Type | Journal Article | ||
Year | 2015 | Publication | AI Communications | Abbreviated Journal | AIC |
Volume | 28 | Issue | 1 | Pages | 87-101 |
Keywords | Social signal processing; human behavior analysis; multi-modal data description; multi-modal data fusion; non-verbal communication analysis; e-Learning | ||||
Abstract | The oral communication competence is defined on the top of the most relevant skills for one's professional and personal life. Because of the importance of communication in our activities of daily living, it is crucial to study methods to evaluate and provide the necessary feedback that can be used in order to improve these communication capabilities and, therefore, learn how to express ourselves better. In this work, we propose a system capable of evaluating quantitatively the quality of oral presentations in an automatic fashion. The system is based on a multi-modal RGB, depth, and audio data description and a fusion approach in order to recognize behavioral cues and train classifiers able to eventually predict communication quality levels. The performance of the proposed system is tested on a novel dataset containing Bachelor thesis' real defenses, presentations from an 8th semester Bachelor courses, and Master courses' presentations at Universitat de Barcelona. Using as groundtruth the marks assigned by actual instructors, our system achieves high performance categorizing and ranking presentations by their quality, and also making real-valued mark predictions. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0921-7126 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | HUPBA;MILAB | Approved | no | ||
Call Number | Admin @ si @ CCE2015 | Serial | 2549 | ||
Author | G. Zahnd; Simone Balocco; A. Serusclat; P. Moulin; M. Orkisz; D. Vray | ||||
Title | Progressive attenuation of the longitudinal kinetics in the common carotid artery: preliminary in vivo assessment | Type | Journal Article | ||
Year | 2015 | Publication | Ultrasound in Medicine and Biology | Abbreviated Journal | UMB |
Volume | 41 | Issue | 1 | Pages | 339-345 |
Keywords | Arterial stiffness; Atherosclerosis; Common carotid artery; Longitudinal kinetics; Motion tracking; Ultrasound imaging | ||||
Abstract | Longitudinal kinetics (LOKI) of the arterial wall consists of the shearing motion of the intima-media complex over the adventitia layer in the direction parallel to the blood flow during the cardiac cycle. The aim of this study was to investigate the local variability of LOKI amplitude along the length of the vessel. By use of a previously validated motion-estimation framework, 35 in vivo longitudinal B-mode ultrasound cine loops of healthy common carotid arteries were analyzed. Results indicated that LOKI amplitude is progressively attenuated along the length of the artery, as it is larger in regions located on the proximal side of the image (i.e., toward the heart) and smaller in regions located on the distal side of the image (i.e., toward the head), with an average attenuation coefficient of -2.5 ± 2.0%/mm. Reported for the first time in this study, this phenomenon is likely to be of great importance in improving understanding of atherosclerosis mechanisms, and has the potential to be a novel index of arterial stiffness. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB | Approved | no | ||
Call Number | Admin @ si @ ZBS2014 | Serial | 2556 | ||
Author | Lluis Pere de las Heras; Oriol Ramos Terrades; Sergi Robles; Gemma Sanchez | ||||
Title | CVC-FP and SGT: a new database for structural floor plan analysis and its groundtruthing tool | Type | Journal Article | ||
Year | 2015 | Publication | International Journal on Document Analysis and Recognition | Abbreviated Journal | IJDAR |
Volume | 18 | Issue | 1 | Pages | 15-30 |
Keywords | |||||
Abstract | Recent results on structured learning methods have shown the impact of structural information in a wide range of pattern recognition tasks. In the field of document image analysis, there is a long experience on structural methods for the analysis and information extraction of multiple types of documents. Yet, the lack of conveniently annotated and free access databases has not benefited the progress in some areas such as technical drawing understanding. In this paper, we present a floor plan database, named CVC-FP, that is annotated for the architectural objects and their structural relations. To construct this database, we have implemented a groundtruthing tool, the SGT tool, that allows to make specific this sort of information in a natural manner. This tool has been made for general purpose groundtruthing: It allows to define own object classes and properties, multiple labeling options are possible, grants the cooperative work, and provides user and version control. We finally have collected some of the recent work on floor plan interpretation and present a quantitative benchmark for this database. Both CVC-FP database and the SGT tool are freely released to the research community to ease comparisons between methods and boost reproducible research. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1433-2833 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG; ADAS; 600.061; 600.076; 600.077 | Approved | no | ||
Call Number | Admin @ si @ HRR2015 | Serial | 2567 | ||
Author | Miguel Angel Bautista; Antonio Hernandez; Sergio Escalera; Laura Igual; Oriol Pujol; Josep Moya; Veronica Violant; Maria Teresa Anguera | ||||
Title | A Gesture Recognition System for Detecting Behavioral Patterns of ADHD | Type | Journal Article | ||
Year | 2016 | Publication | IEEE Transactions on Systems, Man, and Cybernetics, Part B | Abbreviated Journal | TSMCB |
Volume | 46 | Issue | 1 | Pages | 136-147 |
Keywords | Gesture Recognition; ADHD; Gaussian Mixture Models; Convex Hulls; Dynamic Time Warping; Multi-modal RGB-Depth data | ||||
Abstract | We present an application of gesture recognition using an extension of Dynamic Time Warping (DTW) to recognize behavioural patterns of Attention Deficit Hyperactivity Disorder (ADHD). We propose an extension of DTW using one-class classifiers in order to be able to encode the variability of a gesture category, and thus, perform an alignment between a gesture sample and a gesture class. We model the set of gesture samples of a certain gesture category using either GMMs or an approximation of Convex Hulls. Thus, we add a theoretical contribution to classical warping path in DTW by including local modeling of intra-class gesture variability. This methodology is applied in a clinical context, detecting a group of ADHD behavioural patterns defined by experts in psychology/psychiatry, to provide support to clinicians in the diagnose procedure. The proposed methodology is tested on a novel multi-modal dataset (RGB plus Depth) of ADHD children recordings with behavioural patterns. We obtain satisfying results when compared to standard state-of-the-art approaches in the DTW context. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA; MILAB; | Approved | no | ||
Call Number | Admin @ si @ BHE2016 | Serial | 2566 | ||
Author | Sergio Vera; Miguel Angel Gonzalez Ballester; Debora Gil | ||||
Title | A Novel Cochlear Reference Frame Based On The Laplace Equation | Type | Conference Article | ||
Year | 2015 | Publication | 29th International Congress and Exhibition on Computer Assisted Radiology and Surgery | Abbreviated Journal | |
Volume | 10 | Issue | 1 | Pages | 1-312 |
Keywords | |||||
Abstract | Poster | ||||
Address | Barcelona; Spain; June 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CARS | ||
Notes | IAM; 600.075 | Approved | no | ||
Call Number | Admin @ si @ VGG2015 | Serial | 2615 | ||
Author | Marc Bolaños; Mariella Dimiccoli; Petia Radeva | ||||
Title | Towards Storytelling from Visual Lifelogging: An Overview | Type | Journal Article | ||
Year | 2017 | Publication | IEEE Transactions on Human-Machine Systems | Abbreviated Journal | THMS |
Volume | 47 | Issue | 1 | Pages | 77 - 90 |
Keywords | |||||
Abstract | Visual lifelogging consists of acquiring images that capture the daily experiences of the user by wearing a camera over a long period of time. The pictures taken offer considerable potential for knowledge mining concerning how people live their lives, hence, they open up new opportunities for many potential applications in fields including healthcare, security, leisure and the quantified self. However, automatically building a story from a huge collection of unstructured egocentric data presents major challenges. This paper provides a thorough review of advances made so far in egocentric data analysis, and in view of the current state of the art, indicates new lines of research to move us towards storytelling from visual lifelogging. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; 601.235 | Approved | no | ||
Call Number | Admin @ si @ BDR2017 | Serial | 2712 | ||
Author | Antonio Hernandez; Sergio Escalera; Stan Sclaroff | ||||
Title | Poselet-based Contextual Rescoring for Human Pose Estimation via Pictorial Structures | Type | Journal Article | ||
Year | 2016 | Publication | International Journal of Computer Vision | Abbreviated Journal | IJCV |
Volume | 118 | Issue | 1 | Pages | 49–64 |
Keywords | Contextual rescoring; Poselets; Human pose estimation | ||||
Abstract | In this paper we propose a contextual rescoring method for predicting the position of body parts in a human pose estimation framework. A set of poselets is incorporated in the model, and their detections are used to extract spatial and score-related features relative to other body part hypotheses. A method is proposed for the automatic discovery of a compact subset of poselets that covers the different poses in a set of validation images while maximizing precision. A rescoring mechanism is defined as a set-based boosting classifier that computes a new score for each body joint detection, given its relationship to detections of other body joints and mid-level parts in the image. This new score is incorporated in the pictorial structure model as an additional unary potential, following the recent work of Pishchulin et al. Experiments on two benchmarks show comparable results to Pishchulin et al. while reducing the size of the mid-level representation by an order of magnitude, reducing the execution time by 68 % accordingly. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer US | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0920-5691 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | HuPBA;MILAB; | Approved | no | ||
Call Number | Admin @ si @ HES2016 | Serial | 2719 | ||
Author | Isabelle Guyon; Imad Chaabane; Hugo Jair Escalante; Sergio Escalera; Damir Jajetic; James Robert Lloyd; Nuria Macia; Bisakha Ray; Lukasz Romaszko; Michele Sebag; Alexander Statnikov; Sebastien Treguer; Evelyne Viegas | ||||
Title | A brief Review of the ChaLearn AutoML Challenge: Any-time Any-dataset Learning without Human Intervention | Type | Conference Article | ||
Year | 2016 | Publication | AutoML Workshop | Abbreviated Journal | |
Volume | Issue | 1 | Pages | 1-8 | |
Keywords | AutoML Challenge; machine learning; model selection; meta-learning; representation learning; active learning | ||||
Abstract | The ChaLearn AutoML Challenge team conducted a large scale evaluation of fully automatic, black-box learning machines for feature-based classification and regression problems. The test bed was composed of 30 data sets from a wide variety of application domains and ranged across different types of complexity. Over six rounds, participants succeeded in delivering AutoML software capable of being trained and tested without human intervention. Although improvements can still be made to close the gap between human-tweaked and AutoML models, this competition contributes to the development of fully automated environments by challenging practitioners to solve problems under specific constraints and sharing their approaches; the platform will remain available for post-challenge submissions at http://codalab.org/AutoML. | ||||
Address | New York; USA; June 2016 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICML | ||
Notes | HuPBA;MILAB | Approved | no | ||
Call Number | Admin @ si @ GCE2016 | Serial | 2769 | ||
Author | Maria Salamo; Inmaculada Rodriguez; Maite Lopez; Anna Puig; Simone Balocco; Mariona Taule | ||||
Title | Recurso docente para la atención de la diversidad en el aula mediante la predicción de notas | Type | Journal | ||
Year | 2016 | Publication | ReVision | Abbreviated Journal | |
Volume | 9 | Issue | 1 | Pages | |
Keywords | Machine learning; Grade prediction system; Teaching tool | ||||
Abstract | Since the implementation of the European Higher Education Area (EHEA) across degree programmes, the need has become apparent for mechanisms that address diversity in the classroom by evaluating automatically and providing rapid feedback to both students and teaching staff on students' progress in a course. This article presents an evaluation of the prediction accuracy of GRADEFORESEER, a teaching resource for grade prediction based on machine learning techniques that makes it possible to assess students' progress and estimate their final grade at the end of the course. The resource is complemented by a user interface for teaching staff that can be used on different software platforms (operating systems) and in any degree course that uses continuous assessment. In addition to describing the resource, this article presents the results obtained by applying the prediction system to four courses from different disciplines: Programming I (PI) and Software Design (DSW) from the Computer Engineering degree, Information and Communication Technologies (TIC) from the Linguistics degree, and Fundamentals of Technology (FDT) from the Information and Documentation degree, all taught at the Universidad de Barcelona. Predictive capability was evaluated in binary form (pass or fail) and according to a range criterion (fail, pass, good, or excellent), with better predictions obtained for the binary evaluation. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; | Approved | no | ||
Call Number | Admin @ si @ SRL2016 | Serial | 2820 | ||
Author | H. Martin Kjer; Jens Fagertun; Sergio Vera; Debora Gil; Miguel Angel Gonzalez Ballester; Rasmus R. Paulsen | ||||
Title | Free-form image registration of human cochlear uCT data using skeleton similarity as anatomical prior | Type | Journal Article | ||
Year | 2016 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 76 | Issue | 1 | Pages | 76-82 |
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | IAM; 600.060 | Approved | no | ||
Call Number | Admin @ si @ MFV2017b | Serial | 2941 | ||
Author | Karim Lekadir; Alfiia Galimzianova; Angels Betriu; Maria del Mar Vila; Laura Igual; Daniel L. Rubin; Elvira Fernandez-Giraldez; Petia Radeva; Sandy Napel | ||||
Title | A Convolutional Neural Network for Automatic Characterization of Plaque Composition in Carotid Ultrasound | Type | Journal Article | ||
Year | 2017 | Publication | IEEE Journal of Biomedical and Health Informatics | Abbreviated Journal | J-BHI |
Volume | 21 | Issue | 1 | Pages | 48-55 |
Keywords | |||||
Abstract | Characterization of carotid plaque composition, more specifically the amount of lipid core, fibrous tissue, and calcified tissue, is an important task for the identification of plaques that are prone to rupture, and thus for early risk estimation of cardiovascular and cerebrovascular events. Due to its low costs and wide availability, carotid ultrasound has the potential to become the modality of choice for plaque characterization in clinical practice. However, its significant image noise, coupled with the small size of the plaques and their complex appearance, makes it difficult for automated techniques to discriminate between the different plaque constituents. In this paper, we propose to address this challenging problem by exploiting the unique capabilities of the emerging deep learning framework. More specifically, and unlike existing works which require a priori definition of specific imaging features or thresholding values, we propose to build a convolutional neural network (CNN) that will automatically extract from the images the information that is optimal for the identification of the different plaque constituents. We used approximately 90 000 patches extracted from a database of images and corresponding expert plaque characterizations to train and to validate the proposed CNN. The results of cross-validation experiments show a correlation of about 0.90 with the clinical assessment for the estimation of lipid core, fibrous cap, and calcified tissue areas, indicating the potential of deep learning for the challenging task of automatic characterization of plaque composition in carotid ultrasound. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; no menciona | Approved | no | ||
Call Number | Admin @ si @ LGB2017 | Serial | 2931 | ||
Author | Fatemeh Noroozi; Marina Marjanovic; Angelina Njegus; Sergio Escalera; Gholamreza Anbarjafari | ||||
Title | Audio-Visual Emotion Recognition in Video Clips | Type | Journal Article | ||
Year | 2019 | Publication | IEEE Transactions on Affective Computing | Abbreviated Journal | TAC |
Volume | 10 | Issue | 1 | Pages | 60-75 |
Keywords | |||||
Abstract | This paper presents a multimodal emotion recognition system, which is based on the analysis of audio and visual cues. From the audio channel, Mel-Frequency Cepstral Coefficients, Filter Bank Energies and prosodic features are extracted. For the visual part, two strategies are considered. First, facial landmarks’ geometric relations, i.e. distances and angles, are computed. Second, we summarize each emotional video into a reduced set of key-frames, which are taught to visually discriminate between the emotions. In order to do so, a convolutional neural network is applied to key-frames summarizing videos. Finally, confidence outputs of all the classifiers from all the modalities are used to define a new feature space to be learned for final emotion label prediction, in a late fusion/stacking fashion. The experiments conducted on the SAVEE, eNTERFACE’05, and RML databases show significant performance improvements by our proposed system in comparison to current alternatives, defining the current state-of-the-art in all three databases. | ||||
Address | 1 Jan.-March 2019 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA; 602.143; 602.133 | Approved | no | ||
Call Number | Admin @ si @ NMN2017 | Serial | 3011 | ||
Author | Mark Philip Philipsen; Jacob Velling Dueholm; Anders Jorgensen; Sergio Escalera; Thomas B. Moeslund | ||||
Title | Organ Segmentation in Poultry Viscera Using RGB-D | Type | Journal Article | ||
Year | 2018 | Publication | Sensors | Abbreviated Journal | SENS |
Volume | 18 | Issue | 1 | Pages | 117 |
Keywords | semantic segmentation; RGB-D; random forest; conditional random field; 2D; 3D; CNN | ||||
Abstract | We present a pattern recognition framework for semantic segmentation of visual structures, that is, multi-class labelling at pixel level, and apply it to the task of segmenting organs in the eviscerated viscera from slaughtered poultry in RGB-D images. This is a step towards replacing the current strenuous manual inspection at poultry processing plants. Features are extracted from feature maps such as activation maps from a convolutional neural network (CNN). A random forest classifier assigns class probabilities, which are further refined by utilizing context in a conditional random field. The presented method is compatible with both 2D and 3D features, which allows us to explore the value of adding 3D and CNN-derived features. The dataset consists of 604 RGB-D images showing 151 unique sets of eviscerated viscera from four different perspectives. A mean Jaccard index of 78.11% is achieved across the four classes of organs by using features derived from 2D, 3D and a CNN, compared to 74.28% using only basic 2D image features. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ PVJ2018 | Serial | 3072 | ||
Author | Fahad Shahbaz Khan; Joost Van de Weijer; Muhammad Anwer Rao; Andrew Bagdanov; Michael Felsberg; Jorma | ||||
Title | Scale coding bag of deep features for human attribute and action recognition | Type | Journal Article | ||
Year | 2018 | Publication | Machine Vision and Applications | Abbreviated Journal | MVAP |
Volume | 29 | Issue | 1 | Pages | 55-71 |
Keywords | Action recognition; Attribute recognition; Bag of deep features | ||||
Abstract | Most approaches to human attribute and action recognition in still images are based on image representation in which multi-scale local features are pooled across scale into a single, scale-invariant encoding. Both in bag-of-words and the recently popular representations based on convolutional neural networks, local features are computed at multiple scales. However, these multi-scale convolutional features are pooled into a single scale-invariant representation. We argue that entirely scale-invariant image representations are sub-optimal and investigate approaches to scale coding within a bag of deep features framework. Our approach encodes multi-scale information explicitly during the image encoding stage. We propose two strategies to encode multi-scale information explicitly in the final image representation. We validate our two scale coding techniques on five datasets: Willow, PASCAL VOC 2010, PASCAL VOC 2012, Stanford-40 and Human Attributes (HAT-27). On all datasets, the proposed scale coding approaches outperform both the scale-invariant method and the standard deep features of the same network. Further, combining our scale coding approaches with standard deep features leads to consistent improvement over the state of the art. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP; 600.068; 600.079; 600.106; 600.120 | Approved | no | ||
Call Number | Admin @ si @ KWR2018 | Serial | 3107 | ||