Records | |||||
---|---|---|---|---|---|
Author | Maria Vanrell; Naila Murray; Robert Benavente; C. Alejandro Parraga; Xavier Otazu; Ramon Baldrich | ||||
Title | Perception Based Representations for Computational Colour | Type | Conference Article | ||
Year | 2011 | Publication | 3rd International Workshop on Computational Color Imaging | Abbreviated Journal | |
Volume | 6626 | Issue | Pages | 16-30 | |
Keywords |
colour perception, induction, naming, psychophysical data, saliency, segmentation | ||||
Abstract | The perceived colour of a stimulus depends on multiple factors stemming either from the context of the stimulus or from idiosyncrasies of the observer. The complexity involved in combining these multiple effects is the main reason for the gap between the classical calibrated colour spaces of colour science and the colour representations used in computer vision, where colour is just one more visual cue immersed in a digital image in which surfaces, shadows and illuminants interact seemingly out of control. With the aim of advancing a few steps towards bridging this gap, we present some results on computational representations of colour for computer vision. They have been developed by introducing perceptual considerations derived from the interaction of the colour of a point with its context. We show some techniques to represent the colour of a point influenced by assimilation and contrast effects due to the image surround, and we show some results on how colour saliency can be derived in real images. We outline a model for the automatic assignment of colour names to image points, trained directly on psychophysical data. Finally, we show how colour segments can be perceptually grouped in the image by imposing shading coherence in the colour space. | ||||
Address | Milan, Italy | ||||
Corporate Author | Thesis | ||||
Publisher | Springer-Verlag | Place of Publication | Editor | Raimondo Schettini, Shoji Tominaga, Alain Trémeau | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-3-642-20403-6 | Medium | ||
Area | Expedition | Conference | CCIW | ||
Notes | CIC | Approved | no | ||
Call Number | Admin @ si @ VMB2011 | Serial | 1733 | ||
Permanent link to this record | |||||
Author | Alicia Fornes; V.C. Kieu; M. Visani; N. Journet; Anjan Dutta | ||||
Title | The ICDAR/GREC 2013 Music Scores Competition: Staff Removal | Type | Book Chapter | ||
Year | 2014 | Publication | Graphics Recognition. Current Trends and Challenges | Abbreviated Journal | |
Volume | 8746 | Issue | Pages | 207-220 | |
Keywords |
Competition; Graphics recognition; Music scores; Writer identification; Staff removal | ||||
Abstract | The first competition on music scores, organized at ICDAR and GREC in 2011, aroused the interest of researchers, who participated in both the staff removal and writer identification tasks. In this second edition, we focus on the staff removal task and simulate a real-case scenario involving old and degraded music scores. For this purpose, we have generated a new set of semi-synthetic images using two degradation models that we previously introduced: local noise and 3D distortions. In this extended paper we provide a detailed description of the dataset, the degradation models, the evaluation metrics, the participants' methods and the obtained results that could not be presented in the ICDAR and GREC proceedings due to page limitations. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | B.Lamiroy; J.-M. Ogier | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | 0302-9743 | ISBN | 978-3-662-44853-3 | Medium | |
Area | Expedition | Conference | |||
Notes | DAG; 600.077; 600.061 | Approved | no | ||
Call Number | Admin @ si @ FKV2014 | Serial | 2581 | ||
Permanent link to this record | |||||
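The staff removal task evaluated in the two records above can be illustrated with a deliberately naive baseline: treating image rows whose foreground-pixel density exceeds a threshold as staff lines. The function name and threshold below are our own illustrative choices, not competition code; real entries are far more robust, which is exactly what the local-noise and 3D-distortion degradation models stress.

```python
import numpy as np

def remove_staff_lines(binary, thresh=0.5):
    """Naive staff-removal baseline: erase rows whose fraction of
    foreground (value 1) pixels exceeds `thresh`.

    A toy illustration of the task only; it fails on curved or degraded
    staves, which the competition's degradation models simulate.
    """
    out = binary.copy()
    row_density = binary.mean(axis=1)   # fraction of foreground per row
    out[row_density > thresh, :] = 0    # wipe rows dominated by staff lines
    return out
```

Competition metrics then compare the cleaned image against a staff-less ground truth pixel by pixel (e.g. an F-measure over foreground pixels).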
Author | V.C. Kieu; Alicia Fornes; M. Visani; N. Journet; Anjan Dutta | ||||
Title | The ICDAR/GREC 2013 Music Scores Competition on Staff Removal | Type | Conference Article | ||
Year | 2013 | Publication | 10th IAPR International Workshop on Graphics Recognition | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords |
Competition; Music scores; Staff Removal | ||||
Abstract | The first competition on music scores, organized at ICDAR and GREC in 2011, aroused the interest of researchers, who participated in both the staff removal and writer identification tasks. In this second edition, we propose a staff removal competition in which we simulate old music scores. To this end, we have created a new set of images containing noise and 3D distortions. This paper describes the distortion methods, the evaluation metrics, the participants' methods and the obtained results. | ||||
Address | Bethlehem; PA; USA; August 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | GREC | ||
Notes | DAG; 600.045; 600.061 | Approved | no | ||
Call Number | Admin @ si @ KFV2013 | Serial | 2337 | ||
Permanent link to this record | |||||
Author | Dorota Kaminska; Kadir Aktas; Davit Rizhinashvili; Danila Kuklyanov; Abdallah Hussein Sham; Sergio Escalera; Kamal Nasrollahi; Thomas B. Moeslund; Gholamreza Anbarjafari | ||||
Title | Two-stage Recognition and Beyond for Compound Facial Emotion Recognition | Type | Journal Article | ||
Year | 2021 | Publication | Electronics | Abbreviated Journal | ELEC |
Volume | 10 | Issue | 22 | Pages | 2847 |
Keywords |
compound emotion recognition; facial expression recognition; dominant and complementary emotion recognition; deep learning | ||||
Abstract | Facial emotion recognition is an inherently complex problem due to individual diversity in facial features and racial and cultural differences. Moreover, facial expressions typically reflect a mixture of emotional states, which can be expressed as compound emotions. Compound facial emotion recognition makes the problem even more difficult because the discrimination between dominant and complementary emotions is usually weak. To address compound emotion recognition, we have created a database of 31,250 facial images with different emotions from 115 subjects whose gender distribution is almost uniform. In addition, we have organized a competition based on the proposed dataset, held at the FG 2020 workshop. This paper analyzes the winner's approach, a two-stage recognition method (first stage, coarse recognition; second stage, fine recognition), which enhances the classification of symmetrical emotion labels. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ KAR2021 | Serial | 3642 | ||
Permanent link to this record | |||||
Author | Arjan Gijsenij; Theo Gevers; Joost Van de Weijer | ||||
Title | Computational Color Constancy: Survey and Experiments | Type | Journal Article | ||
Year | 2011 | Publication | IEEE Transactions on Image Processing | Abbreviated Journal | TIP |
Volume | 20 | Issue | 9 | Pages | 2475-2489 |
Keywords |
computational color constancy; computer vision application; gamut-based method; learning-based method; static method; colour vision; computer vision; image colour analysis; learning (artificial intelligence); lighting | ||||
Abstract | Computational color constancy is a fundamental prerequisite for many computer vision applications. This paper presents a survey of recent developments and state-of-the-art methods. Several criteria are proposed to assess the approaches. A taxonomy of existing algorithms is proposed, and methods are separated into three groups: static methods, gamut-based methods and learning-based methods. Further, the experimental setup is discussed, including an overview of publicly available data sets. Finally, various freely available methods, some of which are considered state-of-the-art, are evaluated on two data sets. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1057-7149 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | ISE;CIC | Approved | no | ||
Call Number | Admin @ si @ GGW2011 | Serial | 1717 | ||
Permanent link to this record | |||||
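As a concrete instance of the "static methods" group in the taxonomy above, the classic Grey-World algorithm estimates the illuminant as the average image colour. The sketch below is our own minimal implementation of that idea, not the authors' evaluation code.

```python
import numpy as np

def grey_world(image):
    """Grey-World colour constancy: assume the average scene reflectance is
    achromatic, so the per-channel mean of the image estimates the
    illuminant. Correct the image by dividing out the unit-norm estimate."""
    image = image.astype(np.float64)
    illuminant = image.reshape(-1, 3).mean(axis=0)    # per-channel mean
    illuminant /= np.linalg.norm(illuminant)          # unit-norm illuminant estimate
    corrected = image / (illuminant * np.sqrt(3))     # white illuminant leaves image unchanged
    return corrected, illuminant
```

Gamut-based and learning-based methods replace this fixed assumption with, respectively, constraints derived from observed colour gamuts and statistics learned from training data.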
Author | Mariano Vazquez; Ruth Aris; Guillaume Hozeaux; R. Aubry; P. Villar; Jaume Garcia; Debora Gil; Francesc Carreras | ||||
Title | A massively parallel computational electrophysiology model of the heart | Type | Journal Article | ||
Year | 2011 | Publication | International Journal for Numerical Methods in Biomedical Engineering | Abbreviated Journal | IJNMBE |
Volume | 27 | Issue | Pages | 1911-1929 | |
Keywords |
computational electrophysiology; parallelization; finite element methods | ||||
Abstract | This paper presents a patient-sensitive simulation strategy capable of using high-performance computational resources in the most efficient way. The proposed strategy directly involves three different players: Computational Mechanics Scientists (CMS), Image Processing Scientists and Cardiologists, each one mastering its own expertise area within the project. This paper describes the general integrative scheme but, focusing on the CMS side, presents a massively parallel implementation of computational electrophysiology applied to cardiac tissue simulation. The paper covers different angles of the computational problem: equations, numerical issues, the algorithm and the parallel implementation. The proposed methodology is illustrated with numerical simulations testing all the different possibilities, ranging from small domains up to very large ones. A key feature is the almost ideal scalability, not only for large and complex problems but also for medium-size meshes. The explicit formulation is particularly well suited for solving these highly transient problems with very short time scales. | ||||
Address | Swansea (UK) | ||||
Corporate Author | John Wiley & Sons, Ltd. | Thesis | |||
Publisher | John Wiley & Sons, Ltd. | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | IAM | Approved | no | ||
Call Number | IAM @ iam @ VAH2011 | Serial | 1198 | ||
Permanent link to this record | |||||
Author | Angel Sappa (ed) | ||||
Title | ICT Applications for Smart Cities | Type | Book Whole | ||
Year | 2022 | Publication | ICT Applications for Smart Cities | Abbreviated Journal | |
Volume | 224 | Issue | Pages | ||
Keywords |
Computational Intelligence; Intelligent Systems; Smart Cities; ICT Applications; Machine Learning; Pattern Recognition; Computer Vision; Image Processing | ||||
Abstract | Part of the book series: Intelligent Systems Reference Library (ISRL)
This book is the result of four years of work in the framework of the Ibero-American Research Network TICs4CI, funded by the CYTED program. In the coming decades, 85% of the world's population is expected to live in cities; hence, urban centers should be prepared to provide smart solutions for problems ranging from video surveillance and intelligent mobility to solid waste recycling, to mention just a few. More specifically, the book describes underlying technologies and practical implementations of several successful case studies of ICTs developed in the following smart city areas:
• Urban environment monitoring
• Intelligent mobility
• Waste recycling processes
• Video surveillance
• Computer-aided diagnosis in healthcare systems
• Computer vision-based approaches for efficiency in production processes
The book is intended for researchers and engineers in the field of ICTs for smart cities, as well as anyone who wants to know about state-of-the-art approaches and challenges in this field. | ||||
Address | September 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Springer | Place of Publication | Editor | Angel Sappa | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | ISRL | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-3-031-06306-0 | Medium | ||
Area | Expedition | Conference | |||
Notes | MSIAU; MACO | Approved | no | ||
Call Number | Admin @ si @ Sap2022 | Serial | 3812 | ||
Permanent link to this record | |||||
Author | Frederic Sampedro; Sergio Escalera; Anna Domenech; Ignasi Carrio | ||||
Title | A computational framework for cancer response assessment based on oncological PET-CT scans | Type | Journal Article | ||
Year | 2014 | Publication | Computers in Biology and Medicine | Abbreviated Journal | CBM |
Volume | 55 | Issue | Pages | 92–99 | |
Keywords |
Computer aided diagnosis; Nuclear medicine; Machine learning; Image processing; Quantitative analysis | ||||
Abstract | In this work we present a comprehensive computational framework to help in the clinical assessment of cancer response from a pair of time-consecutive oncological PET-CT scans. In this scenario, we describe the design and implementation of a supervised machine learning system that predicts and quantifies cancer progression or response by introducing a novel feature set modelling the underlying clinical context. Performance results on 100 clinical cases (corresponding to 200 whole-body PET-CT scans), comparing expert-based visual analysis with the classifier's decisions, show up to 70% accuracy within a completely automatic pipeline and 90% accuracy when providing the system with expert-guided PET tumor segmentation masks. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA;MILAB | Approved | no | ||
Call Number | Admin @ si @ SED2014 | Serial | 2606 | ||
Permanent link to this record | |||||
Author | Victor Ponce | ||||
Title | Evolutionary Bags of Space-Time Features for Human Analysis | Type | Book Whole | ||
Year | 2016 | Publication | PhD Thesis Universitat de Barcelona, UOC and CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords |
Computer algorithms; Digital image processing; Digital video; Analysis of variance; Dynamic programming; Evolutionary computation; Gesture | ||||
Abstract | Representation (or feature) learning has been an emerging concept in recent years, since it collects a set of techniques present in virtually any theoretical or practical methodology in artificial intelligence. In computer vision, a very common representation takes the form of the well-known Bag of Visual Words (BoVW). This representation appears implicitly in most approaches where images are described, and is present in a huge number of areas and domains: image content retrieval, pedestrian detection, human-computer interaction, surveillance, e-health and social computing, among others. The early stages of this dissertation provide an approach for learning visual representations inside evolutionary algorithms, which consists of evolving weighting schemes to improve BoVW representations for the task of recognizing categories of videos and images. We thus demonstrate the applicability of the most common weighting schemes, which are often used in text mining but are less frequently found in computer vision tasks. Beyond learning these visual representations, we provide an approach based on fusion strategies for learning spatiotemporal representations from multimodal data obtained by depth sensors. In addition, we especially aim at evolutionary and dynamic modelling, where the temporal factor is present in the nature of the data, such as video sequences of gestures and actions. Indeed, we explore the effects of probabilistic modelling for approaches based on dynamic programming, so as to handle the temporal deformation and variance among video sequences of different categories. Finally, we integrate dynamic programming and generative models into an evolutionary computation framework, with the aim of learning Bags of SubGestures (BoSG) representations and hence improving the generalization capability of standard gesture recognition approaches.
The results obtained in the experimentation demonstrate, first, that evolutionary algorithms are useful for improving the representation of BoVW approaches on several datasets for recognizing categories in still images and video sequences. On the other hand, our experimentation reveals that both the use of dynamic programming and generative models to align video sequences, and the representations obtained from applying fusion strategies to multimodal data, enhance performance when recognizing some gesture categories. Furthermore, the combination of evolutionary algorithms with models based on dynamic programming and generative approaches results, when aiming at the classification of video categories on large video datasets, in a considerable improvement over standard gesture and action recognition approaches. Finally, we demonstrate the applications of these representations in several domains of human analysis: classification of images where humans may be present, action and gesture recognition for general applications, and in particular for conversational settings within the field of restorative justice. | ||||
Address | June 2016 | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Sergio Escalera; Xavier Baro; Hugo Jair Escalante |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA | Approved | no | ||
Call Number | Pon2016 | Serial | 2814 | ||
Permanent link to this record | |||||
Author | Adriana Romero; Nicolas Ballas; Samira Ebrahimi Kahou; Antoine Chassang; Carlo Gatta; Yoshua Bengio | ||||
Title | FitNets: Hints for Thin Deep Nets | Type | Conference Article | ||
Year | 2015 | Publication | 3rd International Conference on Learning Representations ICLR2015 | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords |
Computer Science ; Learning; Computer Science ;Neural and Evolutionary Computing | ||||
Abstract | While depth tends to improve network performance, it also makes gradient-based training more difficult, since deeper networks tend to be more non-linear. The recently proposed knowledge distillation approach is aimed at obtaining small and fast-to-execute models, and it has shown that a student network can imitate the soft output of a larger teacher network or ensemble of networks. In this paper, we extend this idea to allow the training of a student that is deeper and thinner than the teacher, using not only the outputs but also the intermediate representations learned by the teacher as hints to improve the training process and final performance of the student. Because the student's intermediate hidden layer will generally be smaller than the teacher's intermediate hidden layer, additional parameters are introduced to map the student hidden layer to the prediction of the teacher hidden layer. This allows one to train deeper students that can generalize better or run faster, a trade-off controlled by the chosen student capacity. For example, on CIFAR-10, a deep student network with almost 10.4 times fewer parameters outperforms a larger, state-of-the-art teacher network. | ||||
Address | San Diego; CA; May 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICLR | ||
Notes | MILAB | Approved | no | ||
Call Number | Admin @ si @ RBK2015 | Serial | 2593 | ||
Permanent link to this record | |||||
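The hint-based training described in the FitNets abstract reduces to a small numerical sketch: a regressor maps the thinner student's hidden activations to the teacher's, and an L2 "hint" loss is minimised. The dimensions, names and plain gradient step below are our own illustrative choices, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: the student's hint layer (64 units) is thinner than the
# teacher's guided layer (128 units), so a regressor W maps one to the other.
student_hidden = rng.standard_normal((32, 64))    # batch of student activations
teacher_hidden = rng.standard_normal((32, 128))   # matching teacher activations
W = rng.standard_normal((64, 128)) * 0.01         # regressor parameters

def hint_loss(h_s, h_t, W):
    """L2 distance between the teacher's hint and the regressed student hint."""
    return 0.5 * np.mean((h_s @ W - h_t) ** 2)

# One gradient step on the regressor reduces the hint loss; in FitNets the
# student's own weights are trained through this loss as well.
grad = student_hidden.T @ (student_hidden @ W - teacher_hidden) / teacher_hidden.size
W_updated = W - 0.1 * grad
```

In the full method this hint stage pre-trains the lower half of the student, which is then fine-tuned with the usual distillation loss on the teacher's soft outputs.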
Author | Onur Ferhat | ||||
Title | Eye-Tracking with Webcam-Based Setups: Implementation of a Real-Time System and an Analysis of Factors Affecting Performance | Type | Report | ||
Year | 2012 | Publication | CVC Technical Report | Abbreviated Journal | |
Volume | 172 | Issue | Pages | ||
Keywords |
Computer vision, eye-tracking, gaussian process, feature selection, optical flow | ||||
Abstract | In recent years commercial eye-tracking hardware has become more common, with the introduction of new models from several brands that offer better performance and easier setup procedures. A cause, and at the same time a result, of this phenomenon is the popularity of eye-tracking research directed at marketing, accessibility and usability, among others.
One problem with this hardware is scalability, because both the price and the expertise necessary to operate it make large-scale deployment practically impossible. In this work, we analyze the feasibility of a software eye-tracking system based on a single ordinary webcam. Our aim is to discover the limits of such a system and to see whether it provides acceptable performance. The significance of this setup is that it is the most common one found in consumer environments: off-the-shelf electronic devices such as laptops, mobile phones and tablet computers. As no special equipment such as infrared lights, mirrors or zoom lenses is used, setting up and calibrating the system is easier than in approaches using these components. Our work is based on the open-source application Opengazer, which provides a good starting point for our contributions. We propose several improvements in order to push the system's performance further and make it feasible as a robust, real-time device. We then carry out an elaborate experiment involving 18 human subjects and 4 different system setups. Finally, we give an analysis of the results and discuss the effects of setup changes, subject differences and modifications to the software. | ||||
Address | Bellaterra | ||||
Corporate Author | Computer Vision Center | Thesis | Master's thesis | ||
Publisher | Place of Publication | Editor | Fernando Vilariño | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MV | Approved | no | ||
Call Number | Admin @ si @ Fer2012; IAM @ iam @ Fer2012 | Serial | 2165 | ||
Permanent link to this record | |||||
Author | Zhengying Liu; Zhen Xu; Sergio Escalera; Isabelle Guyon; Julio C. S. Jacques Junior; Meysam Madadi; Adrien Pavao; Sebastien Treguer; Wei-Wei Tu | ||||
Title | Towards automated computer vision: analysis of the AutoCV challenges 2019 | Type | Journal Article | ||
Year | 2020 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 135 | Issue | Pages | 196-203 | |
Keywords |
Computer vision; AutoML; Deep learning | ||||
Abstract | We present the results of recent challenges in Automated Computer Vision (AutoCV, renamed here for clarity AutoCV1 and AutoCV2, 2019), which are part of a series of challenges on Automated Deep Learning (AutoDL). These two competitions aim at searching for fully automated solutions to classification tasks in computer vision, with an emphasis on any-time performance. The first competition was limited to image classification, while the second included both images and videos. Our design required participants to submit their code on a challenge platform for blind testing on five datasets, both for training and testing, without any human intervention whatsoever. Winning solutions adopted deep learning techniques based on already published architectures, such as AutoAugment, MobileNet and ResNet, to reach state-of-the-art performance within the time budget of the challenge (only 20 minutes of GPU time). The novel contributions include strategies to deliver good preliminary results at any time during the learning process, such that a method can be stopped early and still deliver good performance. This feature is key for the adoption of such techniques by data analysts who wish to obtain preliminary results rapidly on large datasets and to speed up the development process. The soundness of our design was verified in several respects: (1) little overfitting to the on-line leaderboard, which provided feedback on 5 development datasets, was observed compared to the final blind testing on the 5 (separate) final test datasets, suggesting that winning solutions might generalize to other computer vision classification tasks; (2) error bars on the winners' performance allow us to say with confidence that they performed significantly better than the baseline solutions we provided; (3) the ranking of participants according to the any-time metric we designed, namely the Area under the Learning Curve, was different from that of the fixed-time metric, i.e. the AUC at the end of the fixed time budget. We released all winning solutions under open-source licenses. At the end of the AutoDL challenge series, all challenge data will be made publicly available, thus providing a collection of uniformly formatted datasets, which can serve to conduct further research, particularly on meta-learning. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ LXE2020 | Serial | 3427 | ||
Permanent link to this record | |||||
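The any-time metric named in the AutoCV abstract, the Area under the Learning Curve, can be sketched as the integral of a step function of the score over the time budget. The simplified version below ignores the challenge's exact time transformation; the function and argument names are our own.

```python
import numpy as np

def area_under_learning_curve(timestamps, scores, budget):
    """Simplified ALC: each submitted score is held from its timestamp until
    the next one (a step function), the curve is integrated up to `budget`,
    and the integral is normalised by the budget so the result stays in the
    score's range."""
    t = np.clip(np.asarray(timestamps, dtype=float), 0.0, budget)
    s = np.asarray(scores, dtype=float)
    widths = np.diff(np.append(t, budget))   # how long each score is held
    return float(np.sum(widths * s) / budget)
```

A method that reaches a good score early thus outranks one reaching the same final score late, which is exactly the any-time behaviour the challenge rewarded.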
Author | Alex Gomez-Villa; Bartlomiej Twardowski; Lu Yu; Andrew Bagdanov; Joost Van de Weijer | ||||
Title | Continually Learning Self-Supervised Representations With Projected Functional Regularization | Type | Conference Article | ||
Year | 2022 | Publication | CVPR 2022 Workshop on Continual Learning (CLVision, 3rd Edition) | Abbreviated Journal | |
Volume | Issue | Pages | 3866-3876 | ||
Keywords |
Computer vision; Conferences; Self-supervised learning; Image representation; Pattern recognition | ||||
Abstract | Recent self-supervised learning methods are able to learn high-quality image representations and are closing the gap with supervised approaches. However, these methods are unable to acquire new knowledge incrementally: they are, in fact, mostly used only as a pre-training phase over IID data. In this work we investigate self-supervised methods in continual learning regimes without any replay mechanism. We show that naive functional regularization, also known as feature distillation, leads to lower plasticity and limits continual learning performance. Instead, we propose Projected Functional Regularization, in which a separate temporal projection network ensures that the newly learned feature space preserves information from the previous one, while at the same time allowing for the learning of new features. This prevents forgetting while maintaining the plasticity of the learner. Comparison with other incremental learning approaches applied to self-supervision demonstrates that our method obtains competitive performance in different scenarios and on multiple datasets. | ||||
Address | New Orleans, USA; 20 June 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | ||
Notes | LAMP: 600.147; 600.120 | Approved | no | ||
Call Number | Admin @ si @ GTY2022 | Serial | 3704 | ||
Permanent link to this record | |||||
Author | Meysam Madadi; Sergio Escalera; Xavier Baro; Jordi Gonzalez | ||||
Title | End-to-end Global to Local CNN Learning for Hand Pose Recovery in Depth data | Type | Journal Article | ||
Year | 2022 | Publication | IET Computer Vision | Abbreviated Journal | IETCV |
Volume | 16 | Issue | 1 | Pages | 50-66 |
Keywords |
Computer vision; data acquisition; human computer interaction; learning (artificial intelligence); pose estimation | ||||
Abstract | Despite recent advances in 3D pose estimation of human hands, especially thanks to the advent of CNNs and depth cameras, this task is still far from being solved. This is mainly due to the highly non-linear dynamics of the fingers, which make hand model training a challenging task. In this paper, we exploit a novel hierarchical tree-like structured CNN, in which branches are trained to become specialized in predefined subsets of hand joints, called local poses. We further fuse local pose features, extracted from hierarchical CNN branches, to learn higher-order dependencies among joints in the final pose by end-to-end training. The loss function is also defined to incorporate appearance and physical constraints on feasible hand motion and deformation. Finally, we introduce a non-rigid data augmentation approach to increase the amount of training depth data. Experimental results suggest that feeding a tree-shaped CNN specialized in local poses into a fusion network that models joint correlations and dependencies helps to increase the precision of the final estimates, outperforming state-of-the-art results on the NYU and SyntheticHand datasets. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA; ISE; 600.098; 600.119 | Approved | no | ||
Call Number | Admin @ si @ MEB2022 | Serial | 3652 | ||
Permanent link to this record | |||||
Author | Victoria Ruiz; Angel Sanchez; Jose F. Velez; Bogdan Raducanu | ||||
Title | Automatic Image-Based Waste Classification | Type | Conference Article | ||
Year | 2019 | Publication | International Work-Conference on the Interplay Between Natural and Artificial Computation. From Bioinspired Systems and Biomedical Applications to Machine Learning | Abbreviated Journal | |
Volume | 11487 | Issue | Pages | 422–431 | |
Keywords |
Computer Vision; Deep learning; Convolutional neural networks; Waste classification | ||||
Abstract | The management of solid waste in large urban environments has become a complex problem due to the increasing amount of waste generated every day by citizens and companies. Current computer vision and deep learning techniques can help in the automatic detection and classification of waste types for further recycling tasks. In this work, we use the TrashNet dataset to train and compare different deep learning architectures for the automatic classification of garbage types. In particular, several Convolutional Neural Network (CNN) architectures were compared: VGG, Inception and ResNet. The best classification results were obtained using a combined Inception-ResNet model that achieved 88.6% accuracy. These are the best reported results on this dataset. | ||||
Address | Almeria; June 2019 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | IWINAC | ||
Notes | LAMP; 600.120 | Approved | no | ||
Call Number | RSV2019 | Serial | 3273 | ||
Permanent link to this record |