Records | |||||
---|---|---|---|---|---|
Author | M. Altillawi; S. Li; S.M. Prakhya; Z. Liu; Joan Serrat | ||||
Title | Implicit Learning of Scene Geometry From Poses for Global Localization | Type | Journal Article | ||
Year | 2024 | Publication | IEEE Robotics and Automation Letters | Abbreviated Journal | RA-L |
Volume | 9 | Issue | 2 | Pages | 955-962 |
Keywords | Localization; Localization and mapping; Deep learning for visual perception; Visual learning | ||||
Abstract | Global visual localization estimates the absolute pose of a camera using a single image, in a previously mapped area. Obtaining the pose from a single image enables many robotics and augmented/virtual reality applications. Inspired by the latest advances in deep learning, many existing approaches directly learn and regress 6 DoF pose from an input image. However, these methods do not fully utilize the underlying scene geometry for pose regression. The challenge in monocular relocalization is the minimal availability of supervised training data, which is just the corresponding 6 DoF poses of the images. In this letter, we propose to utilize these minimal available labels (i.e., poses) to learn the underlying 3D geometry of the scene and use the geometry to estimate the 6 DoF camera pose. We present a learning method that uses these pose labels and rigid alignment to learn two 3D geometric representations (X, Y, Z coordinates) of the scene, one in the camera coordinate frame and the other in the global coordinate frame. Given a single image, it estimates these two 3D scene representations, which are then aligned to estimate a pose that matches the pose label. This formulation allows for the active inclusion of additional learning constraints to minimize 3D alignment errors between the two 3D scene representations, and 2D re-projection errors between the 3D global scene representation and 2D image pixels, resulting in improved localization accuracy. During inference, our model estimates the 3D scene geometry in camera and global frames and aligns them rigidly to obtain the pose in real time. We evaluate our work on three common visual localization datasets, conduct ablation studies, and show that our method exceeds state-of-the-art regression methods' pose accuracy on all datasets. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 2377-3766 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS | Approved | no | ||
Call Number | Admin @ si @ | Serial | 3857 | ||
Permanent link to this record | |||||
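The rigid alignment step described in the abstract above (aligning the camera-frame and global-frame 3D scene coordinates to recover a 6 DoF pose) is, at its core, an orthogonal Procrustes problem. Below is a minimal sketch using the Kabsch algorithm, assuming noise-free point correspondences; the paper's learned, end-to-end setting is more involved:

```python
import numpy as np

def rigid_align(P, Q):
    """Estimate rotation R and translation t such that R @ P + t ~= Q.

    P, Q: (3, N) arrays of corresponding 3D points (e.g., camera-frame
    and global-frame scene coordinates, respectively).
    """
    # Center both point sets on their centroids.
    p_mean = P.mean(axis=1, keepdims=True)
    q_mean = Q.mean(axis=1, keepdims=True)
    Pc, Qc = P - p_mean, Q - q_mean

    # SVD of the cross-covariance gives the optimal rotation (Kabsch).
    U, _, Vt = np.linalg.svd(Qc @ Pc.T)
    # Guard against reflections: force det(R) = +1.
    d = np.sign(np.linalg.det(U @ Vt))
    R = U @ np.diag([1.0, 1.0, d]) @ Vt
    t = q_mean - R @ p_mean
    return R, t
```

Given N corresponding 3D points in the two frames, the SVD of the cross-covariance matrix yields the rotation in closed form, and the translation follows from the centroids.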
Author | Fernando Barrera; Felipe Lumbreras; Angel Sappa | ||||
Title | Multimodal Stereo Vision System: 3D Data Extraction and Algorithm Evaluation | Type | Journal Article | ||
Year | 2012 | Publication | IEEE Journal of Selected Topics in Signal Processing | Abbreviated Journal | J-STSP |
Volume | 6 | Issue | 5 | Pages | 437-446 |
Keywords | |||||
Abstract | This paper proposes an imaging system for computing sparse depth maps from multispectral images. A special stereo head consisting of an infrared and a color camera defines the proposed multimodal acquisition system. The cameras are rigidly attached so that their image planes are parallel. Details about the calibration and image rectification procedure are provided. Sparse disparity maps are obtained by the combined use of mutual information enriched with gradient information. The proposed approach is evaluated using a Receiver Operating Characteristics curve. Furthermore, a multispectral dataset, color and infrared images, together with their corresponding ground truth disparity maps, is generated and used as a test bed. Experimental results in real outdoor scenarios are provided showing its viability and that the proposed approach is not restricted to a specific domain. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1932-4553 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS | Approved | no | ||
Call Number | Admin @ si @ BLS2012b | Serial | 2155 | ||
Permanent link to this record | |||||
Author | Karim Lekadir; Alfiia Galimzianova; Angels Betriu; Maria del Mar Vila; Laura Igual; Daniel L. Rubin; Elvira Fernandez-Giraldez; Petia Radeva; Sandy Napel | ||||
Title | A Convolutional Neural Network for Automatic Characterization of Plaque Composition in Carotid Ultrasound | Type | Journal Article | ||
Year | 2017 | Publication | IEEE Journal of Biomedical and Health Informatics | Abbreviated Journal | J-BHI |
Volume | 21 | Issue | 1 | Pages | 48-55 |
Keywords | |||||
Abstract | Characterization of carotid plaque composition, more specifically the amount of lipid core, fibrous tissue, and calcified tissue, is an important task for the identification of plaques that are prone to rupture, and thus for early risk estimation of cardiovascular and cerebrovascular events. Due to its low costs and wide availability, carotid ultrasound has the potential to become the modality of choice for plaque characterization in clinical practice. However, its significant image noise, coupled with the small size of the plaques and their complex appearance, makes it difficult for automated techniques to discriminate between the different plaque constituents. In this paper, we propose to address this challenging problem by exploiting the unique capabilities of the emerging deep learning framework. More specifically, and unlike existing works which require a priori definition of specific imaging features or thresholding values, we propose to build a convolutional neural network (CNN) that will automatically extract from the images the information that is optimal for the identification of the different plaque constituents. We used approximately 90 000 patches extracted from a database of images and corresponding expert plaque characterizations to train and to validate the proposed CNN. The results of cross-validation experiments show a correlation of about 0.90 with the clinical assessment for the estimation of lipid core, fibrous cap, and calcified tissue areas, indicating the potential of deep learning for the challenging task of automatic characterization of plaque composition in carotid ultrasound. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; no menciona | Approved | no | ||
Call Number | Admin @ si @ LGB2017 | Serial | 2931 | ||
Permanent link to this record | |||||
Author | Santiago Segui; Michal Drozdzal; Ekaterina Zaytseva; Fernando Azpiroz; Petia Radeva; Jordi Vitria | ||||
Title | Detection of wrinkle frames in endoluminal videos using betweenness centrality measures for images | Type | Journal Article | ||
Year | 2014 | Publication | IEEE Transactions on Information Technology in Biomedicine | Abbreviated Journal | TITB |
Volume | 18 | Issue | 6 | Pages | 1831-1838 |
Keywords | Wireless Capsule Endoscopy; Small Bowel Motility Dysfunction; Contraction Detection; Structured Prediction; Betweenness Centrality | ||||
Abstract | Intestinal contractions are one of the most important events to diagnose motility pathologies of the small intestine. When visualized by wireless capsule endoscopy (WCE), the sequence of frames that represents a contraction is characterized by a clear wrinkle structure in the central frames that corresponds to the folding of the intestinal wall. In this paper we present a new method to robustly detect wrinkle frames in full WCE videos by using a new mid-level image descriptor that is based on a centrality measure proposed for graphs. We present an extended validation, carried out in a very large database, that shows that the proposed method achieves state of the art performance for this task. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | OR; MILAB; 600.046;MV | Approved | no | ||
Call Number | Admin @ si @ SDZ2014 | Serial | 2385 | ||
Permanent link to this record | |||||
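The "betweenness centrality measure for images" named in the record above can be illustrated with a toy sketch: treat the pixels of a small grayscale patch as nodes of a weighted grid graph and summarize the patch by a histogram of betweenness values. The graph construction and histogram parameters below are hypothetical illustrations of the idea, not the paper's actual mid-level descriptor:

```python
import networkx as nx
import numpy as np

def centrality_descriptor(patch, n_bins=8):
    """Toy mid-level descriptor: histogram of betweenness-centrality
    values over a 4-connected pixel graph of a grayscale patch."""
    h, w = patch.shape
    G = nx.Graph()
    for y in range(h):
        for x in range(w):
            for dy, dx in ((0, 1), (1, 0)):
                ny_, nx_ = y + dy, x + dx
                if ny_ < h and nx_ < w:
                    # Similar intensities -> short edge -> more shortest
                    # paths routed through homogeneous (fold-like) regions.
                    dist = 1.0 + abs(float(patch[y, x]) - float(patch[ny_, nx_]))
                    G.add_edge((y, x), (ny_, nx_), weight=dist)
    bc = nx.betweenness_centrality(G, weight="weight")  # values in [0, 1]
    hist, _ = np.histogram(list(bc.values()), bins=n_bins, range=(0.0, 1.0))
    return hist / max(hist.sum(), 1)
```

A frame classifier (for wrinkle vs. non-wrinkle frames) would then be trained on such descriptors; the intuition is that elongated wrinkle folds concentrate shortest paths and thus produce characteristic centrality distributions.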
Author | Akhil Gurram; Onay Urfalioglu; Ibrahim Halfaoui; Fahd Bouzaraa; Antonio Lopez | ||||
Title | Monocular Depth Estimation by Learning from Heterogeneous Datasets | Type | Conference Article | ||
Year | 2018 | Publication | IEEE Intelligent Vehicles Symposium | Abbreviated Journal | |
Volume | Issue | Pages | 2176 - 2181 | ||
Keywords | |||||
Abstract | Depth estimation provides essential information to perform autonomous driving and driver assistance. In particular, Monocular Depth Estimation is interesting from a practical point of view, since using a single camera is cheaper than many other options and avoids the need for continuous calibration strategies as required by stereo-vision approaches. State-of-the-art methods for Monocular Depth Estimation are based on Convolutional Neural Networks (CNNs). A promising line of work consists of introducing additional semantic information about the traffic scene when training CNNs for depth estimation. In practice, this means that the depth data used for CNN training is complemented with images having pixel-wise semantic labels, which usually are difficult to annotate (e.g., crowded urban images). Moreover, so far it is common practice to assume that the same raw training data is associated with both types of ground truth, i.e., depth and semantic labels. The main contribution of this paper is to show that this hard constraint can be circumvented, i.e., that we can train CNNs for depth estimation by leveraging the depth and semantic information coming from heterogeneous datasets. In order to illustrate the benefits of our approach, we combine the KITTI depth and Cityscapes semantic segmentation datasets, outperforming state-of-the-art results on Monocular Depth Estimation. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | IV | ||
Notes | ADAS; 600.124; 600.116; 600.118 | Approved | no | ||
Call Number | Admin @ si @ GUH2018 | Serial | 3183 | ||
Permanent link to this record | |||||
Author | Jiaolong Xu; David Vazquez; Antonio Lopez; Javier Marin; Daniel Ponsa | ||||
Title | Learning a Multiview Part-based Model in Virtual World for Pedestrian Detection | Type | Conference Article | ||
Year | 2013 | Publication | IEEE Intelligent Vehicles Symposium | Abbreviated Journal | |
Volume | Issue | Pages | 467 - 472 | ||
Keywords | Pedestrian Detection; Virtual World; Part based | ||||
Abstract | State-of-the-art deformable part-based models based on latent SVM have shown excellent results on human detection. In this paper, we propose to train a multiview deformable part-based model with automatically generated part examples from virtual-world data. The method is efficient as: (i) the part detectors are trained with precisely extracted virtual examples, thus no latent learning is needed, (ii) the multiview pedestrian detector enhances the performance of the pedestrian root model, (iii) a top-down approach is used for part detection, which reduces the search space. We evaluate our model on the Daimler and Karlsruhe Pedestrian Benchmarks with the publicly available Caltech pedestrian detection evaluation framework, and the result outperforms the state-of-the-art latent SVM V4.0 on both average miss rate and speed (our detector is ten times faster). | ||||
Address | Gold Coast; Australia; June 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | IEEE | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1931-0587 | ISBN | 978-1-4673-2754-1 | Medium | |
Area | Expedition | Conference | IV | ||
Notes | ADAS; 600.054; 600.057 | Approved | no | ||
Call Number | XVL2013; ADAS @ adas @ xvl2013a | Serial | 2214 | ||
Permanent link to this record | |||||
Author | Naveen Onkarappa; Angel Sappa | ||||
Title | An Empirical Study on Optical Flow Accuracy Depending on Vehicle Speed | Type | Conference Article | ||
Year | 2012 | Publication | IEEE Intelligent Vehicles Symposium | Abbreviated Journal | |
Volume | Issue | Pages | 1138-1143 | ||
Keywords | |||||
Abstract | Driver assistance and safety systems are receiving increasing attention as steps toward automatic navigation and safety. Optical flow, as a motion estimation technique, plays a major role in making these systems a reality. In the current paper, the suitability of a polar representation for optical flow estimation in such systems is demonstrated. Furthermore, the influence of individual regularization terms on the accuracy of optical flow for image sequences of different speeds is empirically evaluated. Additionally, a new synthetic dataset of image sequences at different speeds is generated, along with the ground-truth optical flow. | ||||
Address | Alcalá de Henares | ||||
Corporate Author | Thesis | ||||
Publisher | IEEE Xplore | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1931-0587 | ISBN | 978-1-4673-2119-8 | Medium | |
Area | Expedition | Conference | IV | ||
Notes | ADAS | Approved | no | ||
Call Number | Admin @ si @ NaS2012 | Serial | 2020 | ||
Permanent link to this record | |||||
Author | Miguel Oliveira; Angel Sappa; V. Santos | ||||
Title | Color Correction for Onboard Multi-camera Systems using 3D Gaussian Mixture Models | Type | Conference Article | ||
Year | 2012 | Publication | IEEE Intelligent Vehicles Symposium | Abbreviated Journal | |
Volume | Issue | Pages | 299-303 | ||
Keywords | |||||
Abstract | The current paper proposes a novel color correction approach for onboard multi-camera systems. It works by segmenting the given images into several regions. A probabilistic segmentation framework, using 3D Gaussian Mixture Models, is proposed. Regions are used to compute local color correction functions, which are then combined to obtain the final corrected image. An image dataset of road scenarios is used to establish a performance comparison of the proposed method with seven other well-known color correction algorithms. Results show that the proposed approach is the highest-scoring color correction method. Also, the proposed single-step 3D color space probabilistic segmentation reduces processing time over similar approaches. | ||||
Address | Alcalá de Henares | ||||
Corporate Author | Thesis | ||||
Publisher | IEEE Xplore | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1931-0587 | ISBN | 978-1-4673-2119-8 | Medium | |
Area | Expedition | Conference | IV | ||
Notes | ADAS | Approved | no | ||
Call Number | Admin @ si @ OSS2012b | Serial | 2021 | ||
Permanent link to this record | |||||
Author | Diego Cheda; Daniel Ponsa; Antonio Lopez | ||||
Title | Pedestrian Candidates Generation using Monocular Cues | Type | Conference Article | ||
Year | 2012 | Publication | IEEE Intelligent Vehicles Symposium | Abbreviated Journal | |
Volume | Issue | Pages | 7-12 | ||
Keywords | pedestrian detection | ||||
Abstract | Common techniques for pedestrian candidate generation (e.g., sliding window approaches) are based on an exhaustive search over the image. This implies that the number of windows produced is huge, which translates into significant time consumption in the classification stage. In this paper, we propose a method that significantly reduces the number of windows to be considered by a classifier. Our method is a monocular one that exploits the geometric and depth information available in single images. Both representations of the world are fused together to generate pedestrian candidates based on an underlying model which is focused only on objects standing vertically on the ground plane and having a certain height, according to their depth in the scene. We evaluate our algorithm on a challenging dataset and demonstrate its application for pedestrian detection, where a considerable reduction in the number of candidate windows is achieved. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | IEEE Xplore | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1931-0587 | ISBN | 978-1-4673-2119-8 | Medium | |
Area | Expedition | Conference | IV | ||
Notes | ADAS | Approved | no | ||
Call Number | Admin @ si @ CPL2012c; ADAS @ adas @ cpl2012d | Serial | 2013 | ||
Permanent link to this record | |||||
Author | Diego Alejandro Cheda; Daniel Ponsa; Antonio Lopez | ||||
Title | Camera Egomotion Estimation in the ADAS Context | Type | Conference Article | ||
Year | 2010 | Publication | 13th International IEEE Annual Conference on Intelligent Transportation Systems | Abbreviated Journal | |
Volume | Issue | Pages | 1415–1420 | ||
Keywords | |||||
Abstract | Camera-based Advanced Driver Assistance Systems (ADAS) have concentrated many research efforts in the last decades. Proposals based on monocular cameras require knowledge of the camera pose with respect to the environment in order to reach efficient and robust performance. A common assumption in such systems is to consider the road as planar and the camera pose with respect to it as approximately known. However, in real situations, the camera pose varies along time due to the vehicle movement, the road slope, and irregularities on the road surface. Thus, the changes in camera position and orientation (i.e., the egomotion) are critical information that must be estimated at every frame to avoid poor performance. This work focuses on egomotion estimation from a monocular camera in the ADAS context. We review and compare egomotion methods with simulated and real ADAS-like sequences. Based on the results of our experiments, we show which of the considered nonlinear and linear algorithms have the best performance in this domain. | ||||
Address | Madeira Island (Portugal) | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 2153-0009 | ISBN | 978-1-4244-7657-2 | Medium | |
Area | Expedition | Conference | ITSC | ||
Notes | ADAS | Approved | no | ||
Call Number | ADAS @ adas @ CPL2010 | Serial | 1425 | ||
Permanent link to this record | |||||
Author | Ferran Diego; Daniel Ponsa; Joan Serrat; Antonio Lopez | ||||
Title | Vehicle geolocalization based on video synchronization | Type | Conference Article | ||
Year | 2010 | Publication | 13th Annual International Conference on Intelligent Transportation Systems | Abbreviated Journal | |
Volume | Issue | Pages | 1511–1516 | ||
Keywords | video alignment | ||||
Abstract | This paper proposes a novel method for estimating the geospatial localization of a vehicle. It uses as input a georeferenced video sequence recorded by a forward-facing camera attached to the windscreen. The core of the proposed method is an on-line video synchronization which finds the frame in the georeferenced video sequence corresponding to the one recorded at each time by the camera on a second drive through the same track. Once the corresponding frame in the georeferenced video sequence is found, its geospatial information is transferred. The key advantages of this method are: 1) the increase in update rate and geospatial accuracy with regard to a standard low-cost GPS, and 2) the ability to localize a vehicle even when a GPS is not available or not reliable enough, as in certain urban areas. Experimental results for an urban environment are presented, showing an average relative accuracy of 1.5 meters. | ||||
Address | Madeira Island (Portugal) | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 2153-0009 | ISBN | 978-1-4244-7657-2 | Medium | |
Area | Expedition | Conference | ITSC | ||
Notes | ADAS | Approved | no | ||
Call Number | ADAS @ adas @ DPS2010 | Serial | 1423 | ||
Permanent link to this record | |||||
Author | Ferran Diego; Jose Manuel Alvarez; Joan Serrat; Antonio Lopez | ||||
Title | Vision-based road detection via on-line video registration | Type | Conference Article | ||
Year | 2010 | Publication | 13th Annual International Conference on Intelligent Transportation Systems | Abbreviated Journal | |
Volume | Issue | Pages | 1135–1140 | ||
Keywords | video alignment; road detection | ||||
Abstract | Road segmentation is an essential functionality for supporting advanced driver assistance systems (ADAS) such as road following and vehicle and pedestrian detection. Significant efforts have been made to solve this task using vision-based techniques. The major challenge is to deal with lighting variations and the presence of objects on the road surface. In this paper, we propose a new road detection method to infer the areas of the image depicting road surfaces without performing any image segmentation. The idea is to first segment, manually or semi-automatically, the road region in a traffic-free reference video recorded on a first drive, and then to transfer these regions, in an on-line manner, to the frames of a second video sequence acquired later on a second drive through the same road. This is possible because we are able to automatically align the two videos in time and space, that is, to synchronize them and warp each frame of the first video to its corresponding frame in the second one. The geometric transform can thus transfer the road region to the present frame on-line. In order to handle the varying lighting conditions present in outdoor scenarios, our approach incorporates a shadowless feature space which represents an image in an illuminant-invariant feature space. Furthermore, we propose a dynamic background subtraction algorithm which removes the regions containing vehicles in the observed frames that lie within the transferred road region. | ||||
Address | Madeira Island (Portugal) | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 2153-0009 | ISBN | 978-1-4244-7657-2 | Medium | |
Area | Expedition | Conference | ITSC | ||
Notes | ADAS | Approved | no | ||
Call Number | ADAS @ adas @ DAS2010 | Serial | 1424 | ||
Permanent link to this record | |||||
Author | Sergio Vera; Miguel Angel Gonzalez Ballester; Debora Gil | ||||
Title | A medial map capturing the essential geometry of organs | Type | Conference Article | ||
Year | 2012 | Publication | ISBI Workshop on Open Source Medical Image Analysis software | Abbreviated Journal | |
Volume | Issue | Pages | 1691 - 1694 | ||
Keywords | Medial Surface Representation; Volume Reconstruction; Geometry; Image reconstruction; Liver; Manifolds; Shape; Surface morphology; Surface reconstruction | ||||
Abstract | Medial representations are powerful tools for describing and parameterizing the volumetric shape of anatomical structures. Accurate computation of one-pixel-wide medial surfaces is mandatory, and those surfaces must faithfully represent the geometry of the volume. Although morphological methods produce excellent results in 2D, their complexity and quality drop across dimensions, due to a more complex description of pixel neighborhoods. This paper introduces a continuous operator for accurate and efficient computation of medial structures of arbitrary dimension. Our experiments show its higher performance for medical imaging applications in terms of simplicity of medial structures and capability for reconstructing the anatomical volume. | ||||
Address | Barcelona, Spain | ||||
Corporate Author | Thesis | ||||
Publisher | IEEE | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1945-7928 | ISBN | 978-1-4577-1857-1 | Medium | |
Area | Expedition | Conference | ISBI | ||
Notes | IAM | Approved | no | ||
Call Number | IAM @ iam @ VGG2012a | Serial | 1989 | ||
Permanent link to this record | |||||
Author | Sergio Escalera; Oriol Pujol; Eric Laciar; Jordi Vitria; Esther Pueyo; Petia Radeva | ||||
Title | Coronary Damage Classification of Patients with the Chagas Disease with Error-Correcting Output Codes | Type | Conference Article | ||
Year | 2008 | Publication | 4th International IEEE Conference on Intelligent Systems, 6–8 September 2008 | Abbreviated Journal |
Volume | 2 | Issue | Pages | 12–17 | |
Keywords | |||||
Abstract | The Chagas disease is endemic in all of Latin America, affecting millions of people in the continent. In order to diagnose and treat the Chagas disease, it is important to detect and measure the coronary damage of the patient. In this paper, we analyze and categorize patients into different groups based on the coronary damage produced by the disease. Based on the features of the heart cycle extracted using high-resolution ECG, a multi-class scheme of error-correcting output codes (ECOC) is formulated and successfully applied. The results show that the proposed scheme obtains significant performance improvements compared to previous works and state-of-the-art ECOC designs. | ||||
Address | Varna (Bulgaria) | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | IS’08 | ||
Notes | MILAB; OR;HuPBA;MV | Approved | no | ||
Call Number | BCNPCL @ bcnpcl @ EPL2008 | Serial | 1042 | ||
Permanent link to this record | |||||
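The ECOC scheme named in the abstract above reduces a multi-class problem to several binary ones via a coding matrix, then decodes a test sample by nearest code word. Below is a minimal sketch with a hypothetical one-vs-all coding matrix and plain Hamming decoding; the paper employs richer problem-dependent code designs and decoding strategies:

```python
import numpy as np

# Hypothetical 4-class one-vs-all ECOC coding matrix.
# Rows: classes; columns: binary dichotomizers (+1 / -1 targets).
M = np.array([
    [+1, -1, -1, -1],
    [-1, +1, -1, -1],
    [-1, -1, +1, -1],
    [-1, -1, -1, +1],
])

def ecoc_decode(outputs, coding):
    """Assign the class whose code word is closest, in Hamming
    distance, to the signs of the binary-classifier outputs."""
    outputs = np.sign(np.asarray(outputs, dtype=float))
    # Hamming distance between the output word and each class code word.
    dists = np.sum(outputs[None, :] != coding, axis=1)
    return int(np.argmin(dists))
```

At test time, each trained dichotomizer emits a (possibly real-valued) score; taking signs and comparing against the rows of `M` yields the predicted class, and the redundancy in the code words lets the ensemble correct some individual classifier errors.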
Author | Kamal Nasrollahi; Sergio Escalera; P. Rasti; Gholamreza Anbarjafari; Xavier Baro; Hugo Jair Escalante; Thomas B. Moeslund | ||||
Title | Deep Learning based Super-Resolution for Improved Action Recognition | Type | Conference Article | ||
Year | 2015 | Publication | 5th International Conference on Image Processing Theory, Tools and Applications (IPTA 2015) | Abbreviated Journal |
Volume | Issue | Pages | 67 - 72 | ||
Keywords | |||||
Abstract | Action recognition systems mostly work with videos of proper quality and resolution. Even the most challenging benchmark databases for action recognition hardly include low-resolution videos from, e.g., surveillance cameras. In videos recorded by such cameras, due to the distance between people and cameras, people appear very small and hence challenge action recognition algorithms. Simple upsampling methods, like bicubic interpolation, cannot retrieve all the detailed information that can help the recognition. To deal with this problem, in this paper we combine the results of bicubic interpolation with the results of a state-of-the-art deep learning-based super-resolution algorithm, through an alpha-blending approach. The experimental results obtained on a down-sampled version of a large subset of the Hollywood2 benchmark database show the importance of the proposed system in increasing the recognition rate of a state-of-the-art action recognition system for handling low-resolution videos. | ||||
Address | Orleans; France; November 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | IPTA | ||
Notes | HuPBA;MV | Approved | no | ||
Call Number | Admin @ si @ NER2015 | Serial | 2648 | ||
Permanent link to this record |