Home | [1–10] << 11 12 13 14 15 16 17 18 19 20 >> [21–30] |
Records | |||||
---|---|---|---|---|---|
Author | Alejandro Cartas; Jordi Luque; Petia Radeva; Carlos Segura; Mariella Dimiccoli | ||||
Title | How Much Does Audio Matter to Recognize Egocentric Object Interactions? | Type | Miscellaneous | ||
Year | 2019 | Publication | Arxiv | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | CoRR abs/1906.00634
Sounds are an important source of information on our daily interactions with objects. For instance, a significant amount of people can discern the temperature of water that it is being poured just by using the sense of hearing. However, only a few works have explored the use of audio for the classification of object interactions in conjunction with vision or as single modality. In this preliminary work, we propose an audio model for egocentric action recognition and explore its usefulness on the parts of the problem (noun, verb, and action classification). Our model achieves a competitive result in terms of verb classification (34.26% accuracy) on a standard benchmark with respect to vision-based state of the art systems, using a comparatively lighter architecture. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; no menciona | Approved | no | ||
Call Number | Admin @ si @ CLR2019 | Serial | 3383 | ||
Permanent link to this record | |||||
Author | Alejandro Cartas; Jordi Luque; Petia Radeva; Carlos Segura; Mariella Dimiccoli | ||||
Title | Seeing and Hearing Egocentric Actions: How Much Can We Learn? | Type | Conference Article | ||
Year | 2019 | Publication | IEEE International Conference on Computer Vision Workshops | Abbreviated Journal | |
Volume | Issue | Pages | 4470-4480 | ||
Keywords | |||||
Abstract | Our interaction with the world is an inherently multimodal experience. However, the understanding of human-to-object interactions has historically been addressed focusing on a single modality. In particular, a limited number of works have considered to integrate the visual and audio modalities for this purpose. In this work, we propose a multimodal approach for egocentric action recognition in a kitchen environment that relies on audio and visual information. Our model combines a sparse temporal sampling strategy with a late fusion of audio, spatial, and temporal streams. Experimental results on the EPIC-Kitchens dataset show that multimodal integration leads to better performance than unimodal approaches. In particular, we achieved a 5.18% improvement over the state of the art on verb classification. | ||||
Address | Seul; Korea; October 2019 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICCVW | ||
Notes | MILAB; no proj | Approved | no | ||
Call Number | Admin @ si @ CLR2019b | Serial | 3385 | ||
Permanent link to this record | |||||
Author | Alejandro Cartas; Juan Marin; Petia Radeva; Mariella Dimiccoli | ||||
Title | Batch-based activity recognition from egocentric photo-streams revisited | Type | Journal Article | ||
Year | 2018 | Publication | Pattern Analysis and Applications | Abbreviated Journal | PAA |
Volume | 21 | Issue | 4 | Pages | 953–965 |
Keywords | Egocentric vision; Lifelogging; Activity recognition; Deep learning; Recurrent neural networks | ||||
Abstract | Wearable cameras can gather large amounts of image data that provide rich visual information about the daily activities of the wearer. Motivated by the large number of health applications that could be enabled by the automatic recognition of daily activities, such as lifestyle characterization for habit improvement, context-aware personal assistance and tele-rehabilitation services, we propose a system to classify 21 daily activities from photo-streams acquired by a wearable photo-camera. Our approach combines the advantages of a late fusion ensemble strategy relying on convolutional neural networks at image level with the ability of recurrent neural networks to account for the temporal evolution of high-level features in photo-streams without relying on event boundaries. The proposed batch-based approach achieved an overall accuracy of 89.85%, outperforming state-of-the-art end-to-end methodologies. These results were achieved on a dataset consists of 44,902 egocentric pictures from three persons captured during 26 days in average. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; no proj | Approved | no | ||
Call Number | Admin @ si @ CMR2018 | Serial | 3186 | ||
Permanent link to this record | |||||
Author | Alejandro Cartas; Mariella Dimiccoli; Petia Radeva | ||||
Title | Batch-based activity recognition from egocentric photo-streams | Type | Conference Article | ||
Year | 2017 | Publication | 1st International workshop on Egocentric Perception, Interaction and Computing | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Activity recognition from long unstructured egocentric photo-streams has several applications in assistive technology such as health monitoring and frailty detection, just to name a few. However, one of its main technical challenges is to deal with the low frame rate of wearable photo-cameras, which causes abrupt appearance changes between consecutive frames. In consequence, important discriminatory low-level features from motion such as optical flow cannot be estimated. In this paper, we present a batch-driven approach for training a deep learning architecture that strongly rely on Long short-term units to tackle this problem. We propose two different implementations of the same approach that process a photo-stream sequence using batches of fixed size with the goal of capturing the temporal evolution of high-level features. The main difference between these implementations is that one explicitly models consecutive batches by overlapping them. Experimental results over a public dataset acquired by three users demonstrate the validity of the proposed architectures to exploit the temporal evolution of convolutional features over time without relying on event boundaries. | ||||
Address | Venice; Italy; October 2017; | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICCV - EPIC | ||
Notes | MILAB; no menciona | Approved | no | ||
Call Number | Admin @ si @ CDR2017 | Serial | 3023 | ||
Permanent link to this record | |||||
Author | Alejandro Cartas; Petia Radeva; Mariella Dimiccoli | ||||
Title | Activities of Daily Living Monitoring via a Wearable Camera: Toward Real-World Applications | Type | Journal Article | ||
Year | 2020 | Publication | IEEE Access | Abbreviated Journal | ACCESS |
Volume | 8 | Issue | Pages | 77344 - 77363 | |
Keywords | |||||
Abstract | Activity recognition from wearable photo-cameras is crucial for lifestyle characterization and health monitoring. However, to enable its wide-spreading use in real-world applications, a high level of generalization needs to be ensured on unseen users. Currently, state-of-the-art methods have been tested only on relatively small datasets consisting of data collected by a few users that are partially seen during training. In this paper, we built a new egocentric dataset acquired by 15 people through a wearable photo-camera and used it to test the generalization capabilities of several state-of-the-art methods for egocentric activity recognition on unseen users and daily image sequences. In addition, we propose several variants to state-of-the-art deep learning architectures, and we show that it is possible to achieve 79.87% accuracy on users unseen during training. Furthermore, to show that the proposed dataset and approach can be useful in real-world applications, where data can be acquired by different wearable cameras and labeled data are scarcely available, we employed a domain adaptation strategy on two egocentric activity recognition benchmark datasets. These experiments show that the model learned with our dataset, can easily be transferred to other domains with a very small amount of labeled data. Taken together, those results show that activity recognition from wearable photo-cameras is mature enough to be tested in real-world applications. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; no proj | Approved | no | ||
Call Number | Admin @ si @ CRD2020 | Serial | 3436 | ||
Permanent link to this record | |||||
Author | Alejandro Cartas; Petia Radeva; Mariella Dimiccoli | ||||
Title | Modeling long-term interactions to enhance action recognition | Type | Conference Article | ||
Year | 2021 | Publication | 25th International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 10351-10358 | ||
Keywords | |||||
Abstract | In this paper, we propose a new approach to under-stand actions in egocentric videos that exploits the semantics of object interactions at both frame and temporal levels. At the frame level, we use a region-based approach that takes as input a primary region roughly corresponding to the user hands and a set of secondary regions potentially corresponding to the interacting objects and calculates the action score through a CNN formulation. This information is then fed to a Hierarchical LongShort-Term Memory Network (HLSTM) that captures temporal dependencies between actions within and across shots. Ablation studies thoroughly validate the proposed approach, showing in particular that both levels of the HLSTM architecture contribute to performance improvement. Furthermore, quantitative comparisons show that the proposed approach outperforms the state-of-the-art in terms of action recognition on standard benchmarks,without relying on motion information | ||||
Address | January 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICPR | ||
Notes | MILAB; | Approved | no | ||
Call Number | Admin @ si @ CRD2021 | Serial | 3626 | ||
Permanent link to this record | |||||
Author | Alejandro Gonzalez Alzate | ||||
Title | Evaluation of spatiotemporal descriptors for pedestrian detection in video sequences | Type | Report | ||
Year | 2011 | Publication | CVC Technical Report | Abbreviated Journal | |
Volume | 166 | Issue | Pages | ||
Keywords | |||||
Abstract | |||||
Address | Bellaterra (Spain) | ||||
Corporate Author | Computer Vision Center | Thesis | Master's thesis | ||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS | Approved | no | ||
Call Number | Admin @ si @ Gon2011 | Serial | 1932 | ||
Permanent link to this record | |||||
Author | Alejandro Gonzalez Alzate | ||||
Title | Multi-modal Pedestrian Detection | Type | Book Whole | ||
Year | 2015 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Pedestrian detection continues to be an extremely challenging problem in real scenarios, in which situations like illumination changes, noisy images, unexpected objects, uncontrolled scenarios and variant appearance of objects occur constantly. All these problems force the development of more robust detectors for relevant applications like vision-based autonomous vehicles, intelligent surveillance, and pedestrian tracking for behavior analysis. Most reliable vision-based pedestrian detectors base their decision on features extracted using a single sensor capturing complementary features, e.g., appearance, and texture. These features usually are extracted from the current frame, ignoring temporal information, or including it in a post process step e.g., tracking or temporal coherence. Taking into account these issues we formulate the following question: can we generate more robust pedestrian detectors by introducing new information sources in the feature extraction step?
In order to answer this question we develop different approaches for introducing new information sources to well-known pedestrian detectors. We start by the inclusion of temporal information following the Stacked Sequential Learning (SSL) paradigm which suggests that information extracted from the neighboring samples in a sequence can improve the accuracy of a base classifier. We then focus on the inclusion of complementary information from different sensors like 3D point clouds (LIDAR – depth), far infrared images (FIR), or disparity maps (stereo pair cameras). For this end we develop a multi-modal framework in which information from different sensors is used for increasing detection accuracy (by increasing information redundancy). Finally we propose a multi-view pedestrian detector, this multi-view approach splits the detection problem in n sub-problems. Each sub-problem will detect objects in a given specific view reducing in that way the variability problem faced when a single detectors is used for the whole problem. We show that these approaches obtain competitive results with other state-of-the-art methods but instead of design new features, we reuse existing ones boosting their performance. |
||||
Address | November 2015 | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | David Vazquez;Antonio Lopez; | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-943427-7-6 | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS; 600.076 | Approved | no | ||
Call Number | Admin @ si @ Gon2015 | Serial | 2706 | ||
Permanent link to this record | |||||
Author | Alejandro Gonzalez Alzate; David Vazquez; Antonio Lopez; Jaume Amores | ||||
Title | On-Board Object Detection: Multicue, Multimodal, and Multiview Random Forest of Local Experts | Type | Journal Article | ||
Year | 2017 | Publication | IEEE Transactions on cybernetics | Abbreviated Journal | Cyber |
Volume | 47 | Issue | 11 | Pages | 3980 - 3990 |
Keywords | Multicue; multimodal; multiview; object detection | ||||
Abstract | Despite recent significant advances, object detection continues to be an extremely challenging problem in real scenarios. In order to develop a detector that successfully operates under these conditions, it becomes critical to leverage upon multiple cues, multiple imaging modalities, and a strong multiview (MV) classifier that accounts for different object views and poses. In this paper, we provide an extensive evaluation that gives insight into how each of these aspects (multicue, multimodality, and strong MV classifier) affect accuracy both individually and when integrated together. In the multimodality component, we explore the fusion of RGB and depth maps obtained by high-definition light detection and ranging, a type of modality that is starting to receive increasing attention. As our analysis reveals, although all the aforementioned aspects significantly help in improving the accuracy, the fusion of visible spectrum and depth information allows to boost the accuracy by a much larger margin. The resulting detector not only ranks among the top best performers in the challenging KITTI benchmark, but it is built upon very simple blocks that are easy to implement and computationally efficient. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 2168-2267 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS; 600.085; 600.082; 600.076; 600.118 | Approved | no | ||
Call Number | Admin @ si @ | Serial | 2810 | ||
Permanent link to this record | |||||
Author | Alejandro Gonzalez Alzate; Gabriel Villalonga; German Ros; David Vazquez; Antonio Lopez | ||||
Title | 3D-Guided Multiscale Sliding Window for Pedestrian Detection | Type | Conference Article | ||
Year | 2015 | Publication | Pattern Recognition and Image Analysis, Proceedings of 7th Iberian Conference , ibPRIA 2015 | Abbreviated Journal | |
Volume | 9117 | Issue | Pages | 560-568 | |
Keywords | Pedestrian Detection | ||||
Abstract | The most relevant modules of a pedestrian detector are the candidate generation and the candidate classification. The former aims at presenting image windows to the latter so that they are classified as containing a pedestrian or not. Much attention has being paid to the classification module, while candidate generation has mainly relied on (multiscale) sliding window pyramid. However, candidate generation is critical for achieving real-time. In this paper we assume a context of autonomous driving based on stereo vision. Accordingly, we evaluate the effect of taking into account the 3D information (derived from the stereo) in order to prune the hundred of thousands windows per image generated by classical pyramidal sliding window. For our study we use a multimodal (RGB, disparity) and multi-descriptor (HOG, LBP, HOG+LBP) holistic ensemble based on linear SVM. Evaluation on data from the challenging KITTI benchmark suite shows the effectiveness of using 3D information to dramatically reduce the number of candidate windows, even improving the overall pedestrian detection accuracy. | ||||
Address | Santiago de Compostela; España; June 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | ACDC | Expedition | Conference | IbPRIA | |
Notes | ADAS; 600.076; 600.057; 600.054 | Approved | no | ||
Call Number | ADAS @ adas @ GVR2015 | Serial | 2585 | ||
Permanent link to this record | |||||
Author | Alejandro Gonzalez Alzate; Gabriel Villalonga; Jiaolong Xu; David Vazquez; Jaume Amores; Antonio Lopez | ||||
Title | Multiview Random Forest of Local Experts Combining RGB and LIDAR data for Pedestrian Detection | Type | Conference Article | ||
Year | 2015 | Publication | IEEE Intelligent Vehicles Symposium IV2015 | Abbreviated Journal | |
Volume | Issue | Pages | 356-361 | ||
Keywords | Pedestrian Detection | ||||
Abstract | Despite recent significant advances, pedestrian detection continues to be an extremely challenging problem in real scenarios. In order to develop a detector that successfully operates under these conditions, it becomes critical to leverage upon multiple cues, multiple imaging modalities and a strong multi-view classifier that accounts for different pedestrian views and poses. In this paper we provide an extensive evaluation that gives insight into how each of these aspects (multi-cue, multimodality and strong multi-view classifier) affect performance both individually and when integrated together. In the multimodality component we explore the fusion of RGB and depth maps obtained by high-definition LIDAR, a type of modality that is only recently starting to receive attention. As our analysis reveals, although all the aforementioned aspects significantly help in improving the performance, the fusion of visible spectrum and depth information allows to boost the accuracy by a much larger margin. The resulting detector not only ranks among the top best performers in the challenging KITTI benchmark, but it is built upon very simple blocks that are easy to implement and computationally efficient. These simple blocks can be easily replaced with more sophisticated ones recently proposed, such as the use of convolutional neural networks for feature representation, to further improve the accuracy. | ||||
Address | Seoul; Corea; June 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | ACDC | Expedition | Conference | IV | |
Notes | ADAS; 600.076; 600.057; 600.054 | Approved | no | ||
Call Number | ADAS @ adas @ GVX2015 | Serial | 2625 | ||
Permanent link to this record | |||||
Author | Alejandro Gonzalez Alzate; Sebastian Ramos; David Vazquez; Antonio Lopez; Jaume Amores | ||||
Title | Spatiotemporal Stacked Sequential Learning for Pedestrian Detection | Type | Conference Article | ||
Year | 2015 | Publication | Pattern Recognition and Image Analysis, Proceedings of 7th Iberian Conference , ibPRIA 2015 | Abbreviated Journal | |
Volume | Issue | Pages | 3-12 | ||
Keywords | SSL; Pedestrian Detection | ||||
Abstract | Pedestrian classifiers decide which image windows contain a pedestrian. In practice, such classifiers provide a relatively high response at neighbor windows overlapping a pedestrian, while the responses around potential false positives are expected to be lower. An analogous reasoning applies for image sequences. If there is a pedestrian located within a frame, the same pedestrian is expected to appear close to the same location in neighbor frames. Therefore, such a location has chances of receiving high classification scores during several frames, while false positives are expected to be more spurious. In this paper we propose to exploit such correlations for improving the accuracy of base pedestrian classifiers. In particular, we propose to use two-stage classifiers which not only rely on the image descriptors required by the base classifiers but also on the response of such base classifiers in a given spatiotemporal neighborhood. More specifically, we train pedestrian classifiers using a stacked sequential learning (SSL) paradigm. We use a new pedestrian dataset we have acquired from a car to evaluate our proposal at different frame rates. We also test on a well known dataset: Caltech. The obtained results show that our SSL proposal boosts detection accuracy significantly with a minimal impact on the computational cost. Interestingly, SSL improves more the accuracy at the most dangerous situations, i.e. when a pedestrian is close to the camera. | ||||
Address | Santiago de Compostela; España; June 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | ACDC | Expedition | Conference | IbPRIA | |
Notes | ADAS; 600.057; 600.054; 600.076 | Approved | no | ||
Call Number | GRV2015; ADAS @ adas @ GRV2015 | Serial | 2454 | ||
Permanent link to this record | |||||
Author | Alejandro Gonzalez Alzate; Zhijie Fang; Yainuvis Socarras; Joan Serrat; David Vazquez; Jiaolong Xu; Antonio Lopez | ||||
Title | Pedestrian Detection at Day/Night Time with Visible and FIR Cameras: A Comparison | Type | Journal Article | ||
Year | 2016 | Publication | Sensors | Abbreviated Journal | SENS |
Volume | 16 | Issue | 6 | Pages | 820 |
Keywords | Pedestrian Detection; FIR | ||||
Abstract | Despite all the significant advances in pedestrian detection brought by computer vision for driving assistance, it is still a challenging problem. One reason is the extremely varying lighting conditions under which such a detector should operate, namely day and night time. Recent research has shown that the combination of visible and non-visible imaging modalities may increase detection accuracy, where the infrared spectrum plays a critical role. The goal of this paper is to assess the accuracy gain of different pedestrian models (holistic, part-based, patch-based) when training with images in the far infrared spectrum. Specifically, we want to compare detection accuracy on test images recorded at day and nighttime if trained (and tested) using (a) plain color images, (b) just infrared images and (c) both of them. In order to obtain results for the last item we propose an early fusion approach to combine features from both modalities. We base the evaluation on a new dataset we have built for this purpose as well as on the publicly available KAIST multispectral dataset. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1424-8220 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS; 600.085; 600.076; 600.082; 601.281 | Approved | no | ||
Call Number | ADAS @ adas @ GFS2016 | Serial | 2754 | ||
Permanent link to this record | |||||
Author | Alejandro Tabas; Emili Balaguer-Ballester; Laura Igual | ||||
Title | Spatial Discriminant ICA for RS-fMRI characterisation | Type | Conference Article | ||
Year | 2014 | Publication | 4th International Workshop on Pattern Recognition in Neuroimaging | Abbreviated Journal | |
Volume | Issue | Pages | 1-4 | ||
Keywords | |||||
Abstract | Resting-State fMRI (RS-fMRI) is a brain imaging technique useful for exploring functional connectivity. A major point of interest in RS-fMRI analysis is to isolate connectivity patterns characterising disorders such as for instance ADHD. Such characterisation is usually performed in two steps: first, all connectivity patterns in the data are extracted by means of Independent Component Analysis (ICA); second, standard statistical tests are performed over the extracted patterns to find differences between control and clinical groups. In this work we introduce a novel, single-step, approach for this problem termed Spatial Discriminant ICA. The algorithm can efficiently isolate networks of functional connectivity characterising a clinical group by combining ICA and a new variant of the Fisher’s Linear Discriminant also introduced in this work. As the characterisation is carried out in a single step, it potentially provides for a richer characterisation of inter-class differences. The algorithm is tested using synthetic and real fMRI data, showing promising results in both experiments. | ||||
Address | Tübingen; June 2014 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-1-4799-4150-6 | Medium | ||
Area | Expedition | Conference | PRNI | ||
Notes | OR;MILAB | Approved | no | ||
Call Number | Admin @ si @ TBI2014 | Serial | 2493 | ||
Permanent link to this record | |||||
Author | Aleksandr Setkov; Fabio Martinez Carillo; Michele Gouiffes; Christian Jacquemin; Maria Vanrell; Ramon Baldrich | ||||
Title | DAcImPro: A Novel Database of Acquired Image Projections and Its Application to Object Recognition | Type | Conference Article | ||
Year | 2015 | Publication | Advances in Visual Computing. Proceedings of 11th International Symposium, ISVC 2015 Part II | Abbreviated Journal | |
Volume | 9475 | Issue | Pages | 463-473 | |
Keywords | Projector-camera systems; Feature descriptors; Object recognition | ||||
Abstract | Projector-camera systems are designed to improve the projection quality by comparing original images with their captured projections, which is usually complicated due to high photometric and geometric variations. Many research works address this problem using their own test data which makes it extremely difficult to compare different proposals. This paper has two main contributions. Firstly, we introduce a new database of acquired image projections (DAcImPro) that, covering photometric and geometric conditions and providing data for ground-truth computation, can serve to evaluate different algorithms in projector-camera systems. Secondly, a new object recognition scenario from acquired projections is presented, which could be of a great interest in such domains, as home video projections and public presentations. We show that the task is more challenging than the classical recognition problem and thus requires additional pre-processing, such as color compensation or projection area selection. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer International Publishing | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | 0302-9743 | ISBN | 978-3-319-27862-9 | Medium | |
Area | Expedition | Conference | ISVC | ||
Notes | CIC | Approved | no | ||
Call Number | Admin @ si @ SMG2015 | Serial | 2736 | ||
Permanent link to this record |