Home | [71–80] << 81 82 83 84 85 86 87 88 89 90 >> [91–100] |
Records | |||||
---|---|---|---|---|---|
Author | Alejandro Cartas; Juan Marin; Petia Radeva; Mariella Dimiccoli | ||||
Title | Batch-based activity recognition from egocentric photo-streams revisited | Type | Journal Article | ||
Year | 2018 | Publication | Pattern Analysis and Applications | Abbreviated Journal | PAA |
Volume | 21 | Issue | 4 | Pages | 953–965 |
Keywords | Egocentric vision; Lifelogging; Activity recognition; Deep learning; Recurrent neural networks | ||||
Abstract | Wearable cameras can gather large amounts of image data that provide rich visual information about the daily activities of the wearer. Motivated by the large number of health applications that could be enabled by the automatic recognition of daily activities, such as lifestyle characterization for habit improvement, context-aware personal assistance and tele-rehabilitation services, we propose a system to classify 21 daily activities from photo-streams acquired by a wearable photo-camera. Our approach combines the advantages of a late fusion ensemble strategy relying on convolutional neural networks at image level with the ability of recurrent neural networks to account for the temporal evolution of high-level features in photo-streams without relying on event boundaries. The proposed batch-based approach achieved an overall accuracy of 89.85%, outperforming state-of-the-art end-to-end methodologies. These results were achieved on a dataset consists of 44,902 egocentric pictures from three persons captured during 26 days in average. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; no proj | Approved | no | ||
Call Number | Admin @ si @ CMR2018 | Serial | 3186 | ||
Permanent link to this record | |||||
Author | Mariella Dimiccoli; Cathal Gurrin; David J. Crandall; Xavier Giro; Petia Radeva | ||||
Title | Introduction to the special issue: Egocentric Vision and Lifelogging | Type | Journal Article | ||
Year | 2018 | Publication | Journal of Visual Communication and Image Representation | Abbreviated Journal | JVCIR |
Volume | 55 | Issue | Pages | 352-353 | |
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; no proj | Approved | no | ||
Call Number | Admin @ si @ DGC2018 | Serial | 3187 | ||
Permanent link to this record | |||||
Author | Sumit K. Banchhor; Narendra D. Londhe; Tadashi Araki; Luca Saba; Petia Radeva; Narendra N. Khanna; Jasjit S. Suri | ||||
Title | Calcium detection, its quantification, and grayscale morphology-based risk stratification using machine learning in multimodality big data coronary and carotid scans: A review. | Type | Journal Article | ||
Year | 2018 | Publication | Computers in Biology and Medicine | Abbreviated Journal | CBM |
Volume | 101 | Issue | Pages | 184-198 | |
Keywords | Heart disease; Stroke; Atherosclerosis; Intravascular; Coronary; Carotid; Calcium; Morphology; Risk stratification | ||||
Abstract | Purpose of review
Atherosclerosis is the leading cause of cardiovascular disease (CVD) and stroke. Typically, atherosclerotic calcium is found during the mature stage of the atherosclerosis disease. It is therefore often a challenge to identify and quantify the calcium. This is due to the presence of multiple components of plaque buildup in the arterial walls. The American College of Cardiology/American Heart Association guidelines point to the importance of calcium in the coronary and carotid arteries and further recommend its quantification for the prevention of heart disease. It is therefore essential to stratify the CVD risk of the patient into low- and high-risk bins. Recent finding Calcium formation in the artery walls is multifocal in nature with sizes at the micrometer level. Thus, its detection requires high-resolution imaging. Clinical experience has shown that even though optical coherence tomography offers better resolution, intravascular ultrasound still remains an important imaging modality for coronary wall imaging. For a computer-based analysis system to be complete, it must be scientifically and clinically validated. This study presents a state-of-the-art review (condensation of 152 publications after examining 200 articles) covering the methods for calcium detection and its quantification for coronary and carotid arteries, the pros and cons of these methods, and the risk stratification strategies. The review also presents different kinds of statistical models and gold standard solutions for the evaluation of software systems useful for calcium detection and quantification. Finally, the review concludes with a possible vision for designing the next-generation system for better clinical outcomes. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; no proj | Approved | no | ||
Call Number | Admin @ si @ BLA2018 | Serial | 3188 | ||
Permanent link to this record | |||||
Author | L. Rothacker; Marçal Rusiñol; Josep Llados; G.A. Fink | ||||
Title | A Two-stage Approach to Segmentation-Free Query-by-example Word Spotting | Type | Journal | ||
Year | 2014 | Publication | Manuscript Cultures | Abbreviated Journal | |
Volume | 7 | Issue | Pages | 47-58 | |
Keywords | |||||
Abstract | With the ongoing progress in digitization, huge document collections and archives have become available to a broad audience. Scanned document images can be transmitted electronically and studied simultaneously throughout the world. While this is very beneficial, it is often impossible to perform automated searches on these document collections. Optical character recognition usually fails when it comes to handwritten or historic documents. In order to address the need for exploring document collections rapidly, researchers are working on word spotting. In query-by-example word spotting scenarios, the user selects an exemplary occurrence of the query word in a document image. The word spotting system then retrieves all regions in the collection that are visually similar to the given example of the query word. The best matching regions are presented to the user and no actual transcription is required.
An important property of a word spotting system is the computational speed with which queries can be executed. In our previous work, we presented a relatively slow but high-precision method. In the present work, we will extend this baseline system to an integrated two-stage approach. In a coarse-grained first stage, we will filter document images efficiently in order to identify regions that are likely to contain the query word. In the fine-grained second stage, these regions will be analyzed with our previously presented high-precision method. Finally, we will report recognition results and query times for the well-known George Washington benchmark in our evaluation. We achieve state-of-the-art recognition results while the query times can be reduced to 50% in comparison with our baseline. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.061; 600.077 | Approved | no | ||
Call Number | Admin @ si @ | Serial | 3190 | ||
Permanent link to this record | |||||
Author | Cristhian A. Aguilera-Carrasco; C. Aguilera; Angel Sappa | ||||
Title | Melamine Faced Panels Defect Classification beyond the Visible Spectrum | Type | Journal Article | ||
Year | 2018 | Publication | Sensors | Abbreviated Journal | SENS |
Volume | 18 | Issue | 11 | Pages | 1-10 |
Keywords | industrial application; infrared; machine learning | ||||
Abstract | In this work, we explore the use of images from different spectral bands to classify defects in melamine faced panels, which could appear through the production process. Through experimental evaluation, we evaluate the use of images from the visible (VS), near-infrared (NIR), and long wavelength infrared (LWIR), to classify the defects using a feature descriptor learning approach together with a support vector machine classifier. Two descriptors were evaluated, Extended Local Binary Patterns (E-LBP) and SURF using a Bag of Words (BoW) representation. The evaluation was carried on with an image set obtained during this work, which contained five different defect categories that currently occurs in the industry. Results show that using images from beyond the visual spectrum helps to improve classification performance in contrast with a single visible spectrum solution. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MSIAU; 600.122 | Approved | no | ||
Call Number | Admin @ si @ AAS2018 | Serial | 3191 | ||
Permanent link to this record | |||||
Author | Razieh Rastgoo; Kourosh Kiani; Sergio Escalera | ||||
Title | Multi-Modal Deep Hand Sign Language Recognition in Still Images Using Restricted Boltzmann Machine | Type | Journal Article | ||
Year | 2018 | Publication | Entropy | Abbreviated Journal | ENTROPY |
Volume | 20 | Issue | 11 | Pages | 809 |
Keywords | hand sign language; deep learning; restricted Boltzmann machine (RBM); multi-modal; profoundly deaf; noisy image | ||||
Abstract | In this paper, a deep learning approach, Restricted Boltzmann Machine (RBM), is used to perform automatic hand sign language recognition from visual data. We evaluate how RBM, as a deep generative model, is capable of generating the distribution of the input data for an enhanced recognition of unseen data. Two modalities, RGB and Depth, are considered in the model input in three forms: original image, cropped image, and noisy cropped image. Five crops of the input image are used and the hand of these cropped images are detected using Convolutional Neural Network (CNN). After that, three types of the detected hand images are generated for each modality and input to RBMs. The outputs of the RBMs for two modalities are fused in another RBM in order to recognize the output sign label of the input image. The proposed multi-modal model is trained on all and part of the American alphabet and digits of four publicly available datasets. We also evaluate the robustness of the proposal against noise. Experimental results show that the proposed multi-modal model, using crops and the RBM fusing methodology, achieves state-of-the-art results on Massey University Gesture Dataset 2012, American Sign Language (ASL). and Fingerspelling Dataset from the University of Surrey’s Center for Vision, Speech and Signal Processing, NYU, and ASL Fingerspelling A datasets. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ RKE2018 | Serial | 3198 | ||
Permanent link to this record | |||||
Author | Md. Mostafa Kamal Sarker; Hatem A. Rashwan; Farhan Akram; Vivek Kumar Singh; Syeda Furruka Banu; Forhad U H Chowdhury; Kabir Ahmed Choudhury; Sylvie Chambon; Petia Radeva; Domenec Puig; Mohamed Abdel-Nasser | ||||
Title | SLSNet: Skin lesion segmentation using a lightweight generative adversarial network | Type | Journal Article | ||
Year | 2021 | Publication | Expert Systems With Applications | Abbreviated Journal | ESWA |
Volume | 183 | Issue | Pages | 115433 | |
Keywords | |||||
Abstract | The determination of precise skin lesion boundaries in dermoscopic images using automated methods faces many challenges, most importantly, the presence of hair, inconspicuous lesion edges and low contrast in dermoscopic images, and variability in the color, texture and shapes of skin lesions. Existing deep learning-based skin lesion segmentation algorithms are expensive in terms of computational time and memory. Consequently, running such segmentation algorithms requires a powerful GPU and high bandwidth memory, which are not available in dermoscopy devices. Thus, this article aims to achieve precise skin lesion segmentation with minimum resources: a lightweight, efficient generative adversarial network (GAN) model called SLSNet, which combines 1-D kernel factorized networks, position and channel attention, and multiscale aggregation mechanisms with a GAN model. The 1-D kernel factorized network reduces the computational cost of 2D filtering. The position and channel attention modules enhance the discriminative ability between the lesion and non-lesion feature representations in spatial and channel dimensions, respectively. A multiscale block is also used to aggregate the coarse-to-fine features of input skin images and reduce the effect of the artifacts. SLSNet is evaluated on two publicly available datasets: ISBI 2017 and the ISIC 2018. Although SLSNet has only 2.35 million parameters, the experimental results demonstrate that it achieves segmentation results on a par with the state-of-the-art skin lesion segmentation methods with an accuracy of 97.61%, and Dice and Jaccard similarity coefficients of 90.63% and 81.98%, respectively. SLSNet can run at more than 110 frames per second (FPS) in a single GTX1080Ti GPU, which is faster than well-known deep learning-based image segmentation models, such as FCN. Therefore, SLSNet can be used for practical dermoscopic applications. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; no proj | Approved | no | ||
Call Number | Admin @ si @ SRA2021 | Serial | 3633 | ||
Permanent link to this record | |||||
Author | Sergio Escalera; Markus Weimer; Mikhail Burtsev; Valentin Malykh; Varvara Logacheva; Ryan Lowe; Iulian Vlad Serban; Yoshua Bengio; Alexander Rudnicky; Alan W. Black; Shrimai Prabhumoye; Łukasz Kidzinski; Mohanty Sharada; Carmichael Ong; Jennifer Hicks; Sergey Levine; Marcel Salathe; Scott Delp; Iker Huerga; Alexander Grigorenko; Leifur Thorbergsson; Anasuya Das; Kyla Nemitz; Jenna Sandker; Stephen King; Alexander S. Ecker; Leon A. Gatys; Matthias Bethge; Jordan Boyd Graber; Shi Feng; Pedro Rodriguez; Mohit Iyyer; He He; Hal Daume III; Sean McGregor; Amir Banifatemi; Alexey Kurakin; Ian Goodfellow; Samy Bengio | ||||
Title | Introduction to NIPS 2017 Competition Track | Type | Book Chapter | ||
Year | 2018 | Publication | The NIPS ’17 Competition: Building Intelligent Systems | Abbreviated Journal | |
Volume | Issue | Pages | 1-23 | ||
Keywords | |||||
Abstract | Competitions have become a popular tool in the data science community to solve hard problems, assess the state of the art and spur new research directions. Companies like Kaggle and open source platforms like Codalab connect people with data and a data science problem to those with the skills and means to solve it. Hence, the question arises: What, if anything, could NIPS add to this rich ecosystem?
In 2017, we embarked to find out. We attracted 23 potential competitions, of which we selected five to be NIPS 2017 competitions. Our final selection features competitions advancing the state of the art in other sciences such as “Classifying Clinically Actionable Genetic Mutations” and “Learning to Run”. Others, like “The Conversational Intelligence Challenge” and “Adversarial Attacks and Defences” generated new data sets that we expect to impact the progress in their respective communities for years to come. And “Human-Computer Question Answering Competition” showed us just how far we as a field have come in ability and efficiency since the break-through performance of Watson in Jeopardy. Two additional competitions, DeepArt and AI XPRIZE Milestions, were also associated to the NIPS 2017 competition track, whose results are also presented within this chapter. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer | Place of Publication | Editor | Sergio Escalera; Markus Weimer | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-3-319-94042-7 | Medium | ||
Area | Expedition | Conference | |||
Notes | HUPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ EWB2018 | Serial | 3200 | ||
Permanent link to this record | |||||
Author | Meysam Madadi; Sergio Escalera; Alex Carruesco Llorens; Carlos Andujar; Xavier Baro; Jordi Gonzalez | ||||
Title | Top-down model fitting for hand pose recovery in sequences of depth images | Type | Journal Article | ||
Year | 2018 | Publication | Image and Vision Computing | Abbreviated Journal | IMAVIS |
Volume | 79 | Issue | Pages | 63-75 | |
Keywords | |||||
Abstract | State-of-the-art approaches on hand pose estimation from depth images have reported promising results under quite controlled considerations. In this paper we propose a two-step pipeline for recovering the hand pose from a sequence of depth images. The pipeline has been designed to deal with images taken from any viewpoint and exhibiting a high degree of finger occlusion. In a first step we initialize the hand pose using a part-based model, fitting a set of hand components in the depth images. In a second step we consider temporal data and estimate the parameters of a trained bilinear model consisting of shape and trajectory bases. We evaluate our approach on a new created synthetic hand dataset along with NYU and MSRA real datasets. Results demonstrate that the proposed method outperforms the most recent pose recovering approaches, including those based on CNNs. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA; 600.098 | Approved | no | ||
Call Number | Admin @ si @ MEC2018 | Serial | 3203 | ||
Permanent link to this record | |||||
Author | Giuseppe Pezzano; Oliver Diaz; Vicent Ribas Ripoll; Petia Radeva | ||||
Title | CoLe-CNN+: Context learning – Convolutional neural network for COVID-19-Ground-Glass-Opacities detection and segmentation | Type | Journal Article | ||
Year | 2021 | Publication | Computers in Biology and Medicine | Abbreviated Journal | CBM |
Volume | 136 | Issue | Pages | 104689 | |
Keywords | |||||
Abstract | The most common tool for population-wide COVID-19 identification is the Reverse Transcription-Polymerase Chain Reaction test that detects the presence of the virus in the throat (or sputum) in swab samples. This test has a sensitivity between 59% and 71%. However, this test does not provide precise information regarding the extension of the pulmonary infection. Moreover, it has been proven that through the reading of a computed tomography (CT) scan, a clinician can provide a more complete perspective of the severity of the disease. Therefore, we propose a comprehensive system for fully-automated COVID-19 detection and lesion segmentation from CT scans, powered by deep learning strategies to support decision-making process for the diagnosis of COVID-19. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; no menciona | Approved | no | ||
Call Number | Admin @ si @ PDR2021 | Serial | 3635 | ||
Permanent link to this record | |||||
Author | Yagmur Gucluturk; Umut Guclu; Xavier Baro; Hugo Jair Escalante; Isabelle Guyon; Sergio Escalera; Marcel A. J. van Gerven; Rob van Lier | ||||
Title | Multimodal First Impression Analysis with Deep Residual Networks | Type | Journal Article | ||
Year | 2018 | Publication | IEEE Transactions on Affective Computing | Abbreviated Journal | TAC |
Volume | 8 | Issue | 3 | Pages | 316-329 |
Keywords | |||||
Abstract | People form first impressions about the personalities of unfamiliar individuals even after very brief interactions with them. In this study we present and evaluate several models that mimic this automatic social behavior. Specifically, we present several models trained on a large dataset of short YouTube video blog posts for predicting apparent Big Five personality traits of people and whether they seem suitable to be recommended to a job interview. Along with presenting our audiovisual approach and results that won the third place in the ChaLearn First Impressions Challenge, we investigate modeling in different modalities including audio only, visual only, language only, audiovisual, and combination of audiovisual and language. Our results demonstrate that the best performance could be obtained using a fusion of all data modalities. Finally, in order to promote explainability in machine learning and to provide an example for the upcoming ChaLearn challenges, we present a simple approach for explaining the predictions for job interview recommendations | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ GGB2018 | Serial | 3210 | ||
Permanent link to this record | |||||
Author | Rain Eric Haamer; Eka Rusadze; Iiris Lusi; Tauseef Ahmed; Sergio Escalera; Gholamreza Anbarjafari | ||||
Title | Review on Emotion Recognition Databases | Type | Book Chapter | ||
Year | 2018 | Publication | Human-Robot Interaction: Theory and Application | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | emotion; computer vision; databases | ||||
Abstract | Over the past few decades human-computer interaction has become more important in our daily lives and research has developed in many directions: memory research, depression detection, and behavioural deficiency detection, lie detection, (hidden) emotion recognition etc. Because of that, the number of generic emotion and face databases or those tailored to specific needs have grown immensely large. Thus, a comprehensive yet compact guide is needed to help researchers find the most suitable database and understand what types of databases already exist. In this paper, different elicitation methods are discussed and the databases are primarily organized into neat and informative tables based on the format. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-1-78923-316-2 | Medium | ||
Area | Expedition | Conference | |||
Notes | HUPBA; 602.133 | Approved | no | ||
Call Number | Admin @ si @ HRL2018 | Serial | 3212 | ||
Permanent link to this record | |||||
Author | Ester Fornells; Manuel De Armas; Maria Teresa Anguera; Sergio Escalera; Marcos Antonio Catalán; Josep Moya | ||||
Title | Desarrollo del proyecto del Consell Comarcal del Baix Llobregat “Buen Trato a las personas mayores y aquellas en situación de fragilidad con sufrimiento emocional: Hacia un envejecimiento saludable” | Type | Journal | ||
Year | 2018 | Publication | Informaciones Psiquiatricas | Abbreviated Journal | |
Volume | 232 | Issue | Pages | 47-59 | |
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0210-7279 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | HUPBA; no menciona | Approved | no | ||
Call Number | Admin @ si @ FAA2018 | Serial | 3214 | ||
Permanent link to this record | |||||
Author | Gholamreza Anbarjafari; Sergio Escalera | ||||
Title | Human-Robot Interaction: Theory and Application | Type | Book Whole | ||
Year | 2018 | Publication | Human-Robot Interaction: Theory and Application | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-1-78923-316-2 | Medium | ||
Area | Expedition | Conference | |||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ AnE2018 | Serial | 3216 | ||
Permanent link to this record | |||||
Author | Mikhail Mozerov; Joost Van de Weijer | ||||
Title | One-view occlusion detection for stereo matching with a fully connected CRF model | Type | Journal Article | ||
Year | 2019 | Publication | IEEE Transactions on Image Processing | Abbreviated Journal | TIP |
Volume | 28 | Issue | 6 | Pages | 2936-2947 |
Keywords | Stereo matching; energy minimization; fully connected MRF model; geodesic distance filter | ||||
Abstract | In this paper, we extend the standard belief propagation (BP) sequential technique proposed in the tree-reweighted sequential method [15] to the fully connected CRF models with the geodesic distance affinity. The proposed method has been applied to the stereo matching problem. Also a new approach to the BP marginal solution is proposed that we call one-view occlusion detection (OVOD). In contrast to the standard winner takes all (WTA) estimation, the proposed OVOD solution allows to find occluded regions in the disparity map and simultaneously improve the matching result. As a result we can perform only
one energy minimization process and avoid the cost calculation for the second view and the left-right check procedure. We show that the OVOD approach considerably improves results for cost augmentation and energy minimization techniques in comparison with the standard one-view affinity space implementation. We apply our method to the Middlebury data set and reach state-ofthe-art especially for median, average and mean squared error metrics. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP; 600.098; 600.109; 602.133; 600.120 | Approved | no | ||
Call Number | Admin @ si @ MoW2019 | Serial | 3221 | ||
Permanent link to this record |