Records | |||||
---|---|---|---|---|---|
Author | Alejandro Cartas; Juan Marin; Petia Radeva; Mariella Dimiccoli | ||||
Title | Batch-based activity recognition from egocentric photo-streams revisited | Type | Journal Article | ||
Year | 2018 | Publication | Pattern Analysis and Applications | Abbreviated Journal | PAA |
Volume | 21 | Issue | 4 | Pages | 953–965 |
Keywords | Egocentric vision; Lifelogging; Activity recognition; Deep learning; Recurrent neural networks | ||||
Abstract | Wearable cameras can gather large amounts of image data that provide rich visual information about the daily activities of the wearer. Motivated by the large number of health applications that could be enabled by the automatic recognition of daily activities, such as lifestyle characterization for habit improvement, context-aware personal assistance and tele-rehabilitation services, we propose a system to classify 21 daily activities from photo-streams acquired by a wearable photo-camera. Our approach combines the advantages of a late fusion ensemble strategy relying on convolutional neural networks at image level with the ability of recurrent neural networks to account for the temporal evolution of high-level features in photo-streams without relying on event boundaries. The proposed batch-based approach achieved an overall accuracy of 89.85%, outperforming state-of-the-art end-to-end methodologies. These results were achieved on a dataset consisting of 44,902 egocentric pictures captured by three persons over 26 days on average. (See the sketch after this record.) | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; no proj | Approved | no | ||
Call Number | Admin @ si @ CMR2018 | Serial | 3186 | ||
Permanent link to this record | |||||
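The temporal half of the batch-based pipeline above can be sketched as an LSTM running over per-frame CNN features. This is a minimal illustration, not the paper's exact model: the feature dimension, hidden size, and class names below are hypothetical choices, and the image-level late-fusion CNN ensemble is omitted entirely.

```python
import torch
import torch.nn as nn

class BatchActivityRNN(nn.Module):
    """LSTM over per-frame CNN features -> one activity logit vector per frame."""
    def __init__(self, feat_dim=1024, hidden=256, n_classes=21):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, feats):            # feats: (batch, time, feat_dim)
        out, _ = self.lstm(feats)        # hidden state at every frame
        return self.head(out)            # (batch, time, n_classes) logits

# Toy usage: 4 photo-stream batches of 10 frames of precomputed features.
model = BatchActivityRNN()
print(model(torch.randn(4, 10, 1024)).shape)   # torch.Size([4, 10, 21])
```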
Author | Eduardo Aguilar; Bhalaji Nagarajan; Beatriz Remeseiro; Petia Radeva | ||||
Title | Bayesian deep learning for semantic segmentation of food images | Type | Journal Article | ||
Year | 2022 | Publication | Computers and Electrical Engineering | Abbreviated Journal | CEE |
Volume | 103 | Issue | Pages | 108380 | |
Keywords | Deep learning; Uncertainty quantification; Bayesian inference; Image segmentation; Food analysis | ||||
Abstract | Deep learning has provided promising results in various applications; however, algorithms tend to be overconfident in their predictions, even though they may be entirely wrong. Particularly for critical applications, the model should provide answers only when it is very sure of them. This article presents a Bayesian version of two different state-of-the-art semantic segmentation methods to perform multi-class segmentation of foods and estimate the uncertainty about the given predictions. The proposed methods were evaluated on three public pixel-annotated food datasets. As a result, we can conclude that Bayesian methods improve the performance achieved by the baseline architectures and, in addition, provide information to improve decision-making. Furthermore, based on the extracted uncertainty map, we proposed three measures to rank the images according to the degree of noisy annotations they contained. Note that the top 135 images ranked by one of these measures include more than half of the worst-labeled food images. | ||||
Address | October 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Science Direct | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB | Approved | no | ||
Call Number | Admin @ si @ ANR2022 | Serial | 3763 | ||
Permanent link to this record | |||||
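A common route to the per-pixel uncertainty maps the record above relies on is Monte Carlo dropout: keep dropout active at test time, average several stochastic forward passes, and take the predictive entropy as uncertainty. The sketch below is that generic recipe under stated assumptions, not the exact Bayesian variants of the two architectures the paper evaluates; the toy `net` merely stands in for a real segmentation network containing dropout layers.

```python
import torch
import torch.nn as nn

def mc_dropout_segment(net, image, passes=20):
    """Average `passes` stochastic forward passes; entropy = uncertainty."""
    net.train()                          # keeps nn.Dropout layers stochastic
    with torch.no_grad():
        probs = torch.stack([torch.softmax(net(image), dim=1)
                             for _ in range(passes)])
    mean = probs.mean(0)                               # (B, C, H, W)
    entropy = -(mean * torch.log(mean + 1e-8)).sum(1)  # (B, H, W)
    return mean.argmax(1), entropy       # label map and uncertainty map

# Toy fully convolutional net with dropout, standing in for the real model.
net = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                    nn.Dropout2d(0.5), nn.Conv2d(8, 4, 1))
labels, unc = mc_dropout_segment(net, torch.randn(1, 3, 32, 32))
print(labels.shape, unc.shape)           # (1, 32, 32) each
```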
Author | Weiqing Min; Shuqiang Jiang; Jitao Sang; Huayang Wang; Xinda Liu; Luis Herranz | ||||
Title | Being a Supercook: Joint Food Attributes and Multimodal Content Modeling for Recipe Retrieval and Exploration | Type | Journal Article | ||
Year | 2017 | Publication | IEEE Transactions on Multimedia | Abbreviated Journal | TMM |
Volume | 19 | Issue | 5 | Pages | 1100–1113 |
Keywords | |||||
Abstract | This paper considers the problem of recipe-oriented image-ingredient correlation learning with multi-attributes for recipe retrieval and exploration. Existing methods mainly focus on food visual information for recognition, while we model visual information, textual content (e.g., ingredients), and attributes (e.g., cuisine and course) together to solve extended recipe-oriented problems, such as multimodal cuisine classification and attribute-enhanced food image retrieval. As a solution, we propose a multimodal multitask deep belief network (M3TDBN) to learn joint image-ingredient representation regularized by different attributes. By grouping ingredients into visible ingredients (which are visible in the food image, e.g., “chicken” and “mushroom”) and nonvisible ingredients (e.g., “salt” and “oil”), M3TDBN is capable of learning both the mid-level visual representation between images and visible ingredients and the nonvisual representation. Furthermore, in order to utilize different attributes to improve the intermodality correlation, M3TDBN incorporates multitask learning to make different attributes collaborate with each other. Based on the proposed M3TDBN, we exploit the derived deep features and the discovered correlations for three extended novel applications: 1) multimodal cuisine classification; 2) attribute-augmented cross-modal recipe image retrieval; and 3) ingredient and attribute inference from food images. The proposed approach is evaluated on the constructed Yummly dataset, and the evaluation results have validated the effectiveness of the proposed approach. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP; 600.120 | Approved | no | ||
Call Number | Admin @ si @ MJS2017 | Serial | 2964 | ||
Permanent link to this record | |||||
Author | Sanket Biswas; Pau Riba; Josep Llados; Umapada Pal | ||||
Title | Beyond Document Object Detection: Instance-Level Segmentation of Complex Layouts | Type | Journal Article | ||
Year | 2021 | Publication | International Journal on Document Analysis and Recognition | Abbreviated Journal | IJDAR |
Volume | 24 | Issue | Pages | 269–281 | |
Keywords | |||||
Abstract | Information extraction is a fundamental task of many business intelligence services that entail massive document processing. Understanding a document page structure in terms of its layout provides contextual support which is helpful in the semantic interpretation of the document terms. In this paper, inspired by the progress of deep learning methodologies applied to the task of object recognition, we transfer these models to the specific case of document object detection, reformulating the traditional problem of document layout analysis. Moreover, we make an important contribution to prior art by defining the task of instance segmentation on the document image domain. An instance segmentation paradigm is especially important in complex layouts whose contents should interact for the proper rendering of the page, i.e., the proper text wrapping around an image. Finally, we provide an extensive evaluation, both qualitative and quantitative, that demonstrates the superior performance of the proposed methodology over the current state of the art. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.121; 600.140; 110.312 | Approved | no | ||
Call Number | Admin @ si @ BRL2021b | Serial | 3574 | ||
Permanent link to this record | |||||
Author | Lu Yu; Lichao Zhang; Joost Van de Weijer; Fahad Shahbaz Khan; Yongmei Cheng; C. Alejandro Parraga | ||||
Title | Beyond Eleven Color Names for Image Understanding | Type | Journal Article | ||
Year | 2018 | Publication | Machine Vision and Applications | Abbreviated Journal | MVAP |
Volume | 29 | Issue | 2 | Pages | 361-373 |
Keywords | Color name; Discriminative descriptors; Image classification; Re-identification; Tracking | ||||
Abstract | Color description is one of the fundamental problems of image understanding. One of the popular ways to represent colors is by means of color names. Most existing work on color names focuses on only the eleven basic color terms of the English language. This could be limiting the discriminative power of these representations, and representations based on more color names are expected to perform better. However, there exists no clear strategy to choose additional color names. We collect a dataset of 28 additional color names. To ensure that the resulting color representation has high discriminative power, we propose a method to order the additional color names according to their complementary nature with the basic color names. This allows us to compute color name representations of arbitrary length with high discriminative power. In the experiments we show that these new color name descriptors outperform the existing color name descriptor on the tasks of visual tracking, person re-identification and image classification. (See the sketch after this record.) | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP; NEUROBIT; 600.068; 600.109; 600.120 | Approved | no | ||
Call Number | Admin @ si @ YYW2018 | Serial | 3087 | ||
Permanent link to this record | |||||
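A toy version of a color-name descriptor, in the spirit of the record above: each pixel votes for its nearest color-name prototype, and the image is summarized by the normalized vote histogram. The published 11- and 28-name descriptors use a learned probabilistic pixel-to-name mapping; the six RGB prototypes below are illustrative stand-ins only.

```python
import numpy as np

# Illustrative RGB centers for a handful of color names (hypothetical;
# real color-name models learn a probabilistic mapping from data).
PROTOTYPES = {
    "black": (0, 0, 0), "blue": (0, 0, 255), "green": (0, 128, 0),
    "red": (255, 0, 0), "white": (255, 255, 255), "yellow": (255, 255, 0),
}

def color_name_histogram(img):            # img: (H, W, 3) uint8
    centers = np.array(list(PROTOTYPES.values()), dtype=float)
    px = img.reshape(-1, 3).astype(float)
    d = ((px[:, None, :] - centers[None]) ** 2).sum(-1)  # pixel-to-center
    hist = np.bincount(d.argmin(1), minlength=len(centers))
    return hist / hist.sum()              # normalized color-name descriptor

img = np.random.randint(0, 256, (32, 32, 3), dtype=np.uint8)
print(color_name_histogram(img))          # 6-bin descriptor, sums to 1
```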
Author | Pau Rodriguez; Miguel Angel Bautista; Sergio Escalera; Jordi Gonzalez | ||||
Title | Beyond One-hot Encoding: Lower Dimensional Target Embedding | Type | Journal Article | ||
Year | 2018 | Publication | Image and Vision Computing | Abbreviated Journal | IMAVIS |
Volume | 75 | Issue | Pages | 21-31 | |
Keywords | Error correcting output codes; Output embeddings; Deep learning; Computer vision | ||||
Abstract | Target encoding plays a central role when learning Convolutional Neural Networks. In this realm, one-hot encoding is the most prevalent strategy due to its simplicity. However, this widespread encoding scheme assumes a flat label space, thus ignoring rich relationships existing among labels that can be exploited during training. In large-scale datasets, data does not span the full label space, but instead lies in a low-dimensional output manifold. Following this observation, we embed the targets into a low-dimensional space, drastically improving convergence speed while preserving accuracy. Our contribution is twofold: (i) we show that random projections of the label space are a valid tool to find such lower-dimensional embeddings, dramatically boosting convergence rates at zero computational cost; and (ii) we propose a normalized eigenrepresentation of the class manifold that encodes the targets with minimal information loss, improving the accuracy of random-projection encoding while enjoying the same convergence rates. Experiments on CIFAR-100, CUB200-2011, Imagenet, and MIT Places demonstrate that the proposed approach drastically improves convergence speed while reaching very competitive accuracy rates. (See the sketch after this record.) | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE; HuPBA; 600.098; 602.133; 602.121; 600.119 | Approved | no | ||
Call Number | Admin @ si @ RBE2018 | Serial | 3120 | ||
Permanent link to this record | |||||
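Contribution (i) of the record above, random projections of the label space, reduces to a few lines: embed each one-hot target as a row of a random matrix and classify by nearest embedded class. The sizes below are illustrative, and the network regression step is replaced by a noiseless round-trip for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
K, k = 100, 16                      # K classes, k-dim embedding (k << K)
E = rng.standard_normal((K, k)) / np.sqrt(k)   # random label embedding

def embed(labels):                  # labels: (N,) ints -> (N, k) targets
    return E[labels]                # the network would regress onto these

def predict(outputs):               # outputs: (N, k) network regressions
    d = ((outputs[:, None, :] - E[None]) ** 2).sum(-1)
    return d.argmin(1)              # nearest embedded class

y = rng.integers(0, K, size=8)
print(predict(embed(y)) == y)       # noiseless round-trip recovers labels
```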
Author | Arka Ujjal Dey; Suman Ghosh; Ernest Valveny; Gaurav Harit | ||||
Title | Beyond Visual Semantics: Exploring the Role of Scene Text in Image Understanding | Type | Journal Article | ||
Year | 2021 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 149 | Issue | Pages | 164-171 | |
Keywords | |||||
Abstract | Images with visual and scene text content are ubiquitous in everyday life. However, current image interpretation systems are mostly limited to using only the visual features, neglecting to leverage the scene text content. In this paper, we propose to jointly use scene text and visual channels for robust semantic interpretation of images. We not only extract and encode visual and scene text cues, but also model their interplay to generate a contextual joint embedding with richer semantics. The contextual embedding thus generated is applied to retrieval and classification tasks on multimedia images, with scene text content, to demonstrate its effectiveness. In the retrieval framework, we augment our learned text-visual semantic representation with scene text cues, to mitigate vocabulary misses that may have occurred during the semantic embedding. To deal with irrelevant or erroneous recognition of scene text, we also apply query-based attention to our text channel. We show how the multi-channel approach, involving visual semantics and scene text, improves upon the state of the art. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.121 | Approved | no | ||
Call Number | Admin @ si @ DGV2021 | Serial | 3364 | ||
Permanent link to this record | |||||
Author | Muhammad Anwer Rao; Fahad Shahbaz Khan; Joost Van de Weijer; Matthieu Molinier; Jorma Laaksonen | ||||
Title | Binary patterns encoded convolutional neural networks for texture recognition and remote sensing scene classification | Type | Journal Article | ||
Year | 2018 | Publication | ISPRS Journal of Photogrammetry and Remote Sensing | Abbreviated Journal | ISPRS J |
Volume | 138 | Issue | Pages | 74-85 | |
Keywords | Remote sensing; Deep learning; Scene classification; Local Binary Patterns; Texture analysis | ||||
Abstract | Designing discriminative, powerful texture features robust to realistic imaging conditions is a challenging computer vision problem with many applications, including material recognition and analysis of satellite or aerial imagery. In the past, most texture description approaches were based on dense orderless statistical distribution of local features. However, most recent approaches to texture recognition and remote sensing scene classification are based on Convolutional Neural Networks (CNNs). The de facto practice when learning these CNN models is to use RGB patches as input with training performed on large amounts of labeled data (ImageNet). In this paper, we show that Local Binary Patterns (LBP) encoded CNN models, codenamed TEX-Nets, trained using mapped coded images with explicit LBP based texture information provide complementary information to the standard RGB deep models. Additionally, two deep architectures, namely early and late fusion, are investigated to combine the texture and color information. To the best of our knowledge, we are the first to investigate Binary Patterns encoded CNNs and different deep network fusion architectures for texture recognition and remote sensing scene classification. We perform comprehensive experiments on four texture recognition datasets and four remote sensing scene classification benchmarks: UC-Merced with 21 scene categories, WHU-RS19 with 19 scene classes, RSSCN7 with 7 categories and the recently introduced large scale aerial image dataset (AID) with 30 aerial scene types. We demonstrate that TEX-Nets provide complementary information to the standard RGB deep model of the same network architecture. Our late fusion TEX-Net architecture always improves the overall performance compared to the standard RGB network on both recognition problems. Furthermore, our final combination leads to consistent improvement over the state of the art for remote sensing scene classification. (See the sketch after this record.) | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP; 600.109; 600.106; 600.120 | Approved | no | ||
Call Number | Admin @ si @ RKW2018 | Serial | 3158 | ||
Permanent link to this record | |||||
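The basic ingredient of the TEX-Nets above is an LBP-coded image fed to the CNN in place of (or alongside) RGB. Below is a minimal 8-neighbour LBP map; note the paper additionally applies a mapping/coding step to these raw codes, which is omitted here.

```python
import numpy as np

def lbp_map(gray):                       # gray: (H, W) grayscale image
    """Basic 8-neighbour Local Binary Pattern codes in [0, 255]."""
    g = np.asarray(gray, dtype=float)
    c = g[1:-1, 1:-1]                    # center pixels (border dropped)
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(shifts):
        nb = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        code += (nb >= c) * (1 << bit)   # set bit if neighbour >= center
    return code.astype(np.uint8)         # (H-2, W-2) LBP code image

print(lbp_map(np.random.rand(6, 6)))
```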
Author | Sophie Wuerger; Kaida Xiao; Dimitris Mylonas; Q. Huang; Dimosthenis Karatzas; Galina Paramei | ||||
Title | Blue-green color categorization in Mandarin-English speakers | Type | Journal Article | ||
Year | 2012 | Publication | Journal of the Optical Society of America A | Abbreviated Journal | JOSA A |
Volume | 29 | Issue | 2 | Pages | A102–A107 |
Keywords | |||||
Abstract | Observers are faster to detect a target among a set of distracters if the targets and distracters come from different color categories. This cross-boundary advantage seems to be limited to the right visual field, which is consistent with the dominance of the left hemisphere for language processing [Gilbert et al., Proc. Natl. Acad. Sci. USA 103, 489 (2006)]. Here we study whether a similar visual field advantage is found in the color identification task in speakers of Mandarin, a language that uses a logographic system. Forty late Mandarin-English bilinguals performed a blue-green color categorization task, in a blocked design, in their first language (L1: Mandarin) or second language (L2: English). Eleven color singletons ranging from blue to green were presented for 160 ms, randomly in the left visual field (LVF) or right visual field (RVF). Color boundary and reaction times (RTs) at the color boundary were estimated in L1 and L2, for both visual fields. We found that the color boundary did not differ between the languages; RTs at the color boundary, however, were on average more than 100 ms shorter in the English compared to the Mandarin sessions, but only when the stimuli were presented in the RVF. The finding may be explained by the script nature of the two languages: Mandarin logographic characters are analyzed visuospatially in the right hemisphere, which conceivably facilitates identification of color presented to the LVF. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ WXM2012 | Serial | 2007 | ||
Permanent link to this record | |||||
Author | Sergio Escalera; Alicia Fornes; O. Pujol; Petia Radeva; Gemma Sanchez; Josep Llados | ||||
Title | Blurred Shape Model for Binary and Grey-level Symbol Recognition | Type | Journal Article | ||
Year | 2009 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 30 | Issue | 15 | Pages | 1424–1433 |
Keywords | |||||
Abstract | Many symbol recognition problems require the use of robust descriptors in order to obtain rich information from the data. However, the search for a good descriptor remains an open issue due to the high variability of symbol appearance. Rotation, partial occlusions, elastic deformations, intra-class and inter-class variations, or high variability among symbols due to different writing styles, are just a few problems. In this paper, we introduce a symbol shape description to deal with the changes in appearance that these types of symbols suffer. The shape of the symbol is aligned based on principal components to make the recognition invariant to rotation and reflection. Then, we present the Blurred Shape Model descriptor (BSM), where new features encode the probability of appearance of each pixel that outlines the symbol's shape. Moreover, we include the new descriptor in a system to deal with multi-class symbol categorization problems. Adaboost is used to train the binary classifiers, learning the BSM features that best split symbol classes. Then, the binary problems are embedded in an Error-Correcting Output Codes (ECOC) framework to deal with the multi-class case. The methodology is evaluated on different synthetic and real data sets. State-of-the-art descriptors and classifiers are compared, showing the robustness and better performance of the present scheme when classifying symbols with high variability of appearance. (See the sketch after this record.) | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA; DAG; MILAB | Approved | no | ||
Call Number | BCNPCL @ bcnpcl @ EFP2009a | Serial | 1180 | ||
Permanent link to this record | |||||
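A rough sketch of the Blurred Shape Model idea described above: shape pixels vote into an n×n grid, spreading mass to neighbouring grid centres in inverse proportion to distance, so small deformations blur the descriptor rather than break it. The grid size and the 4-neighbour spreading rule below are simplifications of the published formulation.

```python
import numpy as np

def bsm(points, n=8):                  # points: (P, 2) coords in [0, 1]^2
    """Blurred-Shape-Model-style grid descriptor of a point set."""
    desc = np.zeros((n, n))
    centers = (np.arange(n) + 0.5) / n  # grid-cell centre coordinates
    for x, y in points:
        i, j = min(int(y * n), n - 1), min(int(x * n), n - 1)
        # vote into own cell and its 4 neighbours, weighted by 1/distance
        for di, dj in [(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)]:
            a, b = i + di, j + dj
            if 0 <= a < n and 0 <= b < n:
                d = np.hypot(x - centers[b], y - centers[a])
                desc[a, b] += 1.0 / (d + 1e-6)
    return (desc / desc.sum()).ravel()  # length n*n, sums to 1

pts = np.random.rand(200, 2)            # stand-in for contour pixels
print(bsm(pts).shape)                   # (64,)
```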
Author | David Masip; Agata Lapedriza; Jordi Vitria | ||||
Title | Boosted Online Learning for Face Recognition | Type | Journal Article | ||
Year | 2009 | Publication | IEEE Transactions on Systems, Man and Cybernetics part B | Abbreviated Journal | TSMCB |
Volume | 39 | Issue | 2 | Pages | 530–538 |
Keywords | |||||
Abstract | Face recognition applications commonly suffer from three main drawbacks: a reduced training set, information lying in high-dimensional subspaces, and the need to incorporate new people to recognize. In the recent literature, the extension of a face classifier in order to include new people in the model has been solved using online feature extraction techniques. The most successful of those approaches are extensions of principal component analysis or linear discriminant analysis. In the current paper, a new online boosting algorithm is introduced: a face recognition method that extends a boosting-based classifier by adding new classes while avoiding the need to retrain the classifier each time a new person joins the system. The classifier is learned using the multitask learning principle, where multiple verification tasks are trained together sharing the same feature space. The new classes are added taking advantage of the structure learned previously, so that adding new classes is not computationally demanding. The present proposal has been experimentally validated with two different facial data sets by comparing our approach with the current state-of-the-art techniques. The results show that the proposed online boosting algorithm fares better in terms of final accuracy. In addition, the global performance does not decrease drastically even when the number of classes of the base problem is multiplied by eight. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1083-4419 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | OR;MV | Approved | no | ||
Call Number | BCNPCL @ bcnpcl @ MLV2009 | Serial | 1155 | ||
Permanent link to this record | |||||
Author | Jaume Amores; N. Sebe; Petia Radeva | ||||
Title | Boosting the distance estimation: Application to the K-Nearest Neighbor Classifier | Type | Journal Article | ||
Year | 2006 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 27 | Issue | 3 | Pages | 201–209 |
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS;MILAB | Approved | no | ||
Call Number | ADAS @ adas @ ASR2006 | Serial | 643 | ||
Permanent link to this record | |||||
Author | Marçal Rusiñol; Josep Llados | ||||
Title | Boosting the Handwritten Word Spotting Experience by Including the User in the Loop | Type | Journal Article | ||
Year | 2014 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 47 | Issue | 3 | Pages | 1063–1072 |
Keywords | Handwritten word spotting; Query by example; Relevance feedback; Query fusion; Multidimensional scaling | ||||
Abstract | In this paper, we study the effect of taking the user into account in a query-by-example handwritten word spotting framework. Several off-the-shelf query fusion and relevance feedback strategies have been tested in the handwritten word spotting context. The increase in terms of precision when the user is included in the loop is assessed using two datasets of historical handwritten documents and two baseline word spotting approaches both based on the bag-of-visual-words model. We finally present two alternative ways of presenting the results to the user that might be more attractive and suitable to the user's needs than the classic ranked list. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0031-3203 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG; 600.045; 600.061; 600.077 | Approved | no | ||
Call Number | Admin @ si @ RuL2013 | Serial | 2343 | ||
Permanent link to this record | |||||
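One off-the-shelf relevance-feedback strategy of the kind the record above tests is Rocchio's rule: shift the query descriptor towards the word images the user marked relevant and away from the rest, then re-rank. This is a generic sketch, not the paper's exact setup; the weights are the classic textbook defaults, and the random vectors stand in for bag-of-visual-words descriptors.

```python
import numpy as np

def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query toward relevant examples, away from non-relevant ones."""
    q = alpha * query
    if len(relevant):
        q += beta * np.mean(relevant, axis=0)
    if len(nonrelevant):
        q -= gamma * np.mean(nonrelevant, axis=0)
    return q

def rank(query, corpus):                 # smaller distance = better match
    return np.argsort(((corpus - query) ** 2).sum(1))

corpus = np.random.rand(100, 64)         # stand-in word-image descriptors
q0 = np.random.rand(64)
first = rank(q0, corpus)[:10]            # initial top-10 shown to the user
q1 = rocchio(q0, corpus[first[:3]], corpus[first[3:]])  # user marks 3 relevant
print(rank(q1, corpus)[:10])             # refined ranking after feedback
```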
Author | Ole Larsen; Petia Radeva; Enric Marti | ||||
Title | Bounds on the optimal elasticity parameters for a snake | Type | Journal Article | ||
Year | 1995 | Publication | Image Analysis and Processing | Abbreviated Journal | |
Volume | Issue | Pages | 37-42 | ||
Keywords | |||||
Abstract | This paper develops a formalism by which upper and lower bounds on the elasticity parameters of a snake can be estimated. Objects different in size and shape give rise to different bounds. The bounds can be obtained from an analysis of the shape of the object of interest. Experiments on synthetic images show a good correlation between the estimated behaviour of the snake and the one actually observed. Experiments on real X-ray images show that the parameters for optimal segmentation lie within the estimated bounds. (The energy functional these parameters weight is recalled after this record.) | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB;IAM | Approved | no | ||
Call Number | IAM @ iam @ LRM1995a | Serial | 1559 | ||
Permanent link to this record | |||||
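For reference, the elasticity parameters the record above bounds are the weights of the classical active-contour (snake) energy functional of Kass, Witkin, and Terzopoulos; the paper's own notation may differ from this standard form:

```latex
% Classical snake energy over a contour v(s): \alpha weights elasticity
% (tension), \beta weights rigidity; these are the bounded parameters.
E_{\text{snake}} = \int_{0}^{1}
  \frac{1}{2}\left( \alpha \,\lvert v'(s)\rvert^{2}
                  + \beta  \,\lvert v''(s)\rvert^{2} \right)
  + E_{\text{ext}}\bigl(v(s)\bigr)\, ds
```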
Author | F. Javier Sanchez; Jorge Bernal; Cristina Sanchez Montes; Cristina Rodriguez de Miguel; Gloria Fernandez Esparrach | ||||
Title | Bright spot regions segmentation and classification for specular highlights detection in colonoscopy videos | Type | Journal Article | ||
Year | 2017 | Publication | Machine Vision and Applications | Abbreviated Journal | MVAP |
Volume | Issue | Pages | 1-20 | ||
Keywords | Specular highlights; bright spot regions segmentation; region classification; colonoscopy | ||||
Abstract | A novel specular highlights detection method in colonoscopy videos is presented. The method is based on a model of appearance defining specular highlights as bright spots which are highly contrasted with respect to adjacent regions. Our approach proposes two stages: segmentation, and then classification of bright spot regions. The former defines a set of candidate regions obtained through a region growing process with local maxima as initial region seeds. This process creates a tree structure which keeps track, at each growing iteration, of the region frontier contrast; final regions provided depend on restrictions over the contrast value. Non-specular regions are filtered out in a classification stage performed by a linear SVM classifier using model-based features from each region. We introduce a new validation database with more than 25,000 regions along with their corresponding pixel-wise annotations. We perform a comparative study against other approaches. Results show that our method is superior to other approaches, with our segmented regions being closer to actual specular regions in the image. Finally, we also show how our methodology can be used to obtain an accurate prediction of polyp histology. (See the sketch after this record.) | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MV; 600.096; 600.175 | Approved | no | ||
Call Number | Admin @ si @ SBS2017 | Serial | 2975 | ||
Permanent link to this record |
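The segmentation stage described above can be approximated by growing regions from local-maximum seeds while neighbours stay within a contrast margin of the seed intensity. The paper tracks frontier contrast over a tree of growing iterations; the fixed margin below is a simplification, and all names and values are illustrative.

```python
import numpy as np
from collections import deque

def grow_bright_spot(img, seed, margin=0.15):
    """Flood-fill from a bright seed, keeping pixels within `margin` of it."""
    h, w = img.shape
    mask = np.zeros((h, w), dtype=bool)
    q = deque([seed])
    mask[seed] = True
    while q:
        y, x = q.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w and not mask[ny, nx]
                    and img[seed] - img[ny, nx] <= margin):
                mask[ny, nx] = True      # neighbour is bright enough: grow
                q.append((ny, nx))
    return mask                          # candidate bright-spot region

img = np.random.rand(64, 64)             # stand-in grayscale frame
seed = np.unravel_index(img.argmax(), img.shape)   # brightest pixel as seed
print(grow_bright_spot(img, seed).sum(), "pixels in region")
```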