Records |
Author |
Andreas Fischer; Volkmar Frinken; Horst Bunke; Ching Y. Suen |
Title |
Improving HMM-Based Keyword Spotting with Character Language Models |
Type |
Conference Article |
Year |
2013 |
Publication |
12th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
506-510 |
Keywords |
|
Abstract |
Facing high error rates and slow recognition speed for full text transcription of unconstrained handwriting images, keyword spotting is a promising alternative to locate specific search terms within scanned document images. We have previously proposed a learning-based method for keyword spotting using character hidden Markov models that showed a high performance when compared with traditional template image matching. In the lexicon-free approach pursued, only the text appearance was taken into account for recognition. In this paper, we integrate character n-gram language models into the spotting system in order to provide an additional language context. On the modern IAM database as well as the historical George Washington database, we demonstrate that character language models significantly improve the spotting performance. |
Address |
Washington; USA; August 2013 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1520-5363 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICDAR |
Notes |
DAG; 600.045; 605.203 |
Approved |
no |
Call Number |
Admin @ si @ FFB2013 |
Serial |
2295 |
Permanent link to this record |
|
|
|
Author |
Christophe Rigaud; Dimosthenis Karatzas; Joost Van de Weijer; Jean-Christophe Burie; Jean-Marc Ogier |
Title |
An active contour model for speech balloon detection in comics |
Type |
Conference Article |
Year |
2013 |
Publication |
12th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
1240-1244 |
Keywords |
|
Abstract |
Comic books constitute an important cultural heritage asset in many countries. Digitization combined with subsequent comic book understanding would enable a variety of new applications, including content-based retrieval and content retargeting. Document understanding in this domain is challenging as comics are semi-structured documents, combining semantically important graphical and textual parts. Few studies have been done in this direction. In this work we detail a novel approach for closed and non-closed speech balloon localization in scanned comic book pages, an essential step towards a fully automatic comic book understanding. The approach is compared with existing methods for closed balloon localization found in the literature and results are presented. |
Address |
washington; USA; August 2013 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1520-5363 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICDAR |
Notes |
DAG; CIC; 600.056 |
Approved |
no |
Call Number |
Admin @ si @ RKW2013a |
Serial |
2260 |
Permanent link to this record |
|
|
|
Author |
Alicia Fornes; Xavier Otazu; Josep Llados |
Title |
Show through cancellation and image enhancement by multiresolution contrast processing |
Type |
Conference Article |
Year |
2013 |
Publication |
12th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
200-204 |
Keywords |
|
Abstract |
Historical documents suffer from different types of degradation and noise such as background variation, uneven illumination or dark spots. In case of double-sided documents, another common problem is that the back side of the document usually interferes with the front side because of the transparency of the document or ink bleeding. This effect is called the show through phenomenon. Many methods are developed to solve these problems, and in the case of show-through, by scanning and matching both the front and back sides of the document. In contrast, our approach is designed to use only one side of the scanned document. We hypothesize that show-trough are low contrast components, while foreground components are high contrast ones. A Multiresolution Contrast (MC) decomposition is presented in order to estimate the contrast of features at different spatial scales. We cancel the show-through phenomenon by thresholding these low contrast components. This decomposition is also able to enhance the image removing shadowed areas by weighting spatial scales. Results show that the enhanced images improve the readability of the documents, allowing scholars both to recover unreadable words and to solve ambiguities. |
Address |
Washington; USA; August 2013 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1520-5363 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICDAR |
Notes |
DAG; 602.006; 600.045; 600.061; 600.052;CIC |
Approved |
no |
Call Number |
Admin @ si @ FOL2013 |
Serial |
2241 |
Permanent link to this record |
|
|
|
Author |
David Aldavert; Marçal Rusiñol; Ricardo Toledo; Josep Llados |
Title |
Integrating Visual and Textual Cues for Query-by-String Word Spotting |
Type |
Conference Article |
Year |
2013 |
Publication |
12th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
511 - 515 |
Keywords |
|
Abstract |
In this paper, we present a word spotting framework that follows the query-by-string paradigm where word images are represented both by textual and visual representations. The textual representation is formulated in terms of character $n$-grams while the visual one is based on the bag-of-visual-words scheme. These two representations are merged together and projected to a sub-vector space. This transform allows to, given a textual query, retrieve word instances that were only represented by the visual modality. Moreover, this statistical representation can be used together with state-of-the-art indexation structures in order to deal with large-scale scenarios. The proposed method is evaluated using a collection of historical documents outperforming state-of-the-art performances. |
Address |
Washington; USA; August 2013 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1520-5363 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICDAR |
Notes |
DAG; ADAS; 600.045; 600.055; 600.061 |
Approved |
no |
Call Number |
Admin @ si @ ART2013 |
Serial |
2224 |
Permanent link to this record |
|
|
|
Author |
Ferran Diego; Joan Serrat; Antonio Lopez |
Title |
Joint spatio-temporal alignment of sequences |
Type |
Journal Article |
Year |
2013 |
Publication |
IEEE Transactions on Multimedia |
Abbreviated Journal |
TMM |
Volume |
15 |
Issue |
6 |
Pages |
1377-1387 |
Keywords |
video alignment |
Abstract |
Video alignment is important in different areas of computer vision such as wide baseline matching, action recognition, change detection, video copy detection and frame dropping prevention. Current video alignment methods usually deal with a relatively simple case of fixed or rigidly attached cameras or simultaneous acquisition. Therefore, in this paper we propose a joint video alignment for bringing two video sequences into a spatio-temporal alignment. Specifically, the novelty of the paper is to formulate the video alignment to fold the spatial and temporal alignment into a single alignment framework. This simultaneously satisfies a frame-correspondence and frame-alignment similarity; exploiting the knowledge among neighbor frames by a standard pairwise Markov random field (MRF). This new formulation is able to handle the alignment of sequences recorded at different times by independent moving cameras that follows a similar trajectory, and also generalizes the particular cases that of fixed geometric transformation and/or linear temporal mapping. We conduct experiments on different scenarios such as sequences recorded simultaneously or by moving cameras to validate the robustness of the proposed approach. The proposed method provides the highest video alignment accuracy compared to the state-of-the-art methods on sequences recorded from vehicles driving along the same track at different times. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1520-9210 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
ADAS |
Approved |
no |
Call Number |
Admin @ si @ DSL2013; ADAS @ adas @ |
Serial |
2228 |
Permanent link to this record |
|
|
|
Author |
Jose Manuel Alvarez; Theo Gevers; Ferran Diego; Antonio Lopez |
Title |
Road Geometry Classification by Adaptative Shape Models |
Type |
Journal Article |
Year |
2013 |
Publication |
IEEE Transactions on Intelligent Transportation Systems |
Abbreviated Journal |
TITS |
Volume |
14 |
Issue |
1 |
Pages |
459-468 |
Keywords |
road detection |
Abstract |
Vision-based road detection is important for different applications in transportation, such as autonomous driving, vehicle collision warning, and pedestrian crossing detection. Common approaches to road detection are based on low-level road appearance (e.g., color or texture) and neglect of the scene geometry and context. Hence, using only low-level features makes these algorithms highly depend on structured roads, road homogeneity, and lighting conditions. Therefore, the aim of this paper is to classify road geometries for road detection through the analysis of scene composition and temporal coherence. Road geometry classification is proposed by building corresponding models from training images containing prototypical road geometries. We propose adaptive shape models where spatial pyramids are steered by the inherent spatial structure of road images. To reduce the influence of lighting variations, invariant features are used. Large-scale experiments show that the proposed road geometry classifier yields a high recognition rate of 73.57% ± 13.1, clearly outperforming other state-of-the-art methods. Including road shape information improves road detection results over existing appearance-based methods. Finally, it is shown that invariant features and temporal information provide robustness against disturbing imaging conditions. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1524-9050 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
ADAS;ISE |
Approved |
no |
Call Number |
Admin @ si @ AGD2013;; ADAS @ adas @ |
Serial |
2269 |
Permanent link to this record |
|
|
|
Author |
Javier Marin; David Vazquez; Antonio Lopez; Jaume Amores; Bastian Leibe |
Title |
Random Forests of Local Experts for Pedestrian Detection |
Type |
Conference Article |
Year |
2013 |
Publication |
15th IEEE International Conference on Computer Vision |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
2592 - 2599 |
Keywords |
ADAS; Random Forest; Pedestrian Detection |
Abstract |
Pedestrian detection is one of the most challenging tasks in computer vision, and has received a lot of attention in the last years. Recently, some authors have shown the advantages of using combinations of part/patch-based detectors in order to cope with the large variability of poses and the existence of partial occlusions. In this paper, we propose a pedestrian detection method that efficiently combines multiple local experts by means of a Random Forest ensemble. The proposed method works with rich block-based representations such as HOG and LBP, in such a way that the same features are reused by the multiple local experts, so that no extra computational cost is needed with respect to a holistic method. Furthermore, we demonstrate how to integrate the proposed approach with a cascaded architecture in order to achieve not only high accuracy but also an acceptable efficiency. In particular, the resulting detector operates at five frames per second using a laptop machine. We tested the proposed method with well-known challenging datasets such as Caltech, ETH, Daimler, and INRIA. The method proposed in this work consistently ranks among the top performers in all the datasets, being either the best method or having a small difference with the best one. |
Address |
Sydney; Australia; December 2013 |
Corporate Author |
|
Thesis |
|
Publisher |
IEEE |
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1550-5499 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICCV |
Notes |
ADAS; 600.057; 600.054 |
Approved |
no |
Call Number |
ADAS @ adas @ MVL2013 |
Serial |
2333 |
Permanent link to this record |
|
|
|
Author |
Jon Almazan; Albert Gordo; Alicia Fornes; Ernest Valveny |
Title |
Handwritten Word Spotting with Corrected Attributes |
Type |
Conference Article |
Year |
2013 |
Publication |
15th IEEE International Conference on Computer Vision |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
1017-1024 |
Keywords |
|
Abstract |
We propose an approach to multi-writer word spotting, where the goal is to find a query word in a dataset comprised of document images. We propose an attributes-based approach that leads to a low-dimensional, fixed-length representation of the word images that is fast to compute and, especially, fast to compare. This approach naturally leads to an unified representation of word images and strings, which seamlessly allows one to indistinctly perform query-by-example, where the query is an image, and query-by-string, where the query is a string. We also propose a calibration scheme to correct the attributes scores based on Canonical Correlation Analysis that greatly improves the results on a challenging dataset. We test our approach on two public datasets showing state-of-the-art results. |
Address |
Sydney; Australia; December 2013 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1550-5499 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICCV |
Notes |
DAG |
Approved |
no |
Call Number |
Admin @ si @ AGF2013 |
Serial |
2327 |
Permanent link to this record |
|
|
|
Author |
Gemma Roig; Xavier Boix; R. de Nijs; Sebastian Ramos; K. Kühnlenz; Luc Van Gool |
Title |
Active MAP Inference in CRFs for Efficient Semantic Segmentation |
Type |
Conference Article |
Year |
2013 |
Publication |
15th IEEE International Conference on Computer Vision |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
2312 - 2319 |
Keywords |
Semantic Segmentation |
Abstract |
Most MAP inference algorithms for CRFs optimize an energy function knowing all the potentials. In this paper, we focus on CRFs where the computational cost of instantiating the potentials is orders of magnitude higher than MAP inference. This is often the case in semantic image segmentation, where most potentials are instantiated by slow classifiers fed with costly features. We introduce Active MAP inference 1) to on-the-fly select a subset of potentials to be instantiated in the energy function, leaving the rest of the parameters of the potentials unknown, and 2) to estimate the MAP labeling from such incomplete energy function. Results for semantic segmentation benchmarks, namely PASCAL VOC 2010 [5] and MSRC-21 [19], show that Active MAP inference achieves similar levels of accuracy but with major efficiency gains. |
Address |
Sydney; Australia; December 2013 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1550-5499 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICCV |
Notes |
ADAS; 600.057 |
Approved |
no |
Call Number |
ADAS @ adas @ RBN2013 |
Serial |
2377 |
Permanent link to this record |
|
|
|
Author |
Jorge Bernal; F. Javier Sanchez; Fernando Vilariño |
Title |
Impact of Image Preprocessing Methods on Polyp Localization in Colonoscopy Frames |
Type |
Conference Article |
Year |
2013 |
Publication |
35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
7350 - 7354 |
Keywords |
|
Abstract |
In this paper we present our image preprocessing methods as a key part of our automatic polyp localization scheme. These methods are used to assess the impact of different endoluminal scene elements when characterizing polyps. More precisely we tackle the influence of specular highlights, blood vessels and black mask surrounding the scene. Experimental results prove that the appropriate handling of these elements leads to a great improvement in polyp localization results. |
Address |
Osaka; Japan; July 2013 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1557-170X |
ISBN |
|
Medium |
|
Area |
800 |
Expedition |
|
Conference |
EMBC |
Notes |
MV; 600.047; 600.060;SIAI |
Approved |
no |
Call Number |
Admin @ si @ BSV2013 |
Serial |
2286 |
Permanent link to this record |
|
|
|
Author |
Mariella Dimiccoli; Benoît Girard; Alain Berthoz; Daniel Bennequin |
Title |
Striola Magica: a functional explanation of otolith organs |
Type |
Journal Article |
Year |
2013 |
Publication |
Journal of Computational Neuroscience |
Abbreviated Journal |
JCN |
Volume |
35 |
Issue |
2 |
Pages |
125-154 |
Keywords |
Otolith organs ;Striola; Vestibular pathway |
Abstract |
Otolith end organs of vertebrates sense linear accelerations of the head and gravitation. The hair cells on their epithelia are responsible for transduction. In mammals, the striola, parallel to the line where hair cells reverse their polarization, is a narrow region centered on a curve with curvature and torsion. It has been shown that the striolar region is functionally different from the rest, being involved in a phasic vestibular pathway. We propose a mathematical and computational model that explains the necessity of this amazing geometry for the striola to be able to carry out its function. Our hypothesis, related to the biophysics of the hair cells and to the physiology of their afferent neurons, is that striolar afferents collect information from several type I hair cells to detect the jerk in a large domain of acceleration directions. This predicts a mean number of two calyces for afferent neurons, as measured in rodents. The domain of acceleration directions sensed by our striolar model is compatible with the experimental results obtained on monkeys considering all afferents. Therefore, the main result of our study is that phasic and tonic vestibular afferents cover the same geometrical fields, but at different dynamical and frequency domains. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
Springer US |
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1573-6873. 2013 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
MILAB |
Approved |
no |
Call Number |
Admin @ si @DBG2013 |
Serial |
2787 |
Permanent link to this record |
|
|
|
Author |
Ernest Valveny; Oriol Ramos Terrades; Joan Mas; Marçal Rusiñol |
Title |
Interactive Document Retrieval and Classification. |
Type |
Book Chapter |
Year |
2013 |
Publication |
Multimodal Interaction in Image and Video Applications |
Abbreviated Journal |
|
Volume |
48 |
Issue |
|
Pages |
17-30 |
Keywords |
|
Abstract |
In this chapter we describe a system for document retrieval and classification following the interactive-predictive framework. In particular, the system addresses two different scenarios of document analysis: document classification based on visual appearance and logo detection. These two classical problems of document analysis are formulated following the interactive-predictive model, taking the user interaction into account to make easier the process of annotating and labelling the documents. A system implementing this model in a real scenario is presented and analyzed. This system also takes advantage of active learning techniques to speed up the task of labelling the documents. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
Angel Sappa; Jordi Vitria |
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1868-4394 |
ISBN |
978-3-642-35931-6 |
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
DAG |
Approved |
no |
Call Number |
Admin @ si @ VRM2013 |
Serial |
2341 |
Permanent link to this record |
|
|
|
Author |
Joost Van de Weijer; Fahad Shahbaz Khan; Marc Masana |
Title |
Interactive Visual and Semantic Image Retrieval |
Type |
Book Chapter |
Year |
2013 |
Publication |
Multimodal Interaction in Image and Video Applications |
Abbreviated Journal |
|
Volume |
48 |
Issue |
|
Pages |
31-35 |
Keywords |
|
Abstract |
One direct consequence of recent advances in digital visual data generation and the direct availability of this information through the World-Wide Web, is a urgent demand for efficient image retrieval systems. The objective of image retrieval is to allow users to efficiently browse through this abundance of images. Due to the non-expert nature of the majority of the internet users, such systems should be user friendly, and therefore avoid complex user interfaces. In this chapter we investigate how high-level information provided by recently developed object recognition techniques can improve interactive image retrieval. Wel apply a bagof- word based image representation method to automatically classify images in a number of categories. These additional labels are then applied to improve the image retrieval system. Next to these high-level semantic labels, we also apply a low-level image description to describe the composition and color scheme of the scene. Both descriptions are incorporated in a user feedback image retrieval setting. The main objective is to show that automatic labeling of images with semantic labels can improve image retrieval results. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
Angel Sappa; Jordi Vitria |
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1868-4394 |
ISBN |
978-3-642-35931-6 |
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
CIC; 605.203; 600.048 |
Approved |
no |
Call Number |
Admin @ si @ WKC2013 |
Serial |
2284 |
Permanent link to this record |
|
|
|
Author |
Abel Gonzalez-Garcia; Robert Benavente; Olivier Penacchio; Javier Vazquez; Maria Vanrell; C. Alejandro Parraga |
Title |
Coloresia: An Interactive Colour Perception Device for the Visually Impaired |
Type |
Book Chapter |
Year |
2013 |
Publication |
Multimodal Interaction in Image and Video Applications |
Abbreviated Journal |
|
Volume |
48 |
Issue |
|
Pages |
47-66 |
Keywords |
|
Abstract |
A significative percentage of the human population suffer from impairments in their capacity to distinguish or even see colours. For them, everyday tasks like navigating through a train or metro network map becomes demanding. We present a novel technique for extracting colour information from everyday natural stimuli and presenting it to visually impaired users as pleasant, non-invasive sound. This technique was implemented inside a Personal Digital Assistant (PDA) portable device. In this implementation, colour information is extracted from the input image and categorised according to how human observers segment the colour space. This information is subsequently converted into sound and sent to the user via speakers or headphones. In the original implementation, it is possible for the user to send its feedback to reconfigure the system, however several features such as these were not implemented because the current technology is limited.We are confident that the full implementation will be possible in the near future as PDA technology improves. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1868-4394 |
ISBN |
978-3-642-35931-6 |
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
CIC; 600.052; 605.203 |
Approved |
no |
Call Number |
Admin @ si @ GBP2013 |
Serial |
2266 |
Permanent link to this record |
|
|
|
Author |
Michal Drozdzal; Santiago Segui; Petia Radeva; Carolina Malagelada; Fernando Azpiroz; Jordi Vitria |
Title |
An Application for Efficient Error-Free Labeling of Medical Images |
Type |
Book Chapter |
Year |
2013 |
Publication |
Multimodal Interaction in Image and Video Applications |
Abbreviated Journal |
|
Volume |
48 |
Issue |
|
Pages |
1-16 |
Keywords |
|
Abstract |
In this chapter we describe an application for efficient error-free labeling of medical images. In this scenario, the compilation of a complete training set for building a realistic model of a given class of samples is not an easy task, making the process tedious and time consuming. For this reason, there is a need for interactive labeling applications that minimize the effort of the user while providing error-free labeling. We propose a new algorithm that is based on data similarity in feature space. This method actively explores data in order to find the best label-aligned clustering and exploits it to reduce the labeler effort, that is measured by the number of “clicks. Moreover, error-free labeling is guaranteed by the fact that all data and their labels proposals are visually revised by en expert. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1868-4394 |
ISBN |
978-3-642-35931-6 |
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
MILAB; OR;MV |
Approved |
no |
Call Number |
Admin @ si @ DSR2013 |
Serial |
2235 |
Permanent link to this record |