Records | |||||
---|---|---|---|---|---|
Author | David Vazquez | ||||
Title | Domain Adaptation of Virtual and Real Worlds for Pedestrian Detection | Type | Book Whole | ||
Year | 2013 | Publication | PhD Thesis, Universitat de Barcelona-CVC | Abbreviated Journal | |
Volume | 1 | Issue | 1 | Pages | 1-105 |
Keywords | Pedestrian Detection; Domain Adaptation | ||||
Abstract | Pedestrian detection is of paramount interest for many applications, e.g. Advanced Driver Assistance Systems, Intelligent Video Surveillance and Multimedia systems. Most promising pedestrian detectors rely on appearance-based classifiers trained with annotated data. However, the required annotation step is an intensive and subjective task for humans, which makes it worthwhile to minimize their intervention in this process by using computational tools such as realistic virtual worlds. The reason to use this kind of tool is that it allows the automatic generation of precise and rich annotations of visual information. Nevertheless, the use of this kind of data raises the following question: can a pedestrian appearance model learnt with virtual-world data work successfully for pedestrian detection in real-world scenarios? To answer this question, we conduct different experiments that suggest a positive answer. However, pedestrian classifiers trained with virtual-world data can suffer from the so-called dataset shift problem, as real-world-based classifiers do. Accordingly, we have designed different domain adaptation techniques to face this problem, all of them integrated in the same framework (V-AYLA). We have explored different methods to train domain-adapted pedestrian classifiers by collecting a few pedestrian samples from the target domain (real world) and combining them with many samples from the source domain (virtual world). The extensive experiments we present show that pedestrian detectors developed within the V-AYLA framework do achieve domain adaptation. Ideally, we would like to adapt our system without any human intervention. Therefore, as a first proof of concept we also propose an unsupervised domain adaptation technique that avoids human intervention during the adaptation process. To the best of our knowledge, this thesis is the first to demonstrate adaptation of virtual and real worlds for developing an object detector. Last but not least, we also assessed a different strategy to avoid the dataset shift, which consists in collecting real-world samples and retraining with them in such a way that no bounding boxes of real-world pedestrians have to be provided. We show that the generated classifier is competitive with respect to its counterpart trained with samples collected by manually annotating pedestrian bounding boxes. The results presented in this thesis not only end with a proposal for adapting a virtual-world pedestrian detector to the real world, but also go further by pointing out a new methodology that would allow the system to adapt to different situations, which we hope will provide the foundations for future research in this unexplored area. | ||||
Address | Barcelona | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Barcelona | Editor | Antonio Lopez;Daniel Ponsa |
Language | English | Summary Language | Original Title | ||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-940530-1-6 | Medium | ||
Area | Expedition | Conference | |||
Notes | adas | Approved | yes | ||
Call Number | ADAS @ adas @ Vaz2013 | Serial | 2276 | ||
Permanent link to this record | |||||
Author | German Ros; J. Guerrero; Angel Sappa; Daniel Ponsa; Antonio Lopez | ||||
Title | Fast and Robust l1-averaging-based Pose Estimation for Driving Scenarios | Type | Conference Article | ||
Year | 2013 | Publication | 24th British Machine Vision Conference | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | SLAM | ||||
Abstract | Robust visual pose estimation is at the core of many computer vision applications, being fundamental for Visual SLAM and Visual Odometry problems. During the last decades, many approaches have been proposed to solve these problems, RANSAC being one of the most accepted and widely used. However, with the arrival of new challenges, such as large driving scenarios for autonomous vehicles, along with improvements in data gathering frameworks, new issues must be considered. One of these issues is the capability of a technique to deal with very large amounts of data while meeting the real-time constraint. With this purpose in mind, we present a novel technique for the problem of robust camera-pose estimation that is more suitable for dealing with large amounts of data and additionally helps improve the results. The method is based on a combination of a very fast coarse-evaluation function and a robust ℓ1-averaging procedure. Such a scheme leads to high-quality results while taking considerably less time than RANSAC. Experimental results on the challenging KITTI Vision Benchmark Suite are provided, showing the validity of the proposed approach. | ||||
Address | Bristol; UK; September 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | BMVC | ||
Notes | ADAS | Approved | no | ||
Call Number | Admin @ si @ RGS2013b; ADAS @ adas @ | Serial | 2274 | ||
Permanent link to this record | |||||
Author | Jaume Amores | ||||
Title | Multiple Instance Classification: review, taxonomy and comparative study | Type | Journal Article | ||
Year | 2013 | Publication | Artificial Intelligence | Abbreviated Journal | AI |
Volume | 201 | Issue | Pages | 81-105 | |
Keywords | Multi-instance learning; Codebook; Bag-of-Words | ||||
Abstract | Multiple Instance Learning (MIL) has become an important topic in the pattern recognition community, and many solutions to this problem have been proposed until now. Despite this fact, there is a lack of comparative studies that shed light on the characteristics and behavior of the different methods. In this work we provide such an analysis focused on the classification task (i.e., leaving out other learning tasks such as regression). In order to perform our study, we implemented fourteen methods grouped into three different families. We analyze the performance of the approaches across a variety of well-known databases, and we also study their behavior in synthetic scenarios in order to highlight their characteristics. As a result of this analysis, we conclude that methods that extract global bag-level information show a clearly superior performance in general. In this sense, the analysis permits us to understand why some types of methods are more successful than others, and it permits us to establish guidelines in the design of new MIL methods. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Elsevier Science Publishers Ltd. Essex, UK | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0004-3702 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS; 601.042; 600.057 | Approved | no | ||
Call Number | Admin @ si @ Amo2013 | Serial | 2273 | ||
Permanent link to this record | |||||
Author | Muhammad Muzzamil Luqman; Jean-Yves Ramel; Josep Llados | ||||
Title | Multilevel Analysis of Attributed Graphs for Explicit Graph Embedding in Vector Spaces | Type | Book Chapter | ||
Year | 2013 | Publication | Graph Embedding for Pattern Analysis | Abbreviated Journal | |
Volume | Issue | Pages | 1-26 | ||
Keywords | |||||
Abstract | The ability to recognize patterns is among the most crucial capabilities of human beings for their survival, which enables them to employ their sophisticated neural and cognitive systems [1] for processing complex audio, visual, smell, touch, and taste signals. Man is the most complex and the best existing system of pattern recognition. Without any explicit thinking, we continuously compare, classify, and identify huge amounts of signal data every day [2], starting from the time we get up in the morning till the last second before we fall asleep. This includes recognizing the face of a friend in a crowd, a spoken word embedded in noise, the proper key to lock the door, the smell of coffee, the voice of a favorite singer, alphabetic characters, and millions more tasks that we perform on a regular basis. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer New York | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-1-4614-4456-5 | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ LRL2013b | Serial | 2271 | ||
Permanent link to this record | |||||
Author | Muhammad Muzzamil Luqman; Jean-Yves Ramel; Josep Llados; Thierry Brouard | ||||
Title | Fuzzy Multilevel Graph Embedding | Type | Journal Article | ||
Year | 2013 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 46 | Issue | 2 | Pages | 551-565 |
Keywords | Pattern recognition; Graphics recognition; Graph clustering; Graph classification; Explicit graph embedding; Fuzzy logic | ||||
Abstract | Structural pattern recognition approaches offer the most expressive, convenient and powerful, but computationally expensive, representations of underlying relational information. To benefit from the mature, less expensive and efficient state-of-the-art machine learning models of statistical pattern recognition, these representations must be mapped to a low-dimensional vector space. Our method of explicit graph embedding bridges the gap between structural and statistical pattern recognition. We extract the topological, structural and attribute information from a graph and encode numeric details by fuzzy histograms and symbolic details by crisp histograms. The histograms are concatenated to achieve a simple and straightforward embedding of a graph into a low-dimensional numeric feature vector. Experimentation on standard public graph datasets shows that our method outperforms the state-of-the-art methods of graph embedding for richly attributed graphs. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Elsevier | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0031-3203 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG; 600.042; 600.045; 605.203 | Approved | no | ||
Call Number | Admin @ si @ LRL2013a | Serial | 2270 | ||
Permanent link to this record | |||||
Author | Jose Manuel Alvarez; Theo Gevers; Ferran Diego; Antonio Lopez | ||||
Title | Road Geometry Classification by Adaptive Shape Models | Type | Journal Article | ||
Year | 2013 | Publication | IEEE Transactions on Intelligent Transportation Systems | Abbreviated Journal | TITS |
Volume | 14 | Issue | 1 | Pages | 459-468 |
Keywords | road detection | ||||
Abstract | Vision-based road detection is important for different applications in transportation, such as autonomous driving, vehicle collision warning, and pedestrian crossing detection. Common approaches to road detection are based on low-level road appearance (e.g., color or texture) and neglect the scene geometry and context. Hence, using only low-level features makes these algorithms highly dependent on structured roads, road homogeneity, and lighting conditions. Therefore, the aim of this paper is to classify road geometries for road detection through the analysis of scene composition and temporal coherence. Road geometry classification is proposed by building corresponding models from training images containing prototypical road geometries. We propose adaptive shape models where spatial pyramids are steered by the inherent spatial structure of road images. To reduce the influence of lighting variations, invariant features are used. Large-scale experiments show that the proposed road geometry classifier yields a high recognition rate of 73.57% ± 13.1, clearly outperforming other state-of-the-art methods. Including road shape information improves road detection results over existing appearance-based methods. Finally, it is shown that invariant features and temporal information provide robustness against disturbing imaging conditions. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1524-9050 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS;ISE | Approved | no | ||
Call Number | Admin @ si @ AGD2013;; ADAS @ adas @ | Serial | 2269 | ||
Permanent link to this record | |||||
Author | Isabel Guitart; Jordi Conesa; Luis Villarejo; Agata Lapedriza; David Masip; Antoni Perez; Elena Planas | ||||
Title | Opinion Mining on Educational Resources at the Open University of Catalonia | Type | Conference Article | ||
Year | 2013 | Publication | 3rd International Workshop on Adaptive Learning via Interactive, Collaborative and Emotional approaches. In conjunction with CISIS 2013: The 7th International Conference on Complex, Intelligent, and Software Intensive Systems | Abbreviated Journal | |
Volume | Issue | Pages | 385 - 390 | ||
Keywords | |||||
Abstract | In order to make improvements to teaching, it is vital to know what students think of the way they are taught. With that purpose in mind, exhaustively analyzing the forums associated with the subjects taught at the Universitat Oberta de Catalunya (UOC) would be extremely helpful, as the university's students often post comments on their learning experiences in them. Exploiting the content of such forums is not a simple undertaking. The volume of data involved is very large, and performing the task manually would require a great deal of effort from lecturers. As a first step to solve this problem, we propose a tool to automatically analyze the posts in forums of communities of UOC students and teachers, with a view to systematically mining the opinions they contain. This article defines the architecture of such a tool and explains how lexical-semantic and language technology resources can be used to that end. For pilot testing purposes, the tool has been used to identify students' opinions on the UOC's Business Intelligence master's degree course during the last two years. The paper discusses the results of that test. The contribution of this paper is twofold. Firstly, it demonstrates the feasibility of using natural language parsing techniques to help teachers make decisions. Secondly, it introduces a simple tool that can be refined and adapted to a virtual environment for the purpose in question. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-0-7695-4992-7 | Medium | ||
Area | Expedition | Conference | ALICE | ||
Notes | OR;MV | Approved | no | ||
Call Number | GCV2013 | Serial | 2268 | ||
Permanent link to this record | |||||
Author | Shida Beigpour | ||||
Title | Illumination and object reflectance modeling | Type | Book Whole | ||
Year | 2013 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | More realistic and accurate models of scene illumination and object reflectance can greatly improve the quality of many computer vision and computer graphics tasks. Using such models, a more profound knowledge of the interaction of light with object surfaces can be established, which proves crucial to a variety of computer vision applications. In the current work, we investigate the various existing approaches to illumination and reflectance modeling and analyze their shortcomings in capturing the complexity of real-world scenes. Based on this analysis we propose improvements to different aspects of reflectance and illumination estimation in order to more realistically model real-world scenes in the presence of complex lighting phenomena (i.e., multiple illuminants, interreflections and shadows). Moreover, we captured our own multi-illuminant dataset, which consists of complex scenes and illumination conditions both outdoors and in laboratory conditions. In addition we investigate the use of synthetic data to facilitate the construction of datasets and improve the process of obtaining ground-truth information. | ||||
Address | Barcelona | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Joost Van de Weijer;Ernest Valveny | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | CIC | Approved | no | ||
Call Number | Admin @ si @ Bei2013 | Serial | 2267 | ||
Permanent link to this record | |||||
Author | Abel Gonzalez-Garcia; Robert Benavente; Olivier Penacchio; Javier Vazquez; Maria Vanrell; C. Alejandro Parraga | ||||
Title | Coloresia: An Interactive Colour Perception Device for the Visually Impaired | Type | Book Chapter | ||
Year | 2013 | Publication | Multimodal Interaction in Image and Video Applications | Abbreviated Journal | |
Volume | 48 | Issue | Pages | 47-66 | |
Keywords | |||||
Abstract | A significant percentage of the human population suffers from impairments in their capacity to distinguish or even see colours. For them, everyday tasks like navigating through a train or metro network map become demanding. We present a novel technique for extracting colour information from everyday natural stimuli and presenting it to visually impaired users as pleasant, non-invasive sound. This technique was implemented inside a Personal Digital Assistant (PDA) portable device. In this implementation, colour information is extracted from the input image and categorised according to how human observers segment the colour space. This information is subsequently converted into sound and sent to the user via speakers or headphones. In the original implementation, it is possible for the user to send feedback to reconfigure the system; however, several features such as these were not implemented because the current technology is limited. We are confident that the full implementation will be possible in the near future as PDA technology improves. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1868-4394 | ISBN | 978-3-642-35931-6 | Medium | |
Area | Expedition | Conference | |||
Notes | CIC; 600.052; 605.203 | Approved | no | ||
Call Number | Admin @ si @ GBP2013 | Serial | 2266 | ||
Permanent link to this record | |||||
Author | Rahat Khan; Joost Van de Weijer; Dimosthenis Karatzas; Damien Muselet | ||||
Title | Towards multispectral data acquisition with hand-held devices | Type | Conference Article | ||
Year | 2013 | Publication | 20th IEEE International Conference on Image Processing | Abbreviated Journal | |
Volume | Issue | Pages | 2053 - 2057 | ||
Keywords | Multispectral; mobile devices; color measurements | ||||
Abstract | We propose a method to acquire multispectral data with handheld devices with front-mounted RGB cameras. We propose to use the display of the device as an illuminant while the camera captures images illuminated by the red, green and blue primaries of the display. Three illuminants and three response functions of the camera lead to nine response values which are used for reflectance estimation. Results are promising and show that the accuracy of the spectral reconstruction improves by 30-40% over the spectral reconstruction based on a single illuminant. Furthermore, we propose to compute a sensor-illuminant-aware linear basis by discarding the part of the reflectances that falls in the sensor-illuminant null-space. We show experimentally that optimizing reflectance estimation on these new basis functions decreases the RMSE significantly over basis functions that are independent of the sensor-illuminant. We conclude that multispectral data acquisition is potentially possible with consumer hand-held devices such as tablets, mobiles, and laptops, opening up applications which are currently considered to be unrealistic. | ||||
Address | Melbourne; Australia; September 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICIP | ||
Notes | CIC; DAG; 600.048 | Approved | no | ||
Call Number | Admin @ si @ KWK2013b | Serial | 2265 | ||
Permanent link to this record | |||||
Author | Shida Beigpour; Marc Serra; Joost Van de Weijer; Robert Benavente; Maria Vanrell; Olivier Penacchio; Dimitris Samaras | ||||
Title | Intrinsic Image Evaluation On Synthetic Complex Scenes | Type | Conference Article | ||
Year | 2013 | Publication | 20th IEEE International Conference on Image Processing | Abbreviated Journal | |
Volume | Issue | Pages | 285 - 289 | ||
Keywords | |||||
Abstract | Scene decomposition into its illuminant, shading, and reflectance intrinsic images is an essential step for scene understanding. Collecting intrinsic image ground-truth data is a laborious task. The assumptions on which the ground-truth procedures are based limit their application to simple scenes with a single object taken in the absence of indirect lighting and interreflections. We investigate synthetic data for intrinsic image research since the extraction of ground truth is straightforward, and it allows for scenes in more realistic situations (e.g., multiple illuminants and interreflections). With this dataset we aim to motivate researchers to further explore intrinsic image decomposition in complex scenes. | ||||
Address | Melbourne; Australia; September 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICIP | ||
Notes | CIC; 600.048; 600.052; 600.051 | Approved | no | ||
Call Number | Admin @ si @ BSW2013 | Serial | 2264 | ||
Permanent link to this record | |||||
Author | Fahad Shahbaz Khan; Joost Van de Weijer; Sadiq Ali; Michael Felsberg | ||||
Title | Evaluating the impact of color on texture recognition | Type | Conference Article | ||
Year | 2013 | Publication | 15th International Conference on Computer Analysis of Images and Patterns | Abbreviated Journal | |
Volume | 8047 | Issue | Pages | 154-162 | |
Keywords | Color; Texture; image representation | ||||
Abstract | State-of-the-art texture descriptors typically operate on grey scale images while ignoring color information. A common way to obtain a joint color-texture representation is to combine the two visual cues at the pixel level. However, such an approach provides sub-optimal results for the texture categorisation task. In this paper we investigate how to optimally exploit color information for texture recognition. We evaluate a variety of color descriptors, popular in image classification, for texture categorisation. In addition we analyze different fusion approaches to combine color and texture cues. Experiments are conducted on the challenging scenes and 10-class texture datasets. Our experiments clearly suggest that in all cases color names provide the best performance. Late fusion is the best strategy to combine color and texture. Selecting the best color descriptor with the optimal fusion strategy provides a gain of 5% to 8% compared to texture alone on the scenes and texture datasets. | ||||
Address | York; UK; August 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0302-9743 | ISBN | 978-3-642-40260-9 | Medium | |
Area | Expedition | Conference | CAIP | ||
Notes | CIC; 600.048 | Approved | no | ||
Call Number | Admin @ si @ KWA2013 | Serial | 2263 | ||
Permanent link to this record | |||||
Author | Rahat Khan; Joost Van de Weijer; Fahad Shahbaz Khan; Damien Muselet; Christophe Ducottet; Cecile Barat | ||||
Title | Discriminative Color Descriptors | Type | Conference Article | ||
Year | 2013 | Publication | IEEE Conference on Computer Vision and Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 2866 - 2873 | ||
Keywords | |||||
Abstract | Color description is a challenging task because of large variations in RGB values which occur due to scene-accidental events, such as shadows, shading, specularities, illuminant color changes, and changes in viewing geometry. Traditionally, this challenge has been addressed by capturing the variations in physics-based models, and deriving invariants for the undesired variations. The drawback of this approach is that sets of distinguishable colors in the original color space are mapped to the same value in the photometric invariant space. This results in a drop in the discriminative power of the color description. In this paper we take an information-theoretic approach to color description. We cluster color values together based on their discriminative power in a classification problem. The clustering has the explicit objective of minimizing the drop in mutual information of the final representation. We show that such a color description automatically learns a certain degree of photometric invariance. We also show that a universal color representation, which is based on data sets other than the one at hand, can obtain competitive performance. Experiments show that the proposed descriptor outperforms existing photometric invariants. Furthermore, we show that combined with shape description these color descriptors obtain excellent results on four challenging datasets, namely, PASCAL VOC 2007, Flowers-102, Stanford dogs-120 and Birds-200. | ||||
Address | Portland; Oregon; June 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1063-6919 | ISBN | Medium | ||
Area | Expedition | Conference | CVPR | ||
Notes | CIC; 600.048 | Approved | no | ||
Call Number | Admin @ si @ KWK2013a | Serial | 2262 | ||
Permanent link to this record | |||||
Author | Christophe Rigaud; Dimosthenis Karatzas; Joost Van de Weijer; Jean-Christophe Burie; Jean-Marc Ogier | ||||
Title | Automatic text localisation in scanned comic books | Type | Conference Article | ||
Year | 2013 | Publication | Proceedings of the International Conference on Computer Vision Theory and Applications | Abbreviated Journal | |
Volume | Issue | Pages | 814-819 | ||
Keywords | Text localization; comics; text/graphic separation; complex background; unstructured document | ||||
Abstract | Comic books constitute an important cultural heritage asset in many countries. Digitization combined with subsequent document understanding enables direct content-based search as opposed to metadata-only search (e.g. album title or author name). Few studies have been done in this direction. In this work we detail a novel approach for automatic text localization in scanned comic book pages, an essential step towards fully automatic comic book understanding. We focus on speech text as it is semantically important and represents the majority of the text present in comics. The approach is compared with existing methods of text localization found in the literature and results are presented. | ||||
Address | Barcelona; February 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | VISAPP | ||
Notes | DAG; CIC; 600.056 | Approved | no | ||
Call Number | Admin @ si @ RKW2013b | Serial | 2261 | ||
Permanent link to this record | |||||
Author | Christophe Rigaud; Dimosthenis Karatzas; Joost Van de Weijer; Jean-Christophe Burie; Jean-Marc Ogier | ||||
Title | An active contour model for speech balloon detection in comics | Type | Conference Article | ||
Year | 2013 | Publication | 12th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 1240-1244 | ||
Keywords | |||||
Abstract | Comic books constitute an important cultural heritage asset in many countries. Digitization combined with subsequent comic book understanding would enable a variety of new applications, including content-based retrieval and content retargeting. Document understanding in this domain is challenging as comics are semi-structured documents, combining semantically important graphical and textual parts. Few studies have been done in this direction. In this work we detail a novel approach for closed and non-closed speech balloon localization in scanned comic book pages, an essential step towards fully automatic comic book understanding. The approach is compared with existing methods for closed balloon localization found in the literature and results are presented. | ||||
Address | Washington; USA; August 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1520-5363 | ISBN | Medium | ||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; CIC; 600.056 | Approved | no | ||
Call Number | Admin @ si @ RKW2013a | Serial | 2260 | ||
Permanent link to this record |