Records | |||||
---|---|---|---|---|---|
Author | Aura Hernandez-Sabate; Lluis Albarracin; Daniel Calvo; Nuria Gorgorio | ||||
Title | EyeMath: Identifying Mathematics Problem Solving Processes in a RTS Video Game | Type | Conference Article | ||
Year | 2016 | Publication | 5th International Conference Games and Learning Alliance | Abbreviated Journal | |
Volume | 10056 | Issue | Pages | 50-59 | |
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | GALA | ||
Notes | ADAS;IAM; | Approved | no | ||
Call Number | HAC2016 | Serial | 2864 | ||
Permanent link to this record | |||||
Author | Saad Minhas; Aura Hernandez-Sabate; Shoaib Ehsan; Katerine Diaz; Ales Leonardis; Antonio Lopez; Klaus McDonald Maier | ||||
Title | LEE: A photorealistic Virtual Environment for Assessing Driver-Vehicle Interactions in Self-Driving Mode | Type | Conference Article | ||
Year | 2016 | Publication | 14th European Conference on Computer Vision Workshops | Abbreviated Journal | |
Volume | 9915 | Issue | Pages | 894-900 | |
Keywords | Simulation environment; Automated Driving; Driver-Vehicle interaction | ||||
Abstract | Photorealistic virtual environments are crucial for developing and testing automated driving systems in a safe way during trials. As commercially available simulators are expensive and bulky, this paper presents a low-cost, extendable, and easy-to-use (LEE) virtual environment with the aim to highlight its utility for level 3 driving automation. In particular, an experiment is performed using the presented simulator to explore the influence of different variables regarding control transfer of the car after the system was driving autonomously in a highway scenario. The results show that the speed of the car at the time when the system needs to transfer the control to the human driver is critical. | ||||
Address | Amsterdam; The Netherlands; October 2016 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ECCVW | ||
Notes | ADAS;IAM; 600.085; 600.076 | Approved | no | ||
Call Number | MHE2016 | Serial | 2865 | ||
Permanent link to this record | |||||
Author | Arash Akbarinia; C. Alejandro Parraga | ||||
Title | Biologically plausible boundary detection | Type | Conference Article | ||
Year | 2016 | Publication | 27th British Machine Vision Conference | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Edges are key components of any visual scene to the extent that we can recognise objects merely by their silhouettes. The human visual system captures edge information through neurons in the visual cortex that are sensitive to both intensity discontinuities and particular orientations. The “classical approach” assumes that these cells are only responsive to the stimulus present within their receptive fields; however, recent studies demonstrate that surrounding regions and inter-areal feedback connections influence their responses significantly. In this work we propose a biologically-inspired edge detection model in which orientation selective neurons are represented through the first derivative of a Gaussian function resembling double-opponent cells in the primary visual cortex (V1). In our model we account for four kinds of surround, i.e. full, far, iso- and orthogonal-orientation, whose contributions are contrast-dependent. The output signal from V1 is pooled in its perpendicular direction by larger V2 neurons employing a contrast-variant centre-surround kernel. We further introduce a feedback connection from higher-level visual areas to the lower ones. The results of our model on two benchmark datasets show a big improvement compared to the current non-learning and biologically-inspired state-of-the-art algorithms while being competitive with the learning-based methods. | ||||
Address | York; UK; September 2016 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | BMVC | ||
Notes | NEUROBIT; 600.068; 600.072 | Approved | no | ||
Call Number | Admin @ si @ AkP2016a | Serial | 2867 | ||
Permanent link to this record | |||||
Author | Azadeh S. Mozafari; David Vazquez; Mansour Jamzad; Antonio Lopez | ||||
Title | Node-Adapt, Path-Adapt and Tree-Adapt: Model-Transfer Domain Adaptation for Random Forest | Type | Miscellaneous | ||
Year | 2016 | Publication | Arxiv | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | Domain Adaptation; Pedestrian detection; Random Forest | ||||
Abstract | Random Forest (RF) is a successful paradigm for learning classifiers due to its ability to learn from large feature spaces and seamlessly integrate multi-class classification, as well as the achieved accuracy and processing efficiency. However, like many other classifiers, RF requires domain adaptation (DA) when there is a mismatch between the training (source) and testing (target) domains which provokes classification degradation. Consequently, different RF-DA methods have been proposed, which require not only target-domain samples but also revisiting the source-domain ones. As a novelty, we propose three inherently different methods (Node-Adapt, Path-Adapt and Tree-Adapt) that only require the learned source-domain RF and relatively few target-domain samples for DA, i.e. source-domain samples do not need to be available. To assess the performance of our proposals we focus on image-based object detection, using the pedestrian detection problem as a challenging proof of concept. Moreover, we use the RF with expert nodes because it is a competitive patch-based pedestrian model. We test our Node-, Path- and Tree-Adapt methods in standard benchmarks, showing that DA is largely achieved. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS | Approved | no | ||
Call Number | ADAS @ adas @ MVJ2016 | Serial | 2868 | ||
Permanent link to this record | |||||
Author | Youssef El Rhabi; Simon Loic; Brun Luc; Josep Llados; Felipe Lumbreras | ||||
Title | Information Theoretic Rotationwise Robust Binary Descriptor Learning | Type | Conference Article | ||
Year | 2016 | Publication | Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR) | Abbreviated Journal | |
Volume | Issue | Pages | 368-378 | ||
Keywords | |||||
Abstract | In this paper, we propose a new data-driven approach for binary descriptor selection. In order to draw a clear analysis of common designs, we present a general information-theoretic selection paradigm. It encompasses several standard binary descriptor construction schemes, including a recent state-of-the-art one named BOLD. We pursue the same endeavor to increase the stability of the produced descriptors with respect to rotations. To achieve this goal, we have designed a novel offline selection criterion which is better adapted to the online matching procedure. The effectiveness of our approach is demonstrated on two standard datasets, where our descriptor is compared to BOLD and to several classical descriptors. In particular, it emerges that our approach can match, if not exceed, the performance of BOLD while relying on descriptors half as long. Such an improvement can be influential for real-time applications. | ||||
Address | Mérida; Mexico; November 2016 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | S+SSPR | ||
Notes | DAG; ADAS; 600.097; 600.086 | Approved | no | ||
Call Number | Admin @ si @ RLL2016 | Serial | 2871 | ||
Permanent link to this record | |||||
Author | Anjan Dutta; Umapada Pal; Josep Llados | ||||
Title | Compact Correlated Features for Writer Independent Signature Verification | Type | Conference Article | ||
Year | 2016 | Publication | 23rd International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | This paper considers the offline signature verification problem, which is considered to be an important research line in the field of pattern recognition. In this work we propose hybrid features that consider the local features and their global statistics in the signature image. This has been done by creating a vocabulary of histograms of oriented gradients (HOGs). We impose weights on these local features based on the height information of water reservoirs obtained from the signature. Spatial information between local features is thought to play a vital role in considering the geometry of the signatures, which distinguishes the originals from the forged ones. Nevertheless, learning a condensed set of higher order neighbouring features based on visual words, e.g., doublets and triplets, continues to be a challenging problem as possible combinations of visual words grow exponentially. To avoid this explosion of size, we create a code of local pairwise features which are represented as joint descriptors. Local features are paired based on the edges of a graph representation built upon the Delaunay triangulation. We reveal the advantage of combining both types of visual codebooks (order one and pairwise) for the signature verification task. This is validated through an encouraging result on two benchmark datasets, viz. CEDAR and GPDS300. | ||||
Address | Cancun; Mexico; December 2016 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICPR | ||
Notes | DAG; 600.097 | Approved | no | ||
Call Number | Admin @ si @ DPL2016 | Serial | 2875 | ||
Permanent link to this record | |||||
Author | Sounak Dey; Anguelos Nicolaou; Josep Llados; Umapada Pal | ||||
Title | Local Binary Pattern for Word Spotting in Handwritten Historical Document | Type | Conference Article | ||
Year | 2016 | Publication | Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR) | Abbreviated Journal | |
Volume | Issue | Pages | 574-583 | ||
Keywords | Local binary patterns; Spatial sampling; Learning-free; Word spotting; Handwritten; Historical document analysis; Large-scale data | ||||
Abstract | Digital libraries store images which can be highly degraded, and to index this kind of image we resort to word spotting as our information retrieval system. Information retrieval for handwritten document images is more challenging due to the difficulties in complex layout analysis, large variations of writing styles, and degradation or low quality of historical manuscripts. This paper presents a simple innovative learning-free method for word spotting from large scale historical documents combining Local Binary Pattern (LBP) and spatial sampling. This method offers three advantages: firstly, it operates in a completely learning-free paradigm, which is very different from unsupervised learning methods; secondly, the computational time is significantly low because of the LBP features, which are very fast to compute; and thirdly, the method can be used in scenarios where annotations are not available. Finally, we compare the results of our proposed retrieval method with other methods in the literature and we obtain the best results in the learning-free paradigm. | ||||
Address | Merida; Mexico; December 2016 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | S+SSPR | ||
Notes | DAG; 600.097; 602.006; 603.053 | Approved | no | ||
Call Number | Admin @ si @ DNL2016 | Serial | 2876 | ||
Permanent link to this record | |||||
Author | Juan Ignacio Toledo; Sebastian Sudholt; Alicia Fornes; Jordi Cucurull; A. Fink; Josep Llados | ||||
Title | Handwritten Word Image Categorization with Convolutional Neural Networks and Spatial Pyramid Pooling | Type | Conference Article | ||
Year | 2016 | Publication | Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR) | Abbreviated Journal | |
Volume | 10029 | Issue | Pages | 543-552 | |
Keywords | Document image analysis; Word image categorization; Convolutional neural networks; Named entity detection | ||||
Abstract | The extraction of relevant information from historical document collections is one of the key steps in order to make these documents available for access and searches. The usual approach combines transcription and grammars in order to extract semantically meaningful entities. In this paper, we describe a new method to obtain word categories directly from non-preprocessed handwritten word images. The method can be used to directly extract information, being an alternative to the transcription. Thus it can be used as a first step in any kind of syntactical analysis. The approach is based on Convolutional Neural Networks with a Spatial Pyramid Pooling layer to deal with the different shapes of the input images. We performed the experiments on a historical marriage record dataset, obtaining promising results. | ||||
Address | Merida; Mexico; December 2016 | ||||
Corporate Author | Thesis | ||||
Publisher | Springer International Publishing | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-3-319-49054-0 | Medium | ||
Area | Expedition | Conference | S+SSPR | ||
Notes | DAG; 600.097; 602.006 | Approved | no | ||
Call Number | Admin @ si @ TSF2016 | Serial | 2877 | ||
Permanent link to this record | |||||
Author | Antoni Gurgui; Debora Gil; Enric Marti; Vicente Grau | ||||
Title | Left-Ventricle Basal Region Constrained Parametric Mapping to Unitary Domain | Type | Conference Article | ||
Year | 2016 | Publication | 7th International Workshop on Statistical Atlases & Computational Modelling of the Heart | Abbreviated Journal | |
Volume | 10124 | Issue | Pages | 163-171 | |
Keywords | Laplacian; Constrained maps; Parameterization; Basal ring | ||||
Abstract | Due to its complex geometry, the basal ring is often omitted when putting different heart geometries into correspondence. In this paper, we present the first results on a new mapping of the left ventricle basal rings onto a normalized coordinate system using a fold-over free approach to the solution to the Laplacian. To guarantee correspondences between different basal rings, we imposed some internal constrained positions at anatomical landmarks in the normalized coordinate system. To prevent internal fold-overs, constraints are handled by cutting the volume into regions defined by anatomical features and mapping each piece of the volume separately. Initial results presented in this paper indicate that our method is able to handle internal constraints without introducing fold-overs and thus guarantees one-to-one mappings between different basal ring geometries. | ||||
Address | Athens; October 2016 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | STACOM | ||
Notes | IAM; | Approved | no | ||
Call Number | Admin @ si @ GGM2016 | Serial | 2884 | ||
Permanent link to this record | |||||
Author | Carles Sanchez; Debora Gil; Jorge Bernal; F. Javier Sanchez; Marta Diez-Ferrer; Antoni Rosell | ||||
Title | Navigation Path Retrieval from Videobronchoscopy using Bronchial Branches | Type | Conference Article | ||
Year | 2016 | Publication | 19th International Conference on Medical Image Computing and Computer Assisted Intervention Workshops | Abbreviated Journal | |
Volume | 9401 | Issue | Pages | 62-70 | |
Keywords | Bronchoscopy navigation; Lumen center; Bronchial branches; Navigation path; Videobronchoscopy | ||||
Abstract | Bronchoscopy biopsy can be used to diagnose lung cancer without risking complications of other interventions like transthoracic needle aspiration. During bronchoscopy, the clinician has to navigate through the bronchial tree to the target lesion. A main drawback is the difficulty of checking whether the exploration is following the correct path. The usual guidance using fluoroscopy implies repeated radiation of the clinician, while alternative systems (like electromagnetic navigation) require specific equipment that increases intervention costs. We propose to compute the navigated path using anatomical landmarks extracted from the sole analysis of videobronchoscopy images. Such landmarks allow matching the current exploration to the path previously planned on a CT to indicate to the clinician whether the planning is being correctly followed or not. We present a feasibility study of our landmark based CT-video matching using bronchoscopic videos simulated on a virtual bronchoscopy interactive interface. | ||||
Address | Quebec; Canada; September 2016 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | MICCAIW | ||
Notes | IAM; MV; 600.060; 600.075 | Approved | no | ||
Call Number | Admin @ si @ SGB2016 | Serial | 2885 | ||
Permanent link to this record | |||||
Author | Juan A. Carvajal Ayala; Dennis Romero; Angel Sappa | ||||
Title | Fine-tuning based deep convolutional networks for lepidopterous genus recognition | Type | Conference Article | ||
Year | 2016 | Publication | 21st Ibero American Congress on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 467-475 | ||
Keywords | |||||
Abstract | This paper describes an image classification approach oriented to identify specimens of lepidopterous insects at Ecuadorian ecological reserves. This work seeks to contribute to studies in the area of biology about genera of butterflies and also to facilitate the registration of unrecognized specimens. The proposed approach is based on the fine-tuning of three widely used pre-trained Convolutional Neural Networks (CNNs). This strategy is intended to overcome the reduced number of labeled images. Experimental results with a dataset labeled by expert biologists are presented, reaching a recognition accuracy above 92%. | ||||
Address | Lima; Perú; November 2016 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CIARP | ||
Notes | ADAS; 600.086 | Approved | no | ||
Call Number | Admin @ si @ CRS2016 | Serial | 2913 | ||
Permanent link to this record | |||||
Author | H. Martin Kjer; Jens Fagertun; Sergio Vera; Debora Gil; Miguel Angel Gonzalez Ballester; Rasmus R. Paulsen | ||||
Title | Free-form image registration of human cochlear uCT data using skeleton similarity as anatomical prior | Type | Journal Article | ||
Year | 2016 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 76 | Issue | 1 | Pages | 76-82 |
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | IAM; 600.060 | Approved | no | ||
Call Number | Admin @ si @ MFV2017b | Serial | 2941 | ||
Permanent link to this record | |||||
Author | Albert Berenguel; Oriol Ramos Terrades; Josep Llados; Cristina Cañero | ||||
Title | Banknote counterfeit detection through background texture printing analysis | Type | Conference Article | ||
Year | 2016 | Publication | 12th IAPR Workshop on Document Analysis Systems | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | This paper is focused on the detection of counterfeit photocopy banknotes. The main difficulty is to work in a real industrial scenario without any constraint on the acquisition device and with a single image. The main contributions of this paper are twofold: first, the adaptation and performance evaluation of existing approaches to classify the genuine and photocopy banknotes using background texture printing analysis, which have not been applied in this context before. Second, a new dataset of Euro banknote images acquired with several cameras under different luminance conditions to evaluate these methods. Experiments on the proposed algorithms show that mixing SIFT features and sparse coding dictionaries achieves quasi-perfect classification using a linear SVM with the created dataset. Approaches using dictionaries to cover all possible texture variations have proven to be robust and outperform the state-of-the-art methods using the proposed benchmark. | ||||
Address | Rumania; May 2016 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | DAS | ||
Notes | DAG; 600.061; 601.269; 600.097 | Approved | no | ||
Call Number | Admin @ si @ BRL2016 | Serial | 2950 | ||
Permanent link to this record | |||||
Author | Marc Sunset Perez; Marc Comino Trinidad; Dimosthenis Karatzas; Antonio Chica Calaf; Pere Pau Vazquez Alcocer | ||||
Title | Development of general‐purpose projection‐based augmented reality systems | Type | Journal | ||
Year | 2016 | Publication | IADIs international journal on computer science and information systems | Abbreviated Journal | IADIs |
Volume | 11 | Issue | 2 | Pages | 1-18 |
Keywords | |||||
Abstract | Despite the large number of methods and applications of augmented reality, there is little homogenization of the software platforms that support them. An exception may be the low level control software that is provided by some high profile vendors such as Qualcomm and Metaio. However, these provide fine grain modules for e.g. element tracking. We are more concerned with the application framework, which includes the control of the devices working together for the development of the AR experience. In this paper we describe the development of a software framework for AR setups. We concentrate on the modular design of the framework, but also on some hard problems such as the calibration stage, crucial for projection-based AR. The developed framework is suitable and has been tested in AR applications using camera-projector pairs, for both fixed and nomadic setups. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.084 | Approved | no | ||
Call Number | Admin @ si @ SCK2016 | Serial | 2890 | ||
Permanent link to this record | |||||
Author | Lluis Gomez | ||||
Title | Exploiting Similarity Hierarchies for Multi-script Scene Text Understanding | Type | Book Whole | ||
Year | 2016 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | This thesis addresses the problem of automatic scene text understanding in unconstrained conditions. In particular, we tackle the tasks of multi-language and arbitrary-oriented text detection, tracking, and script identification in natural scenes.
For this we have developed a set of generic methods that build on top of the basic observation that text always has certain key visual and structural characteristics that are independent of the language or script in which it is written. Text instances in any language or script are always formed as groups of similar atomic parts, be they individual characters, small stroke parts, or even whole words in the case of cursive text. This holistic (sum-of-parts) and recursive perspective has led us to explore different variants of the “segmentation and grouping” paradigm of computer vision. Scene text detection methodologies are usually based on the classification of individual regions or patches, using a priori knowledge for a given script or language. Human perception of text, on the other hand, is based on perceptual organization through which text emerges as a perceptually significant group of atomic objects. In this thesis, we argue that the text detection problem must be posed as the detection of meaningful groups of regions. We address the problem of text detection in natural scenes from a hierarchical perspective, making explicit use of the recursive nature of text, aiming directly at the detection of region groupings corresponding to text within a hierarchy produced by an agglomerative similarity clustering process over individual regions. We propose an optimal way to construct such a hierarchy, introducing a feature space designed to produce text group hypotheses with high recall and a novel stopping rule combining a discriminative classifier and a probabilistic measure of group meaningfulness based on perceptual organization. Within this generic framework, we design a text-specific object proposals algorithm that, contrary to existing generic object proposals methods, aims directly at the detection of text region groupings.
For this, we abandon the rigid definition of “what is text” of traditional specialized text detectors, and move towards the fuzzier perspective of grouping-based object proposals methods. Then, we present a hybrid algorithm for detection and tracking of scene text where the notion of region groupings also plays a central role. By leveraging the structural arrangement of text group components between consecutive frames we can improve the overall tracking performance of the system. Finally, since our generic detection framework is inherently designed for multi-language environments, we focus on the problem of script identification in order to build a multi-language end-to-end reading system. Facing this problem with state-of-the-art CNN classifiers is not straightforward, as they fail to address a key characteristic of scene text instances: their extremely variable aspect ratio. Instead of resizing input images to a fixed size as in the typical use of holistic CNN classifiers, we propose a patch-based classification framework in order to preserve discriminative parts of the image that are characteristic of its class. We describe a novel method based on the use of ensembles of conjoined networks to jointly learn discriminative stroke-parts representations and their relative importance in a patch-based classification scheme. |
||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Place of Publication | Editor | Dimosthenis Karatzas | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ Gom2016 | Serial | 2891 | ||
Permanent link to this record |