Home | [71–80] << 81 82 83 84 85 86 87 88 89 90 >> [91–100] |
Records | |||||
---|---|---|---|---|---|
Author | Jiaolong Xu; David Vazquez; Antonio Lopez; Javier Marin; Daniel Ponsa | ||||
Title | Learning a Part-based Pedestrian Detector in Virtual World | Type | Journal Article | ||
Year | 2014 | Publication | IEEE Transactions on Intelligent Transportation Systems | Abbreviated Journal | TITS |
Volume | 15 | Issue | 5 | Pages | 2121-2131 |
Keywords | Domain Adaptation; Pedestrian Detection; Virtual Worlds | ||||
Abstract | Detecting pedestrians with on-board vision systems is of paramount interest for assisting drivers to prevent vehicle-to-pedestrian accidents. The core of a pedestrian detector is its classification module, which aims at deciding if a given image window contains a pedestrian. Given the difficulty of this task, many classifiers have been proposed during the last fifteen years. Among them, the so-called (deformable) part-based classifiers including multi-view modeling are usually top ranked in accuracy. Training such classifiers is not trivial since a proper aspect clustering and spatial part alignment of the pedestrian training samples are crucial for obtaining an accurate classifier. In this paper, first we perform automatic aspect clustering and part alignment by using virtual-world pedestrians, i.e., human annotations are not required. Second, we use a mixture-of-parts approach that allows part sharing among different aspects. Third, these proposals are integrated in a learning framework which also allows to incorporate real-world training data to perform domain adaptation between virtual- and real-world cameras. Overall, the obtained results on four popular on-board datasets show that our proposal clearly outperforms the state-of-the-art deformable part-based detector known as latent SVM. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1931-0587 | ISBN | 978-1-4673-2754-1 | Medium | |
Area | Expedition | Conference | |||
Notes | ADAS; 600.076 | Approved | no | ||
Call Number | ADAS @ adas @ XVL2014 | Serial | 2433 | ||
Permanent link to this record | |||||
Author | Onur Ferhat; Fernando Vilariño; F. Javier Sanchez | ||||
Title | A cheap portable eye-tracker solution for common setups. | Type | Journal Article | ||
Year | 2014 | Publication | Journal of Eye Movement Research | Abbreviated Journal | JEMR |
Volume | 7 | Issue | 3 | Pages | 1-10 |
Keywords | |||||
Abstract | We analyze the feasibility of a cheap eye-tracker where the hardware consists of a single webcam and a Raspberry Pi device. Our aim is to discover the limits of such a system and to see whether it provides an acceptable performance. We base our work on the open source Opengazer (Zielinski, 2013) and we propose several improvements to create a robust, real-time system which can work on a computer with 30Hz sampling rate. After assessing the accuracy of our eye-tracker in elaborated experiments involving 12 subjects under 4 different system setups, we install it on a Raspberry Pi to create a portable stand-alone eye-tracker which achieves 1.42° horizontal accuracy with 3Hz refresh rate for a building cost of 70 Euros. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ;SIAI | Approved | no | ||
Call Number | Admin @ si @ FVS2014 | Serial | 2435 | ||
Permanent link to this record | |||||
Author | J.S. Cope; P.Remagnino; S.Mannan; Katerine Diaz; Francesc J. Ferri; P.Wilkin | ||||
Title | Reverse Engineering Expert Visual Observations: From Fixations To The Learning Of Spatial Filters With A Neural-Gas Algorithm | Type | Journal Article | ||
Year | 2013 | Publication | Expert Systems with Applications | Abbreviated Journal | EXWA |
Volume | 40 | Issue | 17 | Pages | 6707-6712 |
Keywords | Neural gas; Expert vision; Eye-tracking; Fixations | ||||
Abstract | Human beings can become experts in performing specific vision tasks, for example, doctors analysing medical images, or botanists studying leaves. With sufficient knowledge and experience, people can become very efficient at such tasks. When attempting to perform these tasks with a machine vision system, it would be highly beneficial to be able to replicate the process which the expert undergoes. Advances in eye-tracking technology can provide data to allow us to discover the manner in which an expert studies an image. This paper presents a first step towards utilizing these data for computer vision purposes. A growing-neural-gas algorithm is used to learn a set of Gabor filters which give high responses to image regions which a human expert fixated on. These filters can then be used to identify regions in other images which are likely to be useful for a given vision task. The algorithm is evaluated by learning filters for locating specific areas of plant leaves. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0957-4174 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS | Approved | no | ||
Call Number | Admin @ si @ CRM2013 | Serial | 2438 | ||
Permanent link to this record | |||||
Author | Katerine Diaz; Francesc J. Ferri | ||||
Title | Extensiones del método de vectores comunes discriminantes Aplicadas a la clasificación de imágenes | Type | Book Whole | ||
Year | 2013 | Publication | Extensiones del método de vectores comunes discriminantes Aplicadas a la clasificación de imágenes | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Los métodos basados en subespacios son una herramienta muy utilizada en aplicaciones de visión por computador. Aquí se presentan y validan algunos algoritmos que hemos propuesto en este campo de investigación. El primer algoritmo está relacionado con una extensión del método de vectores comunes discriminantes con kernel, que reinterpreta el espacio nulo de la matriz de dispersión intra-clase del conjunto de entrenamiento para obtener las características discriminantes. Dentro de los métodos basados en subespacios existen diferentes tipos de entrenamiento. Uno de los más populares, pero no por ello uno de los más eficientes, es el aprendizaje por lotes. En este tipo de aprendizaje, todas las muestras del conjunto de entrenamiento tienen que estar disponibles desde el inicio. De este modo, cuando nuevas muestras se ponen a disposición del algoritmo, el sistema tiene que ser reentrenado de nuevo desde cero. Una alternativa a este tipo de entrenamiento es el aprendizaje incremental. Aquí se proponen diferentes algoritmos incrementales del método de vectores comunes discriminantes. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-3-639-55339-0 | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS | Approved | no | ||
Call Number | Admin @ si @ DiF2013 | Serial | 2440 | ||
Permanent link to this record | |||||
Author | Mohammad Ali Bagheri; Qigang Gao; Sergio Escalera | ||||
Title | Combining Local and Global Learners in the Pairwise Multiclass Classification | Type | Journal Article | ||
Year | 2015 | Publication | Pattern Analysis and Applications | Abbreviated Journal | PAA |
Volume | 18 | Issue | 4 | Pages | 845-860 |
Keywords | Multiclass classification; Pairwise approach; One-versus-one | ||||
Abstract | Pairwise classification is a well-known class binarization technique that converts a multiclass problem into a number of two-class problems, one problem for each pair of classes. However, in the pairwise technique, nuisance votes of many irrelevant classifiers may result in a wrong class prediction. To overcome this problem, a simple, but efficient method is proposed and evaluated in this paper. The proposed method is based on excluding some classes and focusing on the most probable classes in the neighborhood space, named Local Crossing Off (LCO). This procedure is performed by employing a modified version of standard K-nearest neighbor and large margin nearest neighbor algorithms. The LCO method takes advantage of nearest neighbor classification algorithm because of its local learning behavior as well as the global behavior of powerful binary classifiers to discriminate between two classes. Combining these two properties in the proposed LCO technique will avoid the weaknesses of each method and will increase the efficiency of the whole classification system. On several benchmark datasets of varying size and difficulty, we found that the LCO approach leads to significant improvements using different base learners. The experimental results show that the proposed technique not only achieves better classification accuracy in comparison to other standard approaches, but also is computationally more efficient for tackling classification problems which have a relatively large number of target classes. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer London | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1433-7541 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | HuPBA;MILAB | Approved | no | ||
Call Number | Admin @ si @ BGE2014 | Serial | 2441 | ||
Permanent link to this record | |||||
Author | Oscar Lopes; Miguel Reyes; Sergio Escalera; Jordi Gonzalez | ||||
Title | Spherical Blurred Shape Model for 3-D Object and Pose Recognition: Quantitative Analysis and HCI Applications in Smart Environments | Type | Journal Article | ||
Year | 2014 | Publication | IEEE Transactions on Systems, Man and Cybernetics (Part B) | Abbreviated Journal | TSMCB |
Volume | 44 | Issue | 12 | Pages | 2379-2390 |
Keywords | |||||
Abstract | The use of depth maps is of increasing interest after the advent of cheap multisensor devices based on structured light, such as Kinect. In this context, there is a strong need of powerful 3-D shape descriptors able to generate rich object representations. Although several 3-D descriptors have been already proposed in the literature, the research of discriminative and computationally efficient descriptors is still an open issue. In this paper, we propose a novel point cloud descriptor called spherical blurred shape model (SBSM) that successfully encodes the structure density and local variabilities of an object based on shape voxel distances and a neighborhood propagation strategy. The proposed SBSM is proven to be rotation and scale invariant, robust to noise and occlusions, highly discriminative for multiple categories of complex objects like the human hand, and computationally efficient since the SBSM complexity is linear to the number of object voxels. Experimental evaluation in public depth multiclass object data, 3-D facial expressions data, and a novel hand poses data sets show significant performance improvements in relation to state-of-the-art approaches. Moreover, the effectiveness of the proposal is also proved for object spotting in 3-D scenes and for real-time automatic hand pose recognition in human computer interaction scenarios. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 2168-2267 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | HuPBA; ISE; 600.078;MILAB | Approved | no | ||
Call Number | Admin @ si @ LRE2014 | Serial | 2442 | ||
Permanent link to this record | |||||
Author | Xavier Perez Sala; Sergio Escalera; Cecilio Angulo; Jordi Gonzalez | ||||
Title | A survey on model based approaches for 2D and 3D visual human pose recovery | Type | Journal Article | ||
Year | 2014 | Publication | Sensors | Abbreviated Journal | SENS |
Volume | 14 | Issue | 3 | Pages | 4189-4210 |
Keywords | human pose recovery; human body modelling; behavior analysis; computer vision | ||||
Abstract | Human Pose Recovery has been studied in the field of Computer Vision for the last 40 years. Several approaches have been reported, and significant improvements have been obtained in both data representation and model design. However, the problem of Human Pose Recovery in uncontrolled environments is far from being solved. In this paper, we define a general taxonomy to group model based approaches for Human Pose Recovery, which is composed of five main modules: appearance, viewpoint, spatial relations, temporal consistence, and behavior. Subsequently, a methodological comparison is performed following the proposed taxonomy, evaluating current SoA approaches in the aforementioned five group categories. As a result of this comparison, we discuss the main advantages and drawbacks of the reviewed literature. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA; ISE; 600.046; 600.063; 600.078;MILAB | Approved | no | ||
Call Number | Admin @ si @ PEA2014 | Serial | 2443 | ||
Permanent link to this record | |||||
Author | Frederic Sampedro; Anna Domenech; Sergio Escalera | ||||
Title | Obtaining quantitative global tumoral state indicators based on whole-body PET/CT scans: A breast cancer case study | Type | Journal Article | ||
Year | 2014 | Publication | Nuclear Medicine Communications | Abbreviated Journal | NMC |
Volume | 35 | Issue | 4 | Pages | 362-371 |
Keywords | |||||
Abstract | Objectives: In this work we address the need for the computation of quantitative global tumoral state indicators from oncological whole-body PET/computed tomography scans. The combination of such indicators with other oncological information such as tumor markers or biopsy results would prove useful in oncological decision-making scenarios.
Materials and methods: From an ordering of 100 breast cancer patients on the basis of oncological state through visual analysis by a consensus of nuclear medicine specialists, a set of numerical indicators computed from image analysis of the PET/computed tomography scan is presented, which attempts to summarize a patient’s oncological state in a quantitative manner taking into consideration the total tumor volume, aggressiveness, and spread. Results: Results obtained by comparative analysis of the proposed indicators with respect to the experts’ evaluation show up to 87% Pearson’s correlation coefficient when providing expert-guided PET metabolic tumor volume segmentation and 64% correlation when using completely automatic image analysis techniques. Conclusion: Global quantitative tumor information obtained by whole-body PET/CT image analysis can prove useful in clinical nuclear medicine settings and oncological decision-making scenarios. The completely automatic computation of such indicators would improve its impact as time efficiency and specialist independence would be achieved. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA;MILAB | Approved | no | ||
Call Number | SDE2014a | Serial | 2444 | ||
Permanent link to this record | |||||
Author | Naveen Onkarappa | ||||
Title | Optical Flow in Driver Assistance Systems | Type | Book Whole | ||
Year | 2013 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Motion perception is one of the most important attributes of the human brain. Visual motion perception consists in inferring speed and direction of elements in a scene based on visual inputs. Analogously, computer vision is assisted by motion cues in the scene. Motion detection in computer vision is useful in solving problems such as segmentation, depth from motion, structure from motion, compression, navigation and many others. These problems are common in several applications, for instance, video surveillance, robot navigation and advanced driver assistance systems (ADAS). One of the most widely used techniques for motion detection is the optical flow estimation. The work in this thesis attempts to make optical flow suitable for the requirements and conditions of driving scenarios. In this context, a novel space-variant representation called reverse log-polar representation is proposed that is shown to be better than the traditional log-polar space-variant representation for ADAS. The space-variant representations reduce the amount of data to be processed. Another major contribution in this research is related to the analysis of the influence of specific characteristics from driving scenarios on the optical flow accuracy. Characteristics such as vehicle speed and
road texture are considered in the aforementioned analysis. From this study, it is inferred that the regularization weight has to be adapted according to the required error measure and for different speeds and road textures. It is also shown that polar represented optical flow suits driving scenarios where predominant motion is translation. Due to the requirements of such a study and by the lack of needed datasets a new synthetic dataset is presented; it contains: i) sequences of different speeds and road textures in an urban scenario; ii) sequences with complex motion of an on-board camera; and iii) sequences with additional moving vehicles in the scene. The ground-truth optical flow is generated by the ray-tracing technique. Further, few applications of optical flow in ADAS are shown. Firstly, a robust RANSAC based technique to estimate horizon line is proposed. Then, an egomotion estimation is presented to compare the proposed space-variant representation with the classical one. As a final contribution, a modification in the regularization term is proposed that notably improves the results in the ADAS applications. This adaptation is evaluated using a state of the art optical flow technique. The experiments on a public dataset (KITTI) validate the advantages of using the proposed modification. |
||||
Address | Bellaterra | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Angel Sappa | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-940902-1-9 | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS | Approved | no | ||
Call Number | Admin @ si @ Nav2013 | Serial | 2447 | ||
Permanent link to this record | |||||
Author | Shida Beigpour; Christian Riess; Joost Van de Weijer; Elli Angelopoulou | ||||
Title | Multi-Illuminant Estimation with Conditional Random Fields | Type | Journal Article | ||
Year | 2014 | Publication | IEEE Transactions on Image Processing | Abbreviated Journal | TIP |
Volume | 23 | Issue | 1 | Pages | 83-95 |
Keywords | color constancy; CRF; multi-illuminant | ||||
Abstract | Most existing color constancy algorithms assume uniform illumination. However, in real-world scenes, this is not often the case. Thus, we propose a novel framework for estimating the colors of multiple illuminants and their spatial distribution in the scene. We formulate this problem as an energy minimization task within a conditional random field over a set of local illuminant estimates. In order to quantitatively evaluate the proposed method, we created a novel data set of two-dominant-illuminant images comprised of laboratory, indoor, and outdoor scenes. Unlike prior work, our database includes accurate pixel-wise ground truth illuminant information. The performance of our method is evaluated on multiple data sets. Experimental results show that our framework clearly outperforms single illuminant estimators as well as a recently proposed multi-illuminant estimation approach. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1057-7149 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | CIC; LAMP; 600.074; 600.079 | Approved | no | ||
Call Number | Admin @ si @ BRW2014 | Serial | 2451 | ||
Permanent link to this record | |||||
Author | David Masip; Michael S. North ; Alexander Todorov; Daniel N. Osherson | ||||
Title | Automated Prediction of Preferences Using Facial Expressions | Type | Journal Article | ||
Year | 2014 | Publication | PloS one | Abbreviated Journal | Plos |
Volume | 9 | Issue | 2 | Pages | e87434 |
Keywords | |||||
Abstract | We introduce a computer vision problem from social cognition, namely, the automated detection of attitudes from a person's spontaneous facial expressions. To illustrate the challenges, we introduce two simple algorithms designed to predict observers’ preferences between images (e.g., of celebrities) based on covert videos of the observers’ faces. The two algorithms are almost as accurate as human judges performing the same task but nonetheless far from perfect. Our approach is to locate facial landmarks, then predict preference on the basis of their temporal dynamics. The database contains 768 videos involving four different kinds of preferences. We make it publically available. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | OR;MV | Approved | no | ||
Call Number | Admin @ si @ MNT2014 | Serial | 2453 | ||
Permanent link to this record | |||||
Author | David Fernandez; Josep Llados; Alicia Fornes | ||||
Title | A graph-based approach for segmenting touching lines in historical handwritten documents | Type | Journal Article | ||
Year | 2014 | Publication | International Journal on Document Analysis and Recognition | Abbreviated Journal | IJDAR |
Volume | 17 | Issue | 3 | Pages | 293-312 |
Keywords | Text line segmentation; Handwritten documents; Document image processing; Historical document analysis | ||||
Abstract | Text line segmentation in handwritten documents is an important task in the recognition of historical documents. Handwritten document images contain text lines with multiple orientations, touching and overlapping characters between consecutive text lines and different document structures, making line segmentation a difficult task. In this paper, we present a new approach for handwritten text line segmentation solving the problems of touching components, curvilinear text lines and horizontally overlapping components. The proposed algorithm formulates line segmentation as finding the central path in the area between two consecutive lines. This is solved as a graph traversal problem. A graph is constructed using the skeleton of the image. Then, a path-finding algorithm is used to find the optimum path between text lines. The proposed algorithm has been evaluated on a comprehensive dataset consisting of five databases: ICDAR2009, ICDAR2013, UMD, the George Washington and the Barcelona Marriages Database. The proposed method outperforms the state-of-the-art considering the different types and difficulties of the benchmarking data. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1433-2833 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG; 600.056; 600.061; 602.006; 600.077 | Approved | no | ||
Call Number | Admin @ si @ FLF2014 | Serial | 2459 | ||
Permanent link to this record | |||||
Author | Monica Piñol | ||||
Title | Reinforcement Learning of Visual Descriptors for Object Recognition | Type | Book Whole | ||
Year | 2014 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | The human visual system is able to recognize the object in an image even if the object is partially occluded, from various points of view, in different colors, or with independence of the distance to the object. To do this, the eye obtains an image and extracts features that are sent to the brain, and then, in the brain the object is recognized. In computer vision, the object recognition branch tries to learns from the human visual system behaviour to achieve its goal. Hence, an algorithm is used to identify representative features of the scene (detection), then another algorithm is used to describe these points (descriptor) and finally the extracted information is used for classifying the object in the scene. The selection of this set of algorithms is a very complicated task and thus, a very active research field. In this thesis we are focused on the selection/learning of the best descriptor for a given image. In the state of the art there are several descriptors but we do not know how to choose the best descriptor because depends on scenes that we will use (dataset) and the algorithm chosen to do the classification. We propose a framework based on reinforcement learning and bag of features to choose the best descriptor according to the given image. The system can analyse the behaviour of different learning algorithms and descriptor sets. Furthermore the proposed framework for improving the classification/recognition ratio can be used with minor changes in other computer vision fields, such as video retrieval. | ||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Ricardo Toledo;Angel Sappa | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-940902-5-7 | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS; 600.076 | Approved | no | ||
Call Number | Admin @ si @ Piñ2014 | Serial | 2464 | ||
Permanent link to this record | |||||
Author | Anjan Dutta | ||||
Title | Inexact Subgraph Matching Applied to Symbol Spotting in Graphical Documents | Type | Book Whole | ||
Year | 2014 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | There is a resurgence in the use of structural approaches in the usual object recognition and retrieval problem. Graph theory, in particular, graph matching plays a relevant role in that. Specifically, the detection of an object (or a part of that) in an image in terms of structural features can be formulated as a subgraph matching. Subgraph matching is a challenging task. Specially due to the presence of outliers most of the graph matching algorithms do not perform well in subgraph matching scenario. Also exact subgraph isomorphism has proven to be an NP-complete problem. So naturally, in graph matching community, there are lot of efforts addressing the problem of subgraph matching within suboptimal bound. Most of them work with approximate algorithms that try to get an inexact solution in estimated way. In addition, usual recognition must cope with distortion. Inexact graph matching consists in finding the best isomorphism under a similarity measure. Theoretically this thesis proposes algorithms for solving subgraph matching in an approximate and inexact way.
We consider the symbol spotting problem on graphical documents or line drawings from application point of view. This is a well known problem in the graphics recognition community. It can be further applied for indexing and classification of documents based on their contents. The structural nature of this kind of documents easily motivates one for giving a graph based representation. So the symbol spotting problem on graphical documents can be considered as a subgraph matching problem. The main challenges in this application domain is the noise and distortions that might come during the usage, digitalization and raster to vector conversion of those documents. Apart from that computer vision nowadays is not any more confined within a limited number of images. So dealing a huge number of images with graph based method is a further challenge. In this thesis, on one hand, we have worked on efficient and robust graph representation to cope with the noise and distortions coming from documents. On the other hand, we have worked on different graph based methods and framework to solve the subgraph matching problem in a better approximated way, which can also deal with considerable number of images. Firstly, we propose a symbol spotting method by hashing serialized subgraphs. Graph serialization allows to create factorized substructures such as graph paths, which can be organized in hash tables depending on the structural similarities of the serialized subgraphs. The involvement of hashing techniques helps to reduce the search space substantially and speeds up the spotting procedure. Secondly, we introduce contextual similarities based on the walk based propagation on tensor product graph. These contextual similarities involve higher order information and more reliable than pairwise similarities. We use these higher order similarities to formulate subgraph matching as a node and edge selection problem in the tensor product graph. Thirdly, we propose near convex grouping to form near convex region adjacency graph which eliminates the limitations of traditional region adjacency graph representation for graphic recognition. Fourthly, we propose a hierarchical graph representation by simplifying/correcting the structural errors to create a hierarchical graph of the base graph. Later these hierarchical graph structures are matched with some graph matching methods. Apart from that, in this thesis we have provided an overall experimental comparison of all the methods and some of the state-of-the-art methods. Furthermore, some dataset models have also been proposed. |
||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Josep Llados;Umapada Pal | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-940902-4-0 | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG; 600.077 | Approved | no | ||
Call Number | Admin @ si @ Dut2014 | Serial | 2465 | ||
Permanent link to this record | |||||
Author | Carlo Gatta; Francesco Ciompi | ||||
Title | Stacked Sequential Scale-Space Taylor Context | Type | Journal Article | ||
Year | 2014 | Publication | IEEE Transactions on Pattern Analysis and Machine Intelligence | Abbreviated Journal | TPAMI |
Volume | 36 | Issue | 8 | Pages | 1694-1700 |
Keywords | |||||
Abstract | We analyze sequential image labeling methods that sample the posterior label field in order to gather contextual information. We propose an effective method that extracts local Taylor coefficients from the posterior at different scales. Results show that our proposal outperforms state-of-the-art methods on MSRC-21, CAMVID, eTRIMS8 and KAIST2 data sets. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0162-8828 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | LAMP; MILAB; 601.160; 600.079 | Approved | no | ||
Call Number | Admin @ si @ GaC2014 | Serial | 2466 | ||
Permanent link to this record |