|
Records |
Links |
|
Author |
Pau Riba; Josep Llados; Alicia Fornes; Anjan Dutta |
|
|
Title |
Large-scale Graph Indexing using Binary Embeddings of Node Contexts |
Type |
Conference Article |
|
Year |
2015 |
Publication |
10th IAPR-TC15 Workshop on Graph-based Representations in Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
9069 |
Issue |
|
Pages |
208-217 |
|
|
Keywords |
Graph matching; Graph indexing; Application in document analysis; Word spotting; Binary embedding |
|
|
Abstract |
Graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their representational power in front of classical appearance-based representations in terms of feature vectors. Retrieving a query graph from a large dataset of graphs has the drawback of the high computational complexity required to compare the query and the target graphs. The most important property for a large-scale retrieval is the search time complexity to be sub-linear in the number of database examples. In this paper we propose a fast indexation formalism for graph retrieval. A binary embedding is defined as hashing keys for graph nodes. Given a database of labeled graphs, graph nodes are complemented with vectors of attributes representing their local context. Hence, each attribute counts the length of a walk of order k originated in a vertex with label l. Each attribute vector is converted to a binary code applying a binary-valued hash function. Therefore, graph retrieval is formulated in terms of finding target graphs in the database whose nodes have a small Hamming distance from the query nodes, easily computed with bitwise logical operators. As an application example, we validate the performance of the proposed methods in a handwritten word spotting scenario in images of historical documents. |
|
|
Address |
Beijing; China; May 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer International Publishing |
Place of Publication |
|
Editor |
C.-L.Liu; B.Luo; W.G.Kropatsch; J.Cheng |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-319-18223-0 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
GbRPR |
|
|
Notes |
DAG; 600.061; 602.006; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RLF2015a |
Serial |
2618 |
|
Permanent link to this record |
|
|
|
|
Author |
Jaume Garcia; Debora Gil; Aura Hernandez-Sabate |
|
|
Title |
Endowing Canonical Geometries to Cardiac Structures |
Type |
Book Chapter |
|
Year |
2010 |
Publication |
Statistical Atlases And Computational Models Of The Heart |
Abbreviated Journal |
|
|
|
Volume |
6364 |
Issue |
|
Pages |
124-133 |
|
|
Keywords |
|
|
|
Abstract |
International conference on Cardiac electrophysiological simulation challenge
In this paper, we show that canonical (shape-based) geometries can be endowed to cardiac structures using tubular coordinates defined over their medial axis. We give an analytic formulation of these geometries by means of B-Splines. Since B-Splines present vector space structure PCA can be applied to their control points and statistical models relating boundaries and the interior of the anatomical structures can be derived. We demonstrate the applicability in two cardiac structures, the 3D Left Ventricular volume, and the 2D Left-Right ventricle set in 2D Short Axis view. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin / Heidelberg |
Place of Publication |
|
Editor |
Camara, O.; Pop, M.; Rhode, K.; Sermesant, M.; Smith, N.; Young, A. |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
Lecture Notes in Computer Science |
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
IAM |
Approved |
no |
|
|
Call Number |
IAM @ iam @ GGH2010b |
Serial |
1515 |
|
Permanent link to this record |
|
|
|
|
Author |
Adriana Romero |
|
|
Title |
Assisting the training of deep neural networks with applications to computer vision |
Type |
Book Whole |
|
Year |
2015 |
Publication |
PhD Thesis, Universitat de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Deep learning has recently been enjoying an increasing popularity due to its success in solving challenging tasks. In particular, deep learning has proven to be effective in a large variety of computer vision tasks, such as image classification, object recognition and image parsing. Contrary to previous research, which required engineered feature representations, designed by experts, in order to succeed, deep learning attempts to learn representation hierarchies automatically from data. More recently, the trend has been to go deeper with representation hierarchies.
Learning (very) deep representation hierarchies is a challenging task, which
involves the optimization of highly non-convex functions. Therefore, the search
for algorithms to ease the learning of (very) deep representation hierarchies from data is extensive and ongoing.
In this thesis, we tackle the challenging problem of easing the learning of (very) deep representation hierarchies. We present a hyper-parameter free, off-the-shelf, simple and fast unsupervised algorithm to discover hidden structure from the input data by enforcing a very strong form of sparsity. We study the applicability and potential of the algorithm to learn representations of varying depth in a handful of applications and domains, highlighting the ability of the algorithm to provide discriminative feature representations that are able to achieve top performance.
Yet, while emphasizing the great value of unsupervised learning methods when
labeled data is scarce, the recent industrial success of deep learning has revolved around supervised learning. Supervised learning is currently the focus of many recent research advances, which have shown to excel at many computer vision tasks. Top performing systems often involve very large and deep models, which are not well suited for applications with time or memory limitations. More in line with the current trends, we engage in making top performing models more efficient, by designing very deep and thin models. Since training such very deep models still appears to be a challenging task, we introduce a novel algorithm that guides the training of very thin and deep models by hinting their intermediate representations.
Very deep and thin models trained by the proposed algorithm end up extracting feature representations that are comparable or even better performing
than the ones extracted by large state-of-the-art models, while compellingly
reducing the time and memory consumption of the model. |
|
|
Address |
October 2015 |
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
Carlo Gatta;Petia Radeva |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ Rom2015 |
Serial |
2707 |
|
Permanent link to this record |
|
|
|
|
Author |
Jaume Gibert; Ernest Valveny; Oriol Ramos Terrades; Horst Bunke |
|
|
Title |
Multiple Classifiers for Graph of Words Embedding |
Type |
Conference Article |
|
Year |
2011 |
Publication |
10th International Conference on Multiple Classifier Systems |
Abbreviated Journal |
|
|
|
Volume |
6713 |
Issue |
|
Pages |
36-45 |
|
|
Keywords |
|
|
|
Abstract |
During the last years, there has been an increasing interest in applying the multiple classifier framework to the domain of structural pattern recognition. Constructing base classifiers when the input patterns are graph based representations is not an easy problem. In this work, we make use of the graph embedding methodology in order to construct different feature vector representations for graphs. The graph of words embedding assigns a feature vector to every graph by counting unary and binary relations between node representatives and combining these pieces of information into a single vector. Selecting different node representatives leads to different vectorial representations and therefore to different base classifiers that can be combined. We experimentally show how this methodology significantly improves the classification of graphs with respect to single base classifiers. |
|
|
Address |
Napoles, Italy |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
Carlo Sansone; Josef Kittler; Fabio Roli |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-3-642-21556-8 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
MCS |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @GVR2011 |
Serial |
1745 |
|
Permanent link to this record |
|
|
|
|
Author |
Pierluigi Casale; Oriol Pujol; Petia Radeva |
|
|
Title |
Approximate Convex Hulls Family for One-Class Cassification |
Type |
Conference Article |
|
Year |
2011 |
Publication |
10th International Workshop on Multiple Classifier Systems |
Abbreviated Journal |
|
|
|
Volume |
6713 |
Issue |
|
Pages |
106-115 |
|
|
Keywords |
|
|
|
Abstract |
In this work, a new method for one-class classification based on the Convex Hull geometric structure is proposed. The new method creates a family of convex hulls able to fit the geometrical shape of the training points. The increased computational cost due to the creation of the convex hull in multiple dimensions is circumvented using random projections. This provides an approximation of the original structure with multiple bi-dimensional views. In the projection planes, a mechanism for noisy points rejection has also been elaborated and evaluated. Results show that the approach performs considerably well with respect to the state the art in one-class classification. |
|
|
Address |
Napoli, Italy |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
Carlo Sansone; Josef Kittler; Fabio Roli |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-642-21556-8 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
MCS |
|
|
Notes |
MILAB;HuPBA |
Approved |
no |
|
|
Call Number |
Admin @ si @ CPR2011b |
Serial |
1761 |
|
Permanent link to this record |
|
|
|
|
Author |
Miguel Angel Bautista; Oriol Pujol; Xavier Baro; Sergio Escalera |
|
|
Title |
Introducing the Separability Matrix for Error Correcting Output Codes Coding |
Type |
Conference Article |
|
Year |
2011 |
Publication |
10th International conference on Multiple Classifier Systems |
Abbreviated Journal |
|
|
|
Volume |
6713 |
Issue |
|
Pages |
227-236 |
|
|
Keywords |
|
|
|
Abstract |
Error Correcting Output Codes (ECOC) have demonstrate to be a powerful tool for treating multi-class problems. Nevertheless, predefined ECOC designs may not benefit from Error-correcting principles for particular multi-class data. In this paper, we introduce the Separability matrix as a tool to study and enhance designs for ECOC coding. In addition, a novel problem-dependent coding design based on the Separability matrix is tested over a wide set of challenging multi-class problems, obtaining very satisfactory results. |
|
|
Address |
Napoles, Italy |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer-Verlag Berlin Heidelberg |
Place of Publication |
|
Editor |
Carlo Sansone; Josef Kittler; Fabio Roli |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-3-642-21556-8 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
MCS |
|
|
Notes |
MILAB; OR;HuPBA;MV |
Approved |
no |
|
|
Call Number |
Admin @ si @ BPB2011a |
Serial |
1771 |
|
Permanent link to this record |
|
|
|
|
Author |
Eloi Puertas; Sergio Escalera; Oriol Pujol |
|
|
Title |
Multi-Class Multi-Scale Stacked Sequential Learning |
Type |
Conference Article |
|
Year |
2011 |
Publication |
10th International Conference on Multiple Classifier Systems |
Abbreviated Journal |
|
|
|
Volume |
6713 |
Issue |
|
Pages |
197-206 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Napoles, Italy |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer |
Place of Publication |
|
Editor |
Carlo Sansone; Josef Kittler; Fabio Roli |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
MCS |
|
|
Notes |
HuPBA;MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ PEP2011b |
Serial |
1772 |
|
Permanent link to this record |
|
|
|
|
Author |
Miguel Angel Bautista; Oriol Pujol; Xavier Baro; Sergio Escalera |
|
|
Title |
Introducing the Separability Matrix for Error Correcting Output Codes Coding |
Type |
Conference Article |
|
Year |
2011 |
Publication |
10th International Conference on Multiple Classifier Systems |
Abbreviated Journal |
|
|
|
Volume |
6713 |
Issue |
|
Pages |
227-236 |
|
|
Keywords |
|
|
|
Abstract |
Error Correcting Output Codes (ECOC) have demonstrate to be a powerful tool for treating multi-class problems. Nevertheless, predefined ECOC designs may not benefit from Error-correcting principles for particular multi-class data. In this paper, we introduce the Separability matrix as a tool to study and enhance designs for ECOC coding. In addition, a novel problem-dependent coding design based on the Separability matrix is tested over a wide set of challenging multi-class problems, obtaining very satisfactory results. |
|
|
Address |
Napoles, Italy |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer-Verlag Berlin, Heidelberg |
Place of Publication |
|
Editor |
Carlo Sansone; Josef Kittler; Fabio Roli |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-642-21556-8 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
MCS |
|
|
Notes |
MILAB; OR;HuPBA;MV |
Approved |
no |
|
|
Call Number |
Admin @ si @ BPB2011b |
Serial |
1887 |
|
Permanent link to this record |
|
|
|
|
Author |
Alicia Fornes; Josep Llados; Gemma Sanchez; Horst Bunke |
|
|
Title |
Writer Identification in Old Handwritten Music Scores |
Type |
Book Chapter |
|
Year |
2012 |
Publication |
Pattern Recognition and Signal Processing in Archaeometry: Mathematical and Computational Solutions for Archaeology |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
27-63 |
|
|
Keywords |
|
|
|
Abstract |
The aim of writer identification is determining the writer of a piece of handwriting from a set of writers. In this paper we present a system for writer identification in old handwritten music scores. Even though an important amount of compositions contains handwritten text in the music scores, the aim of our work is to use only music notation to determine the author. The steps of the system proposed are the following. First of all, the music sheet is preprocessed and normalized for obtaining a single binarized music line, without the staff lines. Afterwards, 100 features are extracted for every music line, which are subsequently used in a k-NN classifier that compares every feature vector with prototypes stored in a database. By applying feature selection and extraction methods on the original feature set, the performance is increased. The proposed method has been tested on a database of old music scores from the 17th to 19th centuries, achieving a recognition rate of about 95%. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
IGI-Global |
Place of Publication |
|
Editor |
Copnstantin Papaodysseus |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ FLS2012 |
Serial |
1828 |
|
Permanent link to this record |
|
|
|
|
Author |
Josep Llados; Marçal Rusiñol |
|
|
Title |
Graphics Recognition Techniques |
Type |
Book Chapter |
|
Year |
2014 |
Publication |
Handbook of Document Image Processing and Recognition |
Abbreviated Journal |
|
|
|
Volume |
D |
Issue |
|
Pages |
489-521 |
|
|
Keywords |
Dimension recognition; Graphics recognition; Graphic-rich documents; Polygonal approximation; Raster-to-vector conversion; Texture-based primitive extraction; Text-graphics separation |
|
|
Abstract |
This chapter describes the most relevant approaches for the analysis of graphical documents. The graphics recognition pipeline can be splitted into three tasks. The low level or lexical task extracts the basic units composing the document. The syntactic level is focused on the structure, i.e., how graphical entities are constructed, and involves the location and classification of the symbols present in the document. The third level is a functional or semantic level, i.e., it models what the graphical symbols do and what they mean in the context where they appear. This chapter covers the lexical level, while the next two chapters are devoted to the syntactic and semantic level, respectively. The main problems reviewed in this chapter are raster-to-vector conversion (vectorization algorithms) and the separation of text and graphics components. The research and industrial communities have provided standard methods achieving reasonable performance levels. Hence, graphics recognition techniques can be considered to be in a mature state from a scientific point of view. Additionally this chapter provides insights on some related problems, namely, the extraction and recognition of dimensions in engineering drawings, and the recognition of hatched and tiled patterns. Both problems are usually associated, even integrated, in the vectorization process. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer London |
Place of Publication |
|
Editor |
D. Doermann; K. Tombre |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-0-85729-858-4 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ LlR2014 |
Serial |
2380 |
|
Permanent link to this record |
|
|
|
|
Author |
Salvatore Tabbone; Oriol Ramos Terrades |
|
|
Title |
An Overview of Symbol Recognition |
Type |
Book Chapter |
|
Year |
2014 |
Publication |
Handbook of Document Image Processing and Recognition |
Abbreviated Journal |
|
|
|
Volume |
D |
Issue |
|
Pages |
523-551 |
|
|
Keywords |
Pattern recognition; Shape descriptors; Structural descriptors; Symbolrecognition; Symbol spotting |
|
|
Abstract |
According to the Cambridge Dictionaries Online, a symbol is a sign, shape, or object that is used to represent something else. Symbol recognition is a subfield of general pattern recognition problems that focuses on identifying, detecting, and recognizing symbols in technical drawings, maps, or miscellaneous documents such as logos and musical scores. This chapter aims at providing the reader an overview of the different existing ways of describing and recognizing symbols and how the field has evolved to attain a certain degree of maturity. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer London |
Place of Publication |
|
Editor |
D. Doermann; K. Tombre |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-0-85729-858-4 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ TaT2014 |
Serial |
2489 |
|
Permanent link to this record |
|
|
|
|
Author |
A.Kesidis; Dimosthenis Karatzas |
|
|
Title |
Logo and Trademark Recognition |
Type |
Book Chapter |
|
Year |
2014 |
Publication |
Handbook of Document Image Processing and Recognition |
Abbreviated Journal |
|
|
|
Volume |
D |
Issue |
|
Pages |
591-646 |
|
|
Keywords |
Logo recognition; Logo removal; Logo spotting; Trademark registration; Trademark retrieval systems |
|
|
Abstract |
The importance of logos and trademarks in nowadays society is indisputable, variably seen under a positive light as a valuable service for consumers or a negative one as a catalyst of ever-increasing consumerism. This chapter discusses the technical approaches for enabling machines to work with logos, looking into the latest methodologies for logo detection, localization, representation, recognition, retrieval, and spotting in a variety of media. This analysis is presented in the context of three different applications covering the complete depth and breadth of state of the art techniques. These are trademark retrieval systems, logo recognition in document images, and logo detection and removal in images and videos. This chapter, due to the very nature of logos and trademarks, brings together various facets of document image analysis spanning graphical and textual content, while it links document image analysis to other computer vision domains, especially when it comes to the analysis of real-scene videos and images. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer London |
Place of Publication |
|
Editor |
D. Doermann; K. Tombre |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-0-85729-858-4 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ KeK2014 |
Serial |
2425 |
|
Permanent link to this record |
|
|
|
|
Author |
Alicia Fornes; Gemma Sanchez |
|
|
Title |
Analysis and Recognition of Music Scores |
Type |
Book Chapter |
|
Year |
2014 |
Publication |
Handbook of Document Image Processing and Recognition |
Abbreviated Journal |
|
|
|
Volume |
E |
Issue |
|
Pages |
749-774 |
|
|
Keywords |
|
|
|
Abstract |
The analysis and recognition of music scores has attracted the interest of researchers for decades. Optical Music Recognition (OMR) is a classical research field of Document Image Analysis and Recognition (DIAR), whose aim is to extract information from music scores. Music scores contain both graphical and textual information, and for this reason, techniques are closely related to graphics recognition and text recognition. Since music scores use a particular diagrammatic notation that follow the rules of music theory, many approaches make use of context information to guide the recognition and solve ambiguities. This chapter overviews the main Optical Music Recognition (OMR) approaches. Firstly, the different methods are grouped according to the OMR stages, namely, staff removal, music symbol recognition, and syntactical analysis. Secondly, specific approaches for old and handwritten music scores are reviewed. Finally, online approaches and commercial systems are also commented. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer London |
Place of Publication |
|
Editor |
D. Doermann; K. Tombre |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-0-85729-860-7 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; ADAS; 600.076; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ FoS2014 |
Serial |
2484 |
|
Permanent link to this record |
|
|
|
|
Author |
Edgar Riba |
|
|
Title |
Geometric Computer Vision Techniques for Scene Reconstruction |
Type |
Book Whole |
|
Year |
2021 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
From the early stages of Computer Vision, scene reconstruction has been one of the most studied topics leading to a wide variety of new discoveries and applications. Object grasping and manipulation, localization and mapping, or even visual effect generation are different examples of applications in which scene reconstruction has taken an important role for industries such as robotics, factory automation, or audio visual production. However, scene reconstruction is an extensive topic that can be approached in many different ways with already existing solutions that effectively work in controlled environments. Formally, the problem of scene reconstruction can be formulated as a sequence of independent processes which compose a pipeline. In this thesis, we analyse some parts of the reconstruction pipeline from which we contribute with novel methods using Convolutional Neural Networks (CNN) proposing innovative solutions that consider the optimisation of the methods in an end-to-end fashion. First, we review the state of the art of classical local features detectors and descriptors and contribute with two novel methods that inherently improve pre-existing solutions in the scene reconstruction pipeline.
It is a fact that computer science and software engineering are two fields that usually go hand in hand and evolve according to mutual needs making easier the design of complex and efficient algorithms. For this reason, we contribute with Kornia, a library specifically designed to work with classical computer vision techniques along with deep neural networks. In essence, we created a framework that eases the design of complex pipelines for computer vision algorithms so that can be included within neural networks and be used to backpropagate gradients throw a common optimisation framework. Finally, in the last chapter of this thesis we develop the aforementioned concept of designing end-to-end systems with classical projective geometry. Thus, we contribute with a solution to the problem of synthetic view generation by hallucinating novel views from high deformable cloths objects using a geometry aware end-to-end system. To summarize, in this thesis we demonstrate that with a proper design that combine classical geometric computer vision methods with deep learning techniques can lead to improve pre-existing solutions for the problem of scene reconstruction. |
|
|
Address |
February 2021 |
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
|
Place of Publication |
|
Editor |
Daniel Ponsa |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MSIAU |
Approved |
no |
|
|
Call Number |
Admin @ si @ Rib2021 |
Serial |
3610 |
|
Permanent link to this record |
|
|
|
|
Author |
Diego Alejandro Cheda |
|
|
Title |
Monocular Depth Cues in Computer Vision Applications |
Type |
Book Whole |
|
Year |
2012 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Depth perception is a key aspect of human vision. It is a routine and essential visual task that the human do effortlessly in many daily activities. This has often been associated with stereo vision, but humans have an amazing ability to perceive depth relations even from a single image by using several monocular cues.
In the computer vision field, if image depth information were available, many tasks could be posed from a different perspective for the sake of higher performance and robustness. Nevertheless, given a single image, this possibility is usually discarded, since obtaining depth information has frequently been performed by three-dimensional reconstruction techniques, requiring two or more images of the same scene taken from different viewpoints. Recently, some proposals have shown the feasibility of computing depth information from single images. In essence, the idea is to take advantage of a priori knowledge of the acquisition conditions and the observed scene to estimate depth from monocular pictorial cues. These approaches try to precisely estimate the scene depth maps by employing computationally demanding techniques. However, to assist many computer vision algorithms, it is not really necessary computing a costly and detailed depth map of the image. Indeed, just a rough depth description can be very valuable in many problems.
In this thesis, we have demonstrated how coarse depth information can be integrated in different tasks following alternative strategies to obtain more precise and robust results. In that sense, we have proposed a simple, but reliable enough technique, whereby image scene regions are categorized into discrete depth ranges to build a coarse depth map. Based on this representation, we have explored the potential usefulness of our method in three application domains from novel viewpoints: camera rotation parameters estimation, background estimation and pedestrian candidate generation. In the first case, we have computed camera rotation mounted in a moving vehicle applying two novels methods based on distant elements in the image, where the translation component of the image flow vectors is negligible. In background estimation, we have proposed a novel method to reconstruct the background by penalizing close regions in a cost function, which integrates color, motion, and depth terms. Finally, we have benefited of geometric and depth information available on single images for pedestrian candidate generation to significantly reduce the number of generated windows to be further processed by a pedestrian classifier. In all cases, results have shown that our approaches contribute to better performances. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
Daniel Ponsa;Antonio Lopez |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS |
Approved |
no |
|
|
Call Number |
Admin @ si @ Che2012 |
Serial |
2210 |
|
Permanent link to this record |