|
Records |
Links |
|
Author |
Dimosthenis Karatzas; Lluis Gomez; Anguelos Nicolaou; Suman Ghosh; Andrew Bagdanov; Masakazu Iwamura; J. Matas; L. Neumann; V. Ramaseshan; S. Lu; Faisal Shafait; Seiichi Uchida; Ernest Valveny |
|
|
Title |
ICDAR 2015 Competition on Robust Reading |
Type |
Conference Article |
|
Year |
2015 |
Publication |
13th International Conference on Document Analysis and Recognition ICDAR2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1156-1160 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG; 600.077; 600.084 |
Approved |
no |
|
|
Call Number |
Admin @ si @ KGN2015 |
Serial |
2690 |
|
|
|
|
|
Author |
Lluis Gomez; Dimosthenis Karatzas |
|
|
Title |
Object Proposals for Text Extraction in the Wild |
Type |
Conference Article |
|
Year |
2015 |
Publication |
13th International Conference on Document Analysis and Recognition ICDAR2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
206 - 210 |
|
|
Keywords |
|
|
|
Abstract |
Object Proposals is a recent computer vision technique receiving increasing interest from the research community. Its main objective is to generate a relatively small set of bounding box proposals that are most likely to contain objects of interest. The use of Object Proposals techniques in the scene text understanding field is innovative. Motivated by the success of powerful but expensive techniques that recognize words holistically, Object Proposals techniques emerge as an alternative to traditional text detectors. In this paper we study to what extent existing generic Object Proposals methods may be useful for scene text understanding. We also propose a new Object Proposals algorithm that is specifically designed for text and compare it with other generic state-of-the-art methods. Experiments show that our proposal is superior in its ability to produce good-quality word proposals efficiently. The source code of our method is made publicly available. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG; 600.077; 600.084; 601.197 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GoK2015 |
Serial |
2691 |
|
|
|
|
|
Author |
Anguelos Nicolaou; Andrew Bagdanov; Marcus Liwicki; Dimosthenis Karatzas |
|
|
Title |
Sparse Radial Sampling LBP for Writer Identification |
Type |
Conference Article |
|
Year |
2015 |
Publication |
13th International Conference on Document Analysis and Recognition ICDAR2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
716-720 |
|
|
Keywords |
|
|
|
Abstract |
In this paper we present the use of Sparse Radial Sampling Local Binary Patterns, a variant of Local Binary Patterns (LBP) for text-as-texture classification. By adapting and extending the standard LBP operator to the particularities of text, we obtain a generic text-as-texture classification scheme and apply it to writer identification. In experiments on the CVL and ICDAR 2013 datasets, the proposed feature set demonstrates state-of-the-art (SOA) performance. Among the SOA methods, the proposed method is the only one that is based on dense extraction of a single local feature descriptor. This makes it fast and applicable at the earliest stages of a DIA pipeline without the need for segmentation, binarization, or extraction of multiple features. |
|
|
Address |
Nancy; France; August 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ NBL2015 |
Serial |
2692 |
|
|
|
|
|
Author |
Suman Ghosh; Lluis Gomez; Dimosthenis Karatzas; Ernest Valveny |
|
|
Title |
Efficient indexing for Query By String text retrieval |
Type |
Conference Article |
|
Year |
2015 |
Publication |
6th IAPR International Workshop on Camera Based Document Analysis and Recognition CBDAR2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1236 - 1240 |
|
|
Keywords |
|
|
|
Abstract |
This paper deals with Query By String word spotting in scene images. A hierarchical text segmentation algorithm based on text-specific selective search is used to find text regions. These regions are indexed by the character n-grams present in each text region. An attribute representation based on the Pyramidal Histogram of Characters (PHOC) is used to compare text regions with the query text. To generate the index, a similar attribute space based on a Pyramidal Histogram of character n-grams is used. These attribute models are learned using linear SVMs over the Fisher Vector [1] representation of the images along with the PHOC labels of the corresponding strings. |
|
|
Address |
Nancy; France; August 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CBDAR |
|
|
Notes |
DAG; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GGK2015 |
Serial |
2693 |
|
|
|
|
|
Author |
J.Kuhn; A.Nussbaumer; J.Pirker; Dimosthenis Karatzas; A. Pagani; O.Conlan; M.Memmel; C.M.Steiner; C.Gutl; D.Albert; Andreas Dengel |
|
|
Title |
Advancing Physics Learning Through Traversing a Multi-Modal Experimentation Space |
Type |
Conference Article |
|
Year |
2015 |
Publication |
Workshop Proceedings on the 11th International Conference on Intelligent Environments |
Abbreviated Journal |
|
|
|
Volume |
19 |
Issue |
|
Pages |
373-380 |
|
|
Keywords |
|
|
|
Abstract |
Translating conceptual knowledge into real world experiences presents a significant educational challenge. This position paper presents an approach that supports learners in moving seamlessly between conceptual learning and its application in the real world by bringing physical and virtual experiments into everyday settings. Learners are empowered to conduct these situated experiments in a variety of physical settings by leveraging state-of-the-art mobile, augmented reality, and virtual reality technology. A blend of mobile-based multi-sensory physical experiments, augmented reality and enabling virtual environments can allow learners to bridge their conceptual learning with tangible experiences in a completely novel manner. This approach focuses on the learner by applying self-regulated personalised learning techniques, underpinned by innovative pedagogical approaches and adaptation techniques, to ensure that the needs and preferences of each learner are catered for individually. |
|
|
Address |
Prague; Czech Republic; July 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IE |
|
|
Notes |
DAG; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ KNP2015 |
Serial |
2694 |
|
|
|
|
|
Author |
Lluis Pere de las Heras; Ernest Valveny; Gemma Sanchez |
|
|
Title |
Unsupervised and Notation-Independent Wall Segmentation in Floor Plans Using a Combination of Statistical and Structural Strategies |
Type |
Conference Article |
|
Year |
2013 |
Publication |
10th IAPR International Workshop on Graphics Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Bethlehem; PA; USA; August 2013 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
GREC |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ HVS2013b |
Serial |
2696 |
|
|
|
|
|
Author |
Klaus Broelemann; Anjan Dutta; Xiaoyi Jiang; Josep Llados |
|
|
Title |
Hierarchical Plausibility-Graphs for Symbol Spotting in Graphical Documents |
Type |
Book Chapter |
|
Year |
2014 |
Publication |
Graphics Recognition. Current Trends and Challenges |
Abbreviated Journal |
|
|
|
Volume |
8746 |
Issue |
|
Pages |
25-37 |
|
|
Keywords |
|
|
|
Abstract |
Graph representations of graphical documents often suffer from noise, such as spurious or discontinuous nodes and edges. In general, these errors occur during low-level image processing steps such as binarization, skeletonization, and vectorization. Hierarchical graph representation is an efficient way to address this kind of problem, by hierarchically merging node-node and node-edge pairs depending on their distance. However, the creation of a hierarchical graph representing the graphical information often uses hard thresholds on the distance to create the hierarchical nodes (next state) from the lower nodes (or states) of a graph. As a result, the representation often loses useful information. This paper introduces plausibilities to the nodes of the hierarchical graph as a function of distance and proposes a modified algorithm for matching subgraphs of the hierarchical graphs. The plausibility-annotated nodes help to improve the performance of the matching algorithm on two hierarchical structures. To show the potential of this approach, we conduct an experiment with the SESYD dataset. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
Bart Lamiroy; Jean-Marc Ogier |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-662-44853-3 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.045; 600.056; 600.061; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BDJ2014 |
Serial |
2699 |
|
|
|
|
|
Author |
Marçal Rusiñol; Dimosthenis Karatzas; Josep Llados |
|
|
Title |
Spotting Graphical Symbols in Camera-Acquired Documents in Real Time |
Type |
Book Chapter |
|
Year |
2014 |
Publication |
Graphics Recognition. Current Trends and Challenges |
Abbreviated Journal |
|
|
|
Volume |
8746 |
Issue |
|
Pages |
3-10 |
|
|
Keywords |
|
|
|
Abstract |
In this paper we present a system devoted to spotting graphical symbols in camera-acquired document images. The system is based on the extraction and subsequent matching of ORB compact local features computed over interest key-points. Then, the FLANN indexing framework, based on approximate nearest neighbor search, allows local descriptors to be matched efficiently between the captured scene and the graphical models. Finally, the RANSAC algorithm is used to compute the homography between the spotted symbol and its appearance in the document image. The proposed approach is efficient and is able to work in real time. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
Bart Lamiroy; Jean-Marc Ogier |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-662-44853-3 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.045; 600.055; 600.061; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RKL2014 |
Serial |
2700 |
|
|
|
|
|
Author |
Marçal Rusiñol; V. Poulain d'Andecy; Dimosthenis Karatzas; Josep Llados |
|
|
Title |
Classification of Administrative Document Images by Logo Identification |
Type |
Book Chapter |
|
Year |
2014 |
Publication |
Graphics Recognition. Current Trends and Challenges |
Abbreviated Journal |
|
|
|
Volume |
8746 |
Issue |
|
Pages |
49-58 |
|
|
Keywords |
Administrative Document Classification; Logo Recognition; Logo Spotting |
|
|
Abstract |
This paper focuses on the categorization of administrative document images (such as invoices) based on the recognition of the supplier’s graphical logo. Two different methods are proposed: the first uses a bag-of-visual-words model, whereas the second tries to locate logo images, described by the blurred shape model descriptor, within documents by a sliding-window technique. Preliminary results are reported on a dataset of real administrative documents. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
Bart Lamiroy; Jean-Marc Ogier |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-662-44853-3 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.056; 600.045; 605.203; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RPK2014 |
Serial |
2701 |
|
|
|
|
|
Author |
Maedeh Aghaei; Mariella Dimiccoli; Petia Radeva |
|
|
Title |
Towards social interaction detection in egocentric photo-streams |
Type |
Conference Article |
|
Year |
2015 |
Publication |
Proceedings of SPIE, 8th International Conference on Machine Vision, ICMV 2015 |
Abbreviated Journal |
|
|
|
Volume |
9875 |
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Detecting social interaction in videos relying solely on visual cues is a valuable task that has received increasing attention in recent years. In this work, we address this problem in the challenging domain of egocentric photo-streams captured by a low temporal resolution wearable camera (2 fpm). The major difficulties to be handled in this context are the sparsity of observations as well as the unpredictability of camera motion and attention orientation, due to the fact that the camera is worn as part of clothing. Our method consists of four steps: multi-face localization and tracking, 3D localization, pose estimation and analysis of f-formations. By estimating pair-to-pair interaction probabilities over the sequence, our method states the presence or absence of interaction with the camera wearer and specifies which people are more involved in the interaction. We tested our method on a dataset of 18,000 images and we show its reliability for the intended purpose. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICMV |
|
|
Notes |
MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ ADR2015a |
Serial |
2702 |
|
|
|
|
|
Author |
Ivan Huerta; Michael Holte; Thomas B. Moeslund; Jordi Gonzalez |
|
|
Title |
Chromatic shadow detection and tracking for moving foreground segmentation |
Type |
Journal Article |
|
Year |
2015 |
Publication |
Image and Vision Computing |
Abbreviated Journal |
IMAVIS |
|
|
Volume |
41 |
Issue |
|
Pages |
42-53 |
|
|
Keywords |
Detecting moving objects; Chromatic shadow detection; Temporal local gradient; Spatial and Temporal brightness and angle distortions; Shadow tracking |
|
|
Abstract |
Advanced segmentation techniques in the surveillance domain deal with shadows to avoid distortions when detecting moving objects. Most approaches for shadow detection are still typically restricted to penumbra shadows and cannot cope well with umbra shadows. Consequently, umbra shadow regions are usually detected as part of moving objects, thus affecting the performance of the final detection. In this paper we address the detection of both penumbra and umbra shadow regions. First, a novel bottom-up approach is presented based on gradient and colour models, which successfully discriminates between chromatic moving cast shadow regions and those regions detected as moving objects. In essence, those regions corresponding to potential shadows are detected based on edge partitioning and colour statistics. Subsequently (i) temporal similarities between textures and (ii) spatial similarities between chrominance angle and brightness distortions are analysed for each potential shadow region for detecting the umbra shadow regions. Our second contribution refines even further the segmentation results: a tracking-based top-down approach increases the performance of our bottom-up chromatic shadow detection algorithm by properly correcting non-detected shadows. To do so, a combination of motion filters in a data association framework exploits the temporal consistency between objects and shadows to increase the shadow detection rate. Experimental results exceed current state-of-the-art in shadow accuracy for multiple well-known surveillance image databases which contain different shadowed materials and illumination conditions. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE; 600.078; 600.063 |
Approved |
no |
|
|
Call Number |
Admin @ si @ HHM2015 |
Serial |
2703 |
|
|
|
|
|
Author |
Sergio Escalera; Junior Fabian; Pablo Pardo; Xavier Baro; Jordi Gonzalez; Hugo Jair Escalante; Marc Oliu; Dusan Misevic; Ulrich Steiner; Isabelle Guyon |
|
|
Title |
ChaLearn Looking at People 2015: Apparent Age and Cultural Event Recognition Datasets and Results |
Type |
Conference Article |
|
Year |
2015 |
Publication |
16th IEEE International Conference on Computer Vision Workshops |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
243 - 251 |
|
|
Keywords |
|
|
|
Abstract |
Following previous series on Looking at People (LAP) competitions [14, 13, 11, 12, 2], in 2015 ChaLearn ran two new competitions within the field of Looking at People: (1) age estimation, and (2) cultural event recognition, both in still images. We developed a crowd-sourcing application to collect and label data about the apparent age of people (as opposed to the real age). In terms of cultural event recognition, one hundred categories had to be recognized. These tasks involved scene understanding and human body analysis. This paper summarizes both challenges and data, as well as the results achieved by the participants of the competition. |
|
|
Address |
Santiago de Chile; December 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICCVW |
|
|
Notes |
ISE; 600.063; 600.078; MV |
Approved |
no |
|
|
Call Number |
Admin @ si @ EFP2015 |
Serial |
2704 |
|
|
|
|
|
Author |
Josep M. Gonfaus; Marco Pedersoli; Jordi Gonzalez; Andrea Vedaldi; Xavier Roca |
|
|
Title |
Factorized appearances for object detection |
Type |
Journal Article |
|
Year |
2015 |
Publication |
Computer Vision and Image Understanding |
Abbreviated Journal |
CVIU |
|
|
Volume |
138 |
Issue |
|
Pages |
92–101 |
|
|
Keywords |
Object recognition; Deformable part models; Learning and sharing parts; Discovering discriminative parts |
|
|
Abstract |
Deformable object models capture variations in an object’s appearance that can be represented as image deformations. Other effects such as out-of-plane rotations, three-dimensional articulations, and self-occlusions are often captured by considering a mixture of deformable models, one per object aspect. A more scalable approach is instead to represent the variations at the level of the object parts, applying the concept of a mixture locally. Combining a few part variations can in fact cheaply generate a large number of global appearances.
A limited version of this idea was proposed by Yang and Ramanan [1] for human pose detection. In this paper we apply it to the task of generic object category detection and extend it in several ways. First, we propose a model for the relationship between part appearances that is more general than the tree of Yang and Ramanan [1] and more suitable for generic categories. Second, we treat part locations as well as their appearance as latent variables, so that training does not need part annotations but only the object bounding boxes. Third, we modify the weakly-supervised learning of Felzenszwalb et al. and Girshick et al. [2], [3] to handle a significantly more complex latent structure.
Our model is evaluated on standard object detection benchmarks and is found to improve over existing approaches, yielding state-of-the-art results for several object categories. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE; 600.063; 600.078 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GPG2015 |
Serial |
2705 |
|
|
|
|
|
Author |
Alejandro Gonzalez Alzate |
|
|
Title |
Multi-modal Pedestrian Detection |
Type |
Book Whole |
|
Year |
2015 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Pedestrian detection continues to be an extremely challenging problem in real scenarios, in which situations like illumination changes, noisy images, unexpected objects, uncontrolled scenarios and variant appearance of objects occur constantly. All these problems force the development of more robust detectors for relevant applications like vision-based autonomous vehicles, intelligent surveillance, and pedestrian tracking for behavior analysis. Most reliable vision-based pedestrian detectors base their decision on features extracted using a single sensor capturing complementary features, e.g., appearance and texture. These features are usually extracted from the current frame, ignoring temporal information or including it only in a post-processing step, e.g., tracking or temporal coherence. Taking these issues into account, we formulate the following question: can we generate more robust pedestrian detectors by introducing new information sources in the feature extraction step?
In order to answer this question we develop different approaches for introducing new information sources to well-known pedestrian detectors. We start by including temporal information following the Stacked Sequential Learning (SSL) paradigm, which suggests that information extracted from the neighboring samples in a sequence can improve the accuracy of a base classifier.
We then focus on the inclusion of complementary information from different sensors like 3D point clouds (LIDAR – depth), far infrared images (FIR), or disparity maps (stereo pair cameras). To this end we develop a multi-modal framework in which information from different sensors is used for increasing detection accuracy (by increasing information redundancy). Finally, we propose a multi-view pedestrian detector; this multi-view approach splits the detection problem into n sub-problems. Each sub-problem detects objects in a given specific view, thereby reducing the variability problem faced when a single detector is used for the whole problem. We show that these approaches obtain results competitive with other state-of-the-art methods, but instead of designing new features, we reuse existing ones, boosting their performance. |
|
|
Address |
November 2015 |
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
David Vazquez; Antonio Lopez |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-84-943427-7-6 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS; 600.076 |
Approved |
no |
|
|
Call Number |
Admin @ si @ Gon2015 |
Serial |
2706 |
|
|
|
|
|
Author |
Adriana Romero |
|
|
Title |
Assisting the training of deep neural networks with applications to computer vision |
Type |
Book Whole |
|
Year |
2015 |
Publication |
PhD Thesis, Universitat de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Deep learning has recently been enjoying increasing popularity due to its success in solving challenging tasks. In particular, deep learning has proven to be effective in a large variety of computer vision tasks, such as image classification, object recognition and image parsing. Contrary to previous research, which required engineered feature representations, designed by experts, in order to succeed, deep learning attempts to learn representation hierarchies automatically from data. More recently, the trend has been to go deeper with representation hierarchies.
Learning (very) deep representation hierarchies is a challenging task, which involves the optimization of highly non-convex functions. Therefore, the search for algorithms to ease the learning of (very) deep representation hierarchies from data is extensive and ongoing.
In this thesis, we tackle the challenging problem of easing the learning of (very) deep representation hierarchies. We present a hyper-parameter free, off-the-shelf, simple and fast unsupervised algorithm to discover hidden structure from the input data by enforcing a very strong form of sparsity. We study the applicability and potential of the algorithm to learn representations of varying depth in a handful of applications and domains, highlighting the ability of the algorithm to provide discriminative feature representations that are able to achieve top performance.
Yet, while unsupervised learning methods are of great value when labeled data is scarce, the recent industrial success of deep learning has revolved around supervised learning. Supervised learning is currently the focus of many recent research advances, which have been shown to excel at many computer vision tasks. Top performing systems often involve very large and deep models, which are not well suited for applications with time or memory limitations. More in line with the current trends, we engage in making top performing models more efficient, by designing very deep and thin models. Since training such very deep models still appears to be a challenging task, we introduce a novel algorithm that guides the training of very thin and deep models by hinting their intermediate representations.
Very deep and thin models trained by the proposed algorithm end up extracting feature representations that are comparable to, or even better performing than, the ones extracted by large state-of-the-art models, while compellingly reducing the time and memory consumption of the model. |
|
|
Address |
October 2015 |
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
Carlo Gatta; Petia Radeva |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ Rom2015 |
Serial |
2707 |
|