Publicacions CVC -- Query Results


Details

Records
Author	Muhammad Anwer Rao; Fahad Shahbaz Khan; Joost Van de Weijer; Jorma Laaksonen
Title	Tex-Nets: Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition			Type	Conference Article
Year	2017	Publication	19th International Conference on Multimodal Interaction	Abbreviated Journal
Volume		Issue		Pages
Keywords	Convolutional Neural Networks; Texture Recognition; Local Binary Patterns
Abstract	Recognizing materials and textures in realistic imaging conditions is a challenging computer vision problem. For many years, local features based orderless representations were a dominant approach for texture recognition. Recently, deep local features, extracted from the intermediate layers of a Convolutional Neural Network (CNN), are used as filter banks. These dense local descriptors from a deep model, when encoded with Fisher Vectors, have been shown to provide excellent results for texture recognition. The CNN models, employed in such approaches, take RGB patches as input and train on a large amount of labeled images. We show that CNN models, which we call TEX-Nets, trained using mapped coded images with explicit texture information provide complementary information to the standard deep models trained on RGB patches. We further investigate two deep architectures, namely early and late fusion, to combine the texture and color information. Experiments on benchmark texture datasets clearly demonstrate that TEX-Nets provide complementary information to the standard RGB deep network. Our approach provides a large gain of 4.8%, 3.5%, 2.6% and 4.1% respectively in accuracy on the DTD, KTH-TIPS-2a, KTH-TIPS-2b and Texture-10 datasets, compared to the standard RGB network of the same architecture. Further, our final combination leads to consistent improvements over the state-of-the-art on all four datasets.
Address	Glasgow; Scotland; November 2017
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ACM
Notes	LAMP; 600.109; 600.068; 600.120			Approved	no
Call Number	Admin @ si @ RKW2017			Serial	3038
Permanent link to this record



Author	Muhammad Anwer Rao; Fahad Shahbaz Khan; Joost Van de Weijer; Jorma Laaksonen
Title	Top-Down Deep Appearance Attention for Action Recognition			Type	Conference Article
Year	2017	Publication	20th Scandinavian Conference on Image Analysis	Abbreviated Journal
Volume	10269	Issue		Pages	297-309
Keywords	Action recognition; CNNs; Feature fusion
Abstract	Recognizing human actions in videos is a challenging problem in computer vision. Recently, convolutional neural network based deep features have shown promising results for action recognition. In this paper, we investigate the problem of fusing deep appearance and motion cues for action recognition. We propose a video representation which combines deep appearance and motion based local convolutional features within the bag-of-deep-features framework. Firstly, dense deep appearance and motion based local convolutional features are extracted from spatial (RGB) and temporal (flow) networks, respectively. Both visual cues are processed in parallel by constructing separate visual vocabularies for appearance and motion. A category-specific appearance map is then learned to modulate the weights of the deep motion features. The proposed representation is discriminative and binds the deep local convolutional features to their spatial locations. Experiments are performed on two challenging datasets: JHMDB dataset with 21 action classes and ACT dataset with 43 categories. The results clearly demonstrate that our approach outperforms both standard approaches of early and late feature fusion. Further, our approach employs only action labels, without exploiting body part information, yet achieves competitive performance compared to the state-of-the-art deep features based approaches.
Address	Tromso; June 2017
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	SCIA
Notes	LAMP; 600.109; 600.068; 600.120			Approved	no
Call Number	Admin @ si @ RKW2017b			Serial	3039
Permanent link to this record



Author	Muhammad Anwer Rao; David Vazquez; Antonio Lopez
Title	Opponent Colors for Human Detection			Type	Conference Article
Year	2011	Publication	5th Iberian Conference on Pattern Recognition and Image Analysis	Abbreviated Journal
Volume	6669	Issue		Pages	363-370
Keywords	Pedestrian Detection; Color; Part Based Models
Abstract	Human detection is a key component in fields such as advanced driving assistance and video surveillance. However, even detecting non-occluded standing humans remains a challenge of intensive research. Finding good features to build human models for further detection is probably one of the most important issues to face. Currently, shape, texture and motion features have deserved extensive attention in the literature. However, color-based features, which are important in other domains (e.g., image categorization), have received much less attention. In fact, the use of RGB color space has become a kind of choice by default. The focus has been put on developing first and second order features on top of RGB space (e.g., HOG and co-occurrence matrices, resp.). In this paper we evaluate the opponent colors (OPP) space as a biologically inspired alternative for human detection. In particular, by feeding OPP space in the baseline framework of Dalal et al. for human detection (based on RGB, HOG and linear SVM), we obtain better detection performance than by using RGB space. This is a relevant result since, to the best of our knowledge, OPP space has not been previously used for human detection. This suggests that in the future it could be worth computing co-occurrence matrices, self-similarity features, etc., also on top of OPP space, i.e., as we have done with HOG in this paper.
Address	Las Palmas de Gran Canaria. Spain
Corporate Author				Thesis
Publisher	Springer	Place of Publication	Berlin Heidelberg	Editor	J. Vitria; J.M. Sanches; M. Hernandez
Language	English	Summary Language	English	Original Title	Opponent Colors for Human Detection
Series Editor		Series Title	Lecture Notes in Computer Science	Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN	0302-9743	ISBN	978-3-642-21256-7	Medium
Area		Expedition		Conference	IbPRIA
Notes	ADAS			Approved	no
Call Number	ADAS @ adas @ RVL2011a			Serial	1666
Permanent link to this record



Author	Muhammad Anwer Rao; David Vazquez; Antonio Lopez
Title	Color Contribution to Part-Based Person Detection in Different Types of Scenarios			Type	Conference Article
Year	2011	Publication	14th International Conference on Computer Analysis of Images and Patterns	Abbreviated Journal
Volume	6855	Issue	II	Pages	463-470
Keywords	Pedestrian Detection; Color
Abstract	Camera-based person detection is of paramount interest due to its potential applications. The task is difficult because of the great variety of backgrounds (scenarios, illumination) in which persons are present, as well as their intra-class variability (pose, clothing, occlusion). In fact, the class person is one of those included in the popular PASCAL visual object classes (VOC) challenge. A breakthrough for this challenge, regarding person detection, is due to Felzenszwalb et al. These authors proposed a part-based detector that relies on histograms of oriented gradients (HOG) and latent support vector machines (LatSVM) to learn a model of the whole human body and its constitutive parts, as well as their relative position. Since the approach of Felzenszwalb et al. appeared, new variants have been proposed, usually giving rise to more complex models. In this paper, we focus on an issue that has not attracted sufficient interest up to now. In particular, we refer to the fact that HOG is usually computed from RGB color space, but other possibilities exist and deserve the corresponding investigation. In this paper we challenge RGB space with the opponent color space (OPP), which is inspired by the human visual system. We compute the HOG on top of OPP, then we train and test the part-based human classifier by Felzenszwalb et al. using PASCAL VOC challenge protocols and person database. Our experiments demonstrate that OPP outperforms RGB. We also investigate possible differences among types of scenarios: indoor, urban and countryside. Interestingly, our experiments suggest that the benefits of OPP with respect to RGB mainly come for indoor and countryside scenarios, those in which the human visual system was designed by evolution.
Address	Seville, Spain
Corporate Author				Thesis
Publisher	Springer	Place of Publication	Berlin Heidelberg	Editor	P. Real; D. Diaz; H. Molina; A. Berciano; W. Kropatsch
Language	English	Summary Language	English	Original Title	Color Contribution to Part-Based Person Detection in Different Types of Scenarios
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	0302-9743	ISBN	978-3-642-23677-8	Medium
Area		Expedition		Conference	CAIP
Notes	ADAS			Approved	no
Call Number	ADAS @ adas @ RVL2011b			Serial	1665
Permanent link to this record



Author	Muhammad Anwer Rao
Title	Color for Object Detection and Action Recognition			Type	Book Whole
Year	2013	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Recognizing object categories in real world images is a challenging problem in computer vision. The deformable part based framework is currently the most successful approach for object detection. Generally, HOG are used for image representation within the part-based framework. For action recognition, the bag-of-words framework has been shown to provide promising results. Within the bag-of-words framework, local image patches are described by the SIFT descriptor. Contrary to object detection and action recognition, combining color and shape has been shown to provide the best performance for object and scene recognition. In the first part of this thesis, we analyze the problem of person detection in still images. Standard person detection approaches rely on intensity based features for image representation while ignoring the color. Channel based descriptors are one of the most commonly used approaches in object recognition. This inspires us to evaluate incorporating color information using the channel based fusion approach for the task of person detection. In the second part of the thesis, we investigate the problem of object detection in still images. Due to high dimensionality, channel based fusion increases the computational cost. Moreover, channel based fusion has been found to obtain inferior results for object categories where one of the visual cues varies significantly. On the other hand, late fusion is known to provide improved results for a wide range of object categories. A consequence of the late fusion strategy is the need for a pure color descriptor. Therefore, we propose to use color attributes as an explicit color representation for object detection. Color attributes are compact and computationally efficient. Consequently, color attributes are combined with traditional shape features, providing excellent results for the object detection task. Finally, we focus on the problem of action detection and classification in still images.
We investigate the potential of color for action classification and detection in still images. We also evaluate different fusion approaches for combining color and shape information for action recognition. Additionally, an analysis is performed to validate the contribution of color for action recognition. Our results clearly demonstrate that combining color and shape information significantly improves the performance of both action classification and detection in still images.
Address	Barcelona
Corporate Author				Thesis	Ph.D. thesis
Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Antonio Lopez; Joost Van de Weijer
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS			Approved	no
Call Number	Admin @ si @ Rao2013			Serial	2281
Permanent link to this record



Author	Monica Piñol; Angel Sappa; Ricardo Toledo
Title	MultiTable Reinforcement for Visual Object Recognition			Type	Conference Article
Year	2012	Publication	4th International Conference on Signal and Image Processing	Abbreviated Journal
Volume	221	Issue		Pages	469-480
Keywords
Abstract	This paper presents a bag of feature based method for visual object recognition. Our contribution is focussed on the selection of the best feature descriptor. It is implemented by using a novel multi-table reinforcement learning method that selects among five classical descriptors (i.e., Spin, SIFT, SURF, C-SIFT and PHOW) the one that best describes each image. Experimental results and comparisons are provided showing the improvements achieved with the proposed approach.
Address	Coimbatore, India
Corporate Author				Thesis
Publisher	Springer India	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNEE
Series Volume		Series Issue		Edition
ISSN	1876-1100	ISBN	978-81-322-0996-6	Medium
Area		Expedition		Conference	ICSIP
Notes	ADAS			Approved	no
Call Number	Admin @ si @ PST2012			Serial	2157
Permanent link to this record



Author	Monica Piñol; Angel Sappa; Ricardo Toledo
Title	Adaptive Feature Descriptor Selection based on a Multi-Table Reinforcement Learning Strategy			Type	Journal Article
Year	2015	Publication	Neurocomputing	Abbreviated Journal	NEUCOM
Volume	150	Issue	A	Pages	106–115
Keywords	Reinforcement learning; Q-learning; Bag of features; Descriptors
Abstract	This paper presents and evaluates a framework to improve the performance of visual object classification methods, which are based on the usage of image feature descriptors as inputs. The goal of the proposed framework is to learn the best descriptor for each image in a given database. This goal is reached by means of a reinforcement learning process using the minimum information. The visual classification system used to demonstrate the proposed framework is based on a bag of features scheme, and the reinforcement learning technique is implemented through the Q-learning approach. The behavior of the reinforcement learning with different state definitions is evaluated. Additionally, a method that combines all these states is formulated in order to select the optimal state. Finally, the chosen actions are obtained from the best set of image descriptors in the literature: PHOW, SIFT, C-SIFT, SURF and Spin. Experimental results using two public databases (ETH and COIL) are provided showing both the validity of the proposed approach and comparisons with the state of the art. In all cases the best results are obtained with the proposed approach.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; 600.055; 600.076			Approved	no
Call Number	Admin @ si @ PST2015			Serial	2473
Permanent link to this record



Author	Monica Piñol; Angel Sappa; Angeles Lopez; Ricardo Toledo
Title	Feature Selection Based on Reinforcement Learning for Object Recognition			Type	Conference Article
Year	2012	Publication	Adaptive Learning Agents Workshop	Abbreviated Journal
Volume		Issue		Pages	33-39
Keywords
Abstract
Address	Valencia
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ALA
Notes	ADAS; RV			Approved	no
Call Number	Admin @ si @ PSL2012			Serial	2018
Permanent link to this record



Author	Monica Piñol
Title	Adaptative Vocabulary Tree for Image Classification using Reinforcement Learning			Type	Report
Year	2010	Publication	CVC Technical Report	Abbreviated Journal
Volume	162	Issue		Pages
Keywords
Abstract
Address	Bellaterra (Barcelona)
Corporate Author	Computer Vision Center			Thesis	Master's thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS			Approved	no
Call Number	Admin @ si @ Piñ2010			Serial	1936
Permanent link to this record



Author	Monica Piñol
Title	Reinforcement Learning of Visual Descriptors for Object Recognition			Type	Book Whole
Year	2014	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	The human visual system is able to recognize the object in an image even if the object is partially occluded, from various points of view, in different colors, or with independence of the distance to the object. To do this, the eye obtains an image and extracts features that are sent to the brain, and then, in the brain the object is recognized. In computer vision, the object recognition branch tries to learn from the human visual system behaviour to achieve its goal. Hence, an algorithm is used to identify representative features of the scene (detection), then another algorithm is used to describe these points (descriptor) and finally the extracted information is used for classifying the object in the scene. The selection of this set of algorithms is a very complicated task and thus, a very active research field. In this thesis we focus on the selection/learning of the best descriptor for a given image. In the state of the art there are several descriptors, but we do not know how to choose the best descriptor because it depends on the scenes that we will use (dataset) and the algorithm chosen to do the classification. We propose a framework based on reinforcement learning and bag of features to choose the best descriptor according to the given image. The system can analyse the behaviour of different learning algorithms and descriptor sets. Furthermore, the proposed framework for improving the classification/recognition ratio can be used with minor changes in other computer vision fields, such as video retrieval.
Address
Corporate Author				Thesis	Ph.D. thesis
Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Ricardo Toledo; Angel Sappa
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-84-940902-5-7	Medium
Area		Expedition		Conference
Notes	ADAS; 600.076			Approved	no
Call Number	Admin @ si @ Piñ2014			Serial	2464
Permanent link to this record



Author	Mohammed Al Rawi; Ernest Valveny; Dimosthenis Karatzas
Title	Can One Deep Learning Model Learn Script-Independent Multilingual Word-Spotting?			Type	Conference Article
Year	2019	Publication	15th International Conference on Document Analysis and Recognition	Abbreviated Journal
Volume		Issue		Pages	260-267
Keywords
Abstract	Word spotting has gained increased attention lately as it can be used to extract textual information from handwritten documents and scene-text images. Current word spotting approaches are designed to work on a single language and/or script. Building intelligent models that learn script-independent multilingual word-spotting is challenging due to the large variability of multilingual alphabets and symbols. We used ResNet-152 and the Pyramidal Histogram of Characters (PHOC) embedding to build a one-model script-independent multilingual word-spotting system and we tested it on Latin, Arabic, and Bangla (Indian) languages. The one-model we propose performs on par with the multi-model language-specific word-spotting system, and thus reduces the number of models needed for each script and/or language.
Address	Sydney; Australia; September 2019
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICDAR
Notes	DAG; 600.129; 600.121			Approved	no
Call Number	Admin @ si @ RVK2019			Serial	3337
Permanent link to this record



Author	Mohammed Al Rawi; Ernest Valveny
Title	Compact and Efficient Multitask Learning in Vision, Language and Speech			Type	Conference Article
Year	2019	Publication	IEEE International Conference on Computer Vision Workshops	Abbreviated Journal
Volume		Issue		Pages	2933-2942
Keywords
Abstract	Across-domain multitask learning is a challenging area of computer vision and machine learning due to the intra-similarities among class distributions. Addressing this problem to cope with the human cognition system by considering inter and intra-class categorization and recognition complicates the problem even further. We propose in this work an effective holistic and hierarchical learning by using a text embedding layer on top of a deep learning model. We also propose a novel sensory discriminator approach to resolve the collisions between different tasks and domains. We then train the model concurrently on textual sentiment analysis, speech recognition, image classification, action recognition from video, and handwriting word spotting of two different scripts (Arabic and English). The model we propose successfully learned different tasks across multiple domains.
Address	Seoul; Korea; October 2019
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICCVW
Notes	DAG; 600.121; 600.129			Approved	no
Call Number	Admin @ si @ RaV2019			Serial	3365
Permanent link to this record



Author	Mohammed Al Rawi; Dimosthenis Karatzas
Title	On the Labeling Correctness in Computer Vision Datasets			Type	Conference Article
Year	2018	Publication	Proceedings of the Workshop on Interactive Adaptive Learning, co-located with European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Image datasets have been heavily used to build computer vision systems. These datasets are either manually or automatically labeled, which is a problem as both labeling methods are prone to errors. To investigate this problem, we use a majority voting ensemble that combines the results from several Convolutional Neural Networks (CNNs). Majority voting ensembles not only enhance the overall performance, but can also be used to estimate the confidence level of each sample. We also examined Softmax as another form to estimate posterior probability. We have designed various experiments with a range of different ensembles built from one or different, or temporal/snapshot CNNs, which have been trained multiple times stochastically. We analyzed CIFAR10, CIFAR100, EMNIST, and SVHN datasets and we found quite a few incorrect labels, both in the training and testing sets. We also present detailed confidence analysis on these datasets and we found that the ensemble is better than the Softmax when used to estimate the per-sample confidence. This work thus proposes an approach that can be used to scrutinize and verify the labeling of computer vision datasets, which can later be applied to weakly/semi-supervised learning. We propose a measure, based on the Odds-Ratio, to quantify how many of these incorrectly classified labels are actually incorrectly labeled and how many of these are confusing. The proposed methods are easily scalable to larger datasets, like ImageNet, LSUN and SUN, as each CNN instance is trained for 60 epochs; or even faster, by implementing a temporal (snapshot) ensemble.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ECML-PKDDW
Notes	DAG; 600.121; 600.129			Approved	no
Call Number	Admin @ si @ RaK2018			Serial	3144
Permanent link to this record



Author	Mohammad Rouhani; E. Boyer; Angel Sappa
Title	Non-Rigid Registration meets Surface Reconstruction			Type	Conference Article
Year	2014	Publication	International Conference on 3D Vision	Abbreviated Journal
Volume		Issue		Pages	617-624
Keywords
Abstract	Non-rigid registration is an important task in computer vision with many applications in shape and motion modeling. A fundamental step of the registration is the data association between the source and the target sets. Such association proves difficult in practice, due to the discrete nature of the information and its corruption by various types of noise, e.g. outliers and missing data. In this paper we investigate the benefit of the implicit representations for the non-rigid registration of 3D point clouds. First, the target points are described with small quadratic patches that are blended through partition of unity weighting. Then, the discrete association between the source and the target can be replaced by a continuous distance field induced by the interface. By combining this distance field with a proper deformation term, the registration energy can be expressed in a linear least squares form that is easy and fast to solve. This significantly eases the registration by avoiding direct association between points. Moreover, a hierarchical approach can be easily implemented by employing coarse-to-fine representations. Experimental results are provided for point clouds from multi-view data sets. The qualitative and quantitative comparisons show the outperformance and robustness of our framework.
Address	Tokyo; Japan; December 2014
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	3DV
Notes	ADAS; 600.055; 600.076			Approved	no
Call Number	Admin @ si @ RBS2014			Serial	2534
Permanent link to this record



Author	Mohammad Rouhani; Angel Sappa; E. Boyer
Title	Implicit B-Spline Surface Reconstruction			Type	Journal Article
Year	2015	Publication	IEEE Transactions on Image Processing	Abbreviated Journal	TIP
Volume	24	Issue	1	Pages	22-32
Keywords
Abstract	This paper presents a fast and flexible curve and surface reconstruction technique based on implicit B-splines. This representation does not require any parameterization and it is locally supported. This fact has been exploited in this paper to propose a reconstruction technique through solving a sparse system of equations. This method is further accelerated to reduce the dimension to the active control lattice. Moreover, the surface smoothness and user interaction are allowed for controlling the surface. Finally, a novel weighting technique has been introduced in order to blend small patches and smooth them in the overlapping regions. The whole framework is very fast and efficient and can handle large point clouds with very low computational cost. The experimental results show the flexibility and accuracy of the proposed algorithm to describe objects with complex topologies. Comparisons with other fitting methods highlight the superiority of the proposed approach in the presence of noise and missing data.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1057-7149	ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; 600.076			Approved	no
Call Number	Admin @ si @ RSB2015			Serial	2541
Permanent link to this record