Author Carola Figueroa Flores; Abel Gonzalez-Garcia; Joost Van de Weijer; Bogdan Raducanu
Title Saliency for fine-grained object recognition in domains with scarce training data Type Journal Article
Year 2019 Publication Pattern Recognition Abbreviated Journal PR
Volume 94 Issue Pages 62-73
Keywords
Abstract This paper investigates the role of saliency in improving the classification accuracy of a Convolutional Neural Network (CNN) when only scarce training data is available. Our approach consists of adding a saliency branch to an existing CNN architecture, which is used to modulate the standard bottom-up visual features from the original image input, acting as an attentional mechanism that guides the feature extraction process. The main aim of the proposed approach is to enable the effective training of a fine-grained recognition model with limited training samples and to improve the performance on the task, thereby alleviating the need to annotate a large dataset. The vast majority of saliency methods are evaluated on their ability to generate saliency maps, and not on their functionality in a complete vision pipeline. Our proposed pipeline allows us to evaluate saliency methods for the high-level task of object recognition. We perform extensive experiments on various fine-grained datasets (Flowers, Birds, Cars, and Dogs) under different conditions and show that saliency can considerably improve the network's performance, especially in the case of scarce training data. Furthermore, our experiments show that saliency methods that produce better saliency maps (as measured by traditional saliency benchmarks) also yield larger performance gains when applied in an object recognition pipeline.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.109; 600.141; 600.120 Approved no
Call Number Admin @ si @ FGW2019 Serial 3264
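The modulation mechanism described in the abstract above can be pictured with a short PyTorch sketch. This is a minimal illustration under stated assumptions: the ResNet-18 backbone, the layout of the saliency branch, and the (1 + saliency) multiplicative gating are illustrative choices, not the paper's exact design.

```python
# Hedged sketch (not the authors' exact code): a saliency branch whose output
# multiplicatively modulates the backbone feature maps, as described above.
import torch
import torch.nn as nn
import torchvision.models as models

class SaliencyModulatedNet(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        backbone = models.resnet18(weights=None)
        self.stem = nn.Sequential(*list(backbone.children())[:-2])  # conv features only
        # Small saliency branch: predicts a single-channel attention map (assumed layout).
        self.saliency = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=4, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, stride=8, padding=1), nn.Sigmoid(),
        )
        self.head = nn.Linear(512, num_classes)

    def forward(self, x):
        feats = self.stem(x)                                  # B x 512 x h x w
        sal = self.saliency(x)                                # B x 1 x h' x w'
        sal = nn.functional.interpolate(sal, size=feats.shape[-2:])
        feats = feats * (1.0 + sal)                           # modulate bottom-up features
        pooled = feats.mean(dim=(2, 3))
        return self.head(pooled)
```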
 

 
Author Carola Figueroa Flores; David Berga; Joost Van de Weijer; Bogdan Raducanu
Title Saliency for free: Saliency prediction as a side-effect of object recognition Type Journal Article
Year 2021 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 150 Issue Pages 1-7
Keywords Saliency maps; Unsupervised learning; Object recognition
Abstract Saliency is the perceptual capacity of our visual system to focus our attention (i.e. gaze) on relevant objects instead of the background. So far, computational methods for saliency estimation have required the explicit generation of ground-truth saliency maps, a process usually carried out via eye-tracking experiments on still images. This is a tedious process that needs to be repeated for each new dataset. In the current paper, we demonstrate that it is possible to automatically generate saliency maps without ground truth. In our approach, saliency maps are learned as a side effect of object recognition. Extensive experiments carried out on both real and synthetic datasets demonstrate that our approach is able to generate accurate saliency maps, achieving competitive results when compared with supervised methods.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.147; 600.120 Approved no
Call Number Admin @ si @ FBW2021 Serial 3559
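The idea of obtaining saliency as a by-product of recognition can be visualised with the hedged sketch below: a coarse saliency map is read out from the convolutional activations of an already trained classifier. The channel-mean aggregation is an assumption for illustration, not the paper's exact mechanism.

```python
# Hedged sketch: derive a coarse saliency map from a trained classifier's
# convolutional activations (channel-wise mean of the last feature map).
import torch
import torch.nn.functional as F
import torchvision.models as models

def free_saliency(image_batch):
    net = models.resnet18(weights="IMAGENET1K_V1").eval()
    stem = torch.nn.Sequential(*list(net.children())[:-2])
    with torch.no_grad():
        feats = stem(image_batch)                             # B x 512 x h x w
        sal = feats.abs().mean(dim=1, keepdim=True)           # B x 1 x h x w
        sal = F.interpolate(sal, size=image_batch.shape[-2:],
                            mode="bilinear", align_corners=False)
        sal = (sal - sal.amin()) / (sal.amax() - sal.amin() + 1e-8)  # normalise to [0, 1]
    return sal
```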
 

 
Author Aymen Azaza; Joost Van de Weijer; Ali Douik; Javad Zolfaghari Bengar; Marc Masana
Title Saliency from High-Level Semantic Image Features Type Journal
Year 2020 Publication SN Computer Science Abbreviated Journal SN
Volume 1 Issue 4 Pages 1-12
Keywords
Abstract Top-down semantic information is known to play an important role in assigning saliency. Recently, large strides have been made in improving state-of-the-art semantic image understanding in the fields of object detection and semantic segmentation. Therefore, since these methods have now reached a high level of maturity, evaluation of the impact of high-level image understanding on saliency estimation is now feasible. We propose several saliency features which are computed from object detection and semantic segmentation results. We combine these features with a standard baseline method for saliency detection to evaluate their importance. Experiments demonstrate that the proposed features derived from object detection and semantic segmentation improve saliency estimation significantly. Moreover, they show that our method obtains state-of-the-art results on three datasets (FT, ImgSal, and SOD) and competitive results on four other datasets (ECSSD, PASCAL-S, MSRA-B, and HKU-IS).
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.120; 600.109; 600.106 Approved no
Call Number Admin @ si @ AWD2020 Serial 3503
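A minimal sketch of how object-detection output can be turned into a high-level saliency cue and fused with a baseline saliency map, in the spirit of the record above. The box-filling scheme and the fusion weight alpha are assumptions of this sketch.

```python
# Hedged sketch: detection boxes become a "high-level" saliency cue, fused with
# a baseline bottom-up saliency map by a weighted sum.
import numpy as np

def detection_saliency(boxes, scores, image_shape):
    """boxes: N x 4 array of (x1, y1, x2, y2); scores: N detection confidences."""
    sal = np.zeros(image_shape[:2], dtype=np.float32)
    for (x1, y1, x2, y2), s in zip(boxes.astype(int), scores):
        sal[y1:y2, x1:x2] = np.maximum(sal[y1:y2, x1:x2], s)
    return sal

def fuse(baseline_sal, det_sal, alpha=0.5):
    fused = alpha * baseline_sal + (1 - alpha) * det_sal
    return fused / (fused.max() + 1e-8)
```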
 

 
Author Eduard Vazquez; Theo Gevers; M. Lucassen; Joost Van de Weijer; Ramon Baldrich
Title Saliency of Color Image Derivatives: A Comparison between Computational Models and Human Perception Type Journal Article
Year 2010 Publication Journal of the Optical Society of America A Abbreviated Journal JOSA A
Volume 27 Issue 3 Pages 613–621
Keywords
Abstract In this paper, computational methods are proposed to compute color edge saliency based on the information content of color edges. The computational methods are evaluated on bottom-up saliency in a psychophysical experiment, and on the more complex task of salient object detection in real-world images. The psychophysical experiment demonstrates the relevance of using information theory as a saliency processing model and shows that the proposed methods are significantly better at predicting color saliency (with a human-method correspondence of up to 74.75% and an observer agreement of 86.8%) than state-of-the-art models. Furthermore, results from salient object detection confirm that an early fusion of color and contrast provides accurate visual saliency estimates, with a hit rate of up to 95.2%.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ISE;CIC Approved no
Call Number CAT @ cat @ VGL2010 Serial 1275
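The information-theoretic view in the abstract above (rare color derivatives are salient) can be sketched as follows: each pixel is scored by the self-information of its per-channel derivative magnitudes under a coarse histogram estimate. The bin count and the choice of derivative statistic are assumptions of this sketch.

```python
# Hedged sketch of information-theoretic color edge saliency: low-probability
# color derivatives get a high -log p score.
import numpy as np

def color_edge_saliency(image, bins=16):
    """image: H x W x 3 float array in [0, 1]."""
    gy, gx = np.gradient(image, axis=(0, 1))
    mag = np.sqrt(gx ** 2 + gy ** 2)                       # per-channel edge strength
    flat = mag.reshape(-1, 3)
    hist, edges = np.histogramdd(flat, bins=bins)          # coarse density estimate
    p = hist / hist.sum()
    idx = tuple(np.clip(np.digitize(flat[:, c], edges[c][1:-1]), 0, bins - 1)
                for c in range(3))
    saliency = -np.log(p[idx] + 1e-12).reshape(image.shape[:2])
    return saliency
```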
 

 
Author Iiris Lusi; Sergio Escalera; Gholamreza Anbarjafari
Title SASE: RGB-Depth Database for Human Head Pose Estimation Type Conference Article
Year 2016 Publication 14th European Conference on Computer Vision Workshops Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Amsterdam; The Netherlands; October 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCVW
Notes HuPBA;MILAB; Approved no
Call Number Admin @ si @ LEA2016a Serial 2840
 

 
Author A. Pujol; A.F. Sole; Daniel Ponsa; Javier Varona; Juan J. Villanueva
Title Satellite Image Segmentation Through Rotational Invariant Feature Eigenvector Projection. Type Miscellaneous
Year 1999 Publication Machine Vision and Advanced Image Processing in Remote Sensing, Springer, 317–327. Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS Approved no
Call Number ADAS @ adas @ PSP1999 Serial 36
 

 
Author H. Chouaib; Salvatore Tabbone; Oriol Ramos Terrades; F. Cloppet; N. Vincent; Thierry Paquet
Title Sélection de Caractéristiques à partir d'un algorithme génétique et d'une combinaison de classifieurs Adaboost [Feature selection using a genetic algorithm and a combination of AdaBoost classifiers] Type Conference Article
Year 2008 Publication Colloque International Francophone sur l'Ecrit et le Document Abbreviated Journal
Volume Issue Pages 181-186
Keywords
Abstract
Address Rouen, France
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CIFED
Notes DAG Approved no
Call Number Admin @ si @ CTR2008 Serial 1874
 

 
Author Fahad Shahbaz Khan; Joost Van de Weijer; Muhammad Anwer Rao; Andrew Bagdanov; Michael Felsberg; Jorma Laaksonen
Title Scale coding bag of deep features for human attribute and action recognition Type Journal Article
Year 2018 Publication Machine Vision and Applications Abbreviated Journal MVAP
Volume 29 Issue 1 Pages 55-71
Keywords Action recognition; Attribute recognition; Bag of deep features
Abstract Most approaches to human attribute and action recognition in still images are based on image representations in which multi-scale local features are pooled across scale into a single, scale-invariant encoding. In both bag-of-words and the recently popular representations based on convolutional neural networks, local features are computed at multiple scales. However, these multi-scale convolutional features are pooled into a single scale-invariant representation. We argue that entirely scale-invariant image representations are sub-optimal and investigate approaches to scale coding within a bag of deep features framework. Our approach encodes multi-scale information explicitly during the image encoding stage. We propose two strategies to encode multi-scale information explicitly in the final image representation. We validate our two scale coding techniques on five datasets: Willow, PASCAL VOC 2010, PASCAL VOC 2012, Stanford-40 and Human Attributes (HAT-27). On all datasets, the proposed scale coding approaches outperform both the scale-invariant method and the standard deep features of the same network. Further, combining our scale coding approaches with standard deep features leads to consistent improvement over the state of the art.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.068; 600.079; 600.106; 600.120 Approved no
Call Number Admin @ si @ KWR2018 Serial 3107
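A hedged sketch of the scale-coding idea from the record above: convolutional features are extracted at several input scales, pooled separately per scale group (small / medium / large), and the group descriptors are concatenated instead of being collapsed into one scale-invariant pooling. The concrete scale groups and the ResNet-18 backbone are assumptions.

```python
# Hedged sketch of scale coding with deep features: one pooled descriptor per
# scale group, concatenated into the final image representation.
import torch
import torchvision.models as models

SCALE_GROUPS = {"small": [96, 128], "medium": [192, 256], "large": [384, 512]}

def scale_coded_descriptor(image):                         # image: 1 x 3 x H x W
    net = torch.nn.Sequential(
        *list(models.resnet18(weights="IMAGENET1K_V1").children())[:-2]).eval()
    parts = []
    with torch.no_grad():
        for sizes in SCALE_GROUPS.values():
            group = [net(torch.nn.functional.interpolate(
                         image, size=(s, s), mode="bilinear",
                         align_corners=False)).mean(dim=(2, 3))
                     for s in sizes]
            parts.append(torch.stack(group).mean(dim=0))   # pool within the scale group
    return torch.cat(parts, dim=1)                         # 1 x (3 * 512) descriptor
```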
 

 
Author Fahad Shahbaz Khan; Joost Van de Weijer; Andrew Bagdanov; Michael Felsberg
Title Scale Coding Bag-of-Words for Action Recognition Type Conference Article
Year 2014 Publication 22nd International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 1514-1519
Keywords
Abstract Recognizing human actions in still images is a challenging problem in computer vision due to the significant amount of scale, illumination and pose variation. Given the bounding box of a person both at training and test time, the task is to classify the action associated with each bounding box in an image. Most state-of-the-art methods use the bag-of-words paradigm for action recognition. The bag-of-words framework employing a dense multi-scale grid sampling strategy is the de facto standard for feature detection. This results in a scale-invariant image representation where all the features at multiple scales are binned in a single histogram. We argue that such a scale-invariant strategy is sub-optimal since it ignores the multi-scale information available with each bounding box of a person. This paper investigates alternative approaches to scale coding for action recognition in still images. We encode multi-scale information explicitly in three different histograms for small, medium and large scale visual words. Our first approach exploits multi-scale information with respect to the image size. In our second approach, we encode multi-scale information relative to the size of the bounding box of a person instance. In each approach, the multi-scale histograms are then concatenated into a single representation for action classification. We validate our approaches on the Willow dataset, which contains seven action categories: interacting with computer, photography, playing music, riding bike, riding horse, running and walking. Our results clearly suggest that the proposed scale coding approaches outperform the conventional scale-invariant technique. Moreover, we show that our approach obtains promising results compared to more complex state-of-the-art methods.
Address Stockholm; August 2014
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPR
Notes CIC; LAMP; 601.240; 600.074; 600.079 Approved no
Call Number Admin @ si @ KWB2014 Serial 2450
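The bounding-box-relative variant described above can be sketched as follows: local descriptors are hard-assigned to visual words and binned into small / medium / large histograms according to their patch scale relative to the person box, and the three histograms are concatenated. The thresholds and the hard assignment are illustrative assumptions.

```python
# Hedged sketch of scale-coded bag-of-words: three per-scale histograms instead
# of a single scale-invariant one.
import numpy as np

def scale_coded_bow(descriptors, scales, bbox_height, codebook,
                    t_small=0.05, t_large=0.15):
    """descriptors: N x D local features; scales: N patch sizes in pixels."""
    k = codebook.shape[0]
    hist = np.zeros((3, k))
    # Hard-assign each descriptor to its nearest visual word.
    words = np.argmin(((descriptors[:, None, :] - codebook[None]) ** 2).sum(-1), axis=1)
    rel = scales / float(bbox_height)                 # scale relative to the person box
    groups = np.digitize(rel, [t_small, t_large])     # 0 = small, 1 = medium, 2 = large
    for g, w in zip(groups, words):
        hist[g, w] += 1
    hist /= max(hist.sum(), 1)
    return hist.reshape(-1)                           # concatenated 3*k representation
```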
 

 
Author Yi Xiao; Felipe Codevilla; Diego Porres; Antonio Lopez
Title Scaling Vision-Based End-to-End Autonomous Driving with Multi-View Attention Learning Type Conference Article
Year 2023 Publication International Conference on Intelligent Robots and Systems Abbreviated Journal
Volume Issue Pages
Keywords
Abstract On end-to-end driving, human driving demonstrations are used to train perception-based driving models by imitation learning. This process is supervised on vehicle signals (e.g., steering angle, acceleration) but does not require extra costly supervision (human labeling of sensor data). As a representative of such vision-based end-to-end driving models, CILRS is commonly used as a baseline to compare with new driving models. So far, some latest models achieve better performance than CILRS by using expensive sensor suites and/or by using large amounts of human-labeled data for training. Given the difference in performance, one may think that it is not worth pursuing vision-based pure end-to-end driving. However, we argue that this approach still has great value and potential considering cost and maintenance. In this paper, we present CIL++, which improves on CILRS by both processing higher-resolution images using a human-inspired HFOV as an inductive bias and incorporating a proper attention mechanism. CIL++ achieves competitive performance compared to models which are more costly to develop. We propose to replace CILRS with CIL++ as a strong vision-based pure end-to-end driving baseline supervised by only vehicle signals and trained by conditional imitation learning.
Address Detroit; USA; October 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference IROS
Notes ADAS Approved no
Call Number Admin @ si @ XCP2023 Serial 3930
 

 
Author Miguel Oliveira; Victor Santos; Angel Sappa; P. Dias
Title Scene Representations for Autonomous Driving: an approach based on polygonal primitives Type Conference Article
Year 2015 Publication 2nd Iberian Robotics Conference ROBOT2015 Abbreviated Journal
Volume 417 Issue Pages 503-515
Keywords Scene reconstruction; Point cloud; Autonomous vehicles
Abstract In this paper, we present a novel methodology to compute a 3D scene representation. The algorithm uses macro-scale polygonal primitives to model the scene. This means that the representation of the scene is given as a list of large-scale polygons that describe the geometric structure of the environment. Results show that the approach is capable of producing accurate descriptions of the scene. In addition, the algorithm is very efficient when compared to other techniques.
Address Lisboa; Portugal; November 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ROBOT
Notes ADAS; 600.076; 600.086 Approved no
Call Number Admin @ si @ OSS2015a Serial 2662
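One common way to obtain macro-scale polygonal primitives like those described in the record above is to fit dominant planes in the point cloud (for example with RANSAC) and take the convex hull of each plane's inliers as a polygon. The sketch below follows that generic recipe and is an assumption about the pipeline, not the authors' algorithm.

```python
# Hedged sketch: extract one macro-scale polygonal primitive from a point cloud
# by RANSAC plane fitting plus a 2D convex hull of the inliers.
import numpy as np
from scipy.spatial import ConvexHull

def largest_plane_polygon(points, n_iters=200, thresh=0.05,
                          rng=np.random.default_rng(0)):
    best_inliers = np.array([], dtype=int)
    for _ in range(n_iters):
        p = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p[1] - p[0], p[2] - p[0])
        norm = np.linalg.norm(n)
        if norm < 1e-9:
            continue
        n = n / norm
        dist = np.abs((points - p[0]) @ n)
        inliers = np.where(dist < thresh)[0]
        if len(inliers) > len(best_inliers):
            best_inliers, best_n, best_p0 = inliers, n, p[0]
    # Project the inliers onto the plane and take their convex hull as the polygon.
    pts = points[best_inliers] - best_p0
    u = np.cross(best_n, [1.0, 0.0, 0.0])
    if np.linalg.norm(u) < 1e-6:
        u = np.cross(best_n, [0.0, 1.0, 0.0])
    u /= np.linalg.norm(u)
    v = np.cross(best_n, u)
    uv = np.stack([pts @ u, pts @ v], axis=1)
    hull = ConvexHull(uv)
    return best_p0 + uv[hull.vertices] @ np.stack([u, v])   # polygon vertices in 3D
```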
 

 
Author Lluis Gomez; Dimosthenis Karatzas
Title Scene Text Recognition: No Country for Old Men? Type Conference Article
Year 2014 Publication 1st International Workshop on Robust Reading Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference IWRR
Notes DAG; 600.077 Approved no
Call Number Admin @ si @ GoK2014c Serial 2538
 

 
Author Ali Furkan Biten; Ruben Tito; Andres Mafla; Lluis Gomez; Marçal Rusiñol; C.V. Jawahar; Ernest Valveny; Dimosthenis Karatzas
Title Scene Text Visual Question Answering Type Conference Article
Year 2019 Publication 18th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 4291-4301
Keywords
Abstract Current visual question answering datasets do not consider the rich semantic information conveyed by text within an image. In this work, we present a new dataset, ST-VQA, that aims to highlight the importance of exploiting high-level semantic information present in images as textual cues in the Visual Question Answering process. We use this dataset to define a series of tasks of increasing difficulty for which reading the scene text in the context provided by the visual information is necessary to reason and generate an appropriate answer. We propose a new evaluation metric for these tasks to account for both reasoning errors and shortcomings of the text recognition module. In addition we put forward a series of baseline methods, which provide further insight into the newly released dataset, and set the scene for further research.
Address Seoul; Korea; October 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCV
Notes DAG; 600.129; 600.135; 601.338; 600.121 Approved no
Call Number Admin @ si @ BTM2019b Serial 3285
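The evaluation metric mentioned in the abstract above, which credits answers softly while penalising text-recognition errors, is usually implemented as an Average Normalized Levenshtein Similarity (ANLS) score. A minimal sketch follows; the 0.5 threshold and the lower-casing are assumptions of this sketch rather than details confirmed by the record.

```python
# Hedged sketch of an ANLS-style score: answers are credited by normalized
# Levenshtein similarity to the closest ground-truth answer, and similarities
# below the threshold count as zero.
def edit_distance(a, b):
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def anls(predictions, ground_truths, threshold=0.5):
    """predictions: list of strings; ground_truths: list of lists of strings."""
    scores = []
    for pred, answers in zip(predictions, ground_truths):
        sims = [1 - edit_distance(pred.lower(), a.lower()) / max(len(pred), len(a), 1)
                for a in answers]
        best = max(sims)
        scores.append(best if best >= threshold else 0.0)
    return sum(scores) / len(scores)
```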
 

 
Author Jose Antonio Rodriguez; Florent Perronnin
Title Score Normalization for Hmm-based Word Spotting Using Universal Background Model Type Conference Article
Year 2008 Publication International Conference on Frontiers in Handwriting Recognition Abbreviated Journal
Volume Issue Pages 82–87
Keywords
Abstract
Address Montreal (Canada)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICFHR
Notes Approved no
Call Number Admin @ si @ RoP2008c Serial 1067
 

 
Author Sounak Dey; Palaiahnakote Shivakumara; K.S. Raghunanda; Umapada Pal; Tong Lu; G. Hemantha Kumar; Chee Seng Chan
Title Script independent approach for multi-oriented text detection in scene image Type Journal Article
Year 2017 Publication Neurocomputing Abbreviated Journal NEUCOM
Volume 242 Issue Pages 96-112
Keywords
Abstract Developing a text detection method that is invariant to scripts in natural scene images is a challenging task due to the different geometrical structures of various scripts. Besides, the multiple orientations of text lines in natural scene images make the problem more challenging. This paper proposes to explore the ring radius transform (RRT) for text detection in multi-oriented and multi-script environments. The method finds component regions based on their convex hulls and generates radius matrices using the RRT. The RRT provides low radius values for pixels that are near edges, constant radius values for pixels that represent the stroke width, and high radius values for the holes created in the background and the convex hull by the regular structures of text components. We apply k-means clustering on the radius matrices to group such spatially coherent regions into individual clusters. The proposed method then studies the radius values of the cluster components that are close to and far from the centroid to detect text components. Furthermore, we have developed a Bangla dataset (named the ISI-UM dataset) and propose a semi-automatic system for generating its ground truth for text detection of arbitrary orientations, which can be used by researchers for text detection and recognition in the future. The ground truth will be released to the public. Experimental results on our ISI-UM data and other standard datasets, namely, the ICDAR 2013 scene, SVT and MSRA datasets, show that the proposed method outperforms the existing methods in terms of multi-lingual and multi-oriented text detection ability.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ DSR2017 Serial 3260
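A simplified, hedged sketch of the pipeline in the record above: the ring radius transform is approximated here by a distance transform from the edge map, the radius values are grouped by k-means, and the cluster of thin, stroke-like regions is kept as candidate text. This stand-in is an assumption for illustration, not the authors' exact RRT formulation.

```python
# Hedged sketch: distance-transform "radius" values clustered by k-means to
# isolate stroke-like candidate text regions.
import numpy as np
import cv2
from sklearn.cluster import KMeans

def candidate_text_mask(gray, k=3):
    """gray: H x W uint8 image; returns a boolean mask of candidate text pixels."""
    edges = cv2.Canny(gray, 100, 200)
    # Distance to the nearest edge pixel plays the role of the radius value.
    radius = cv2.distanceTransform(255 - edges, cv2.DIST_L2, 3)
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(
        radius.reshape(-1, 1)).reshape(radius.shape)
    # Keep the cluster with the smallest mean radius (thin, stroke-like regions).
    means = [radius[labels == c].mean() for c in range(k)]
    return labels == int(np.argmin(means))
```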