Publicacions CVC -- Query Results

<< 1 2 3 4 5 6 7 8 9 10 >> [11–14]

Details

Records
Author	Yunchao Gong; Svetlana Lazebnik; Albert Gordo; Florent Perronnin
Title	Iterative quantization: A procrustean approach to learning binary codes for Large-Scale Image Retrieval			Type	Journal Article
Year	2012	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
Volume	35	Issue	12	Pages	2916-2929
Keywords
Abstract	This paper addresses the problem of learning similarity-preserving binary codes for efficient similarity search in large-scale image collections. We formulate this problem in terms of finding a rotation of zero-centered data so as to minimize the quantization error of mapping this data to the vertices of a zero-centered binary hypercube, and propose a simple and efficient alternating minimization algorithm to accomplish this task. This algorithm, dubbed iterative quantization (ITQ), has connections to multi-class spectral clustering and to the orthogonal Procrustes problem, and it can be used both with unsupervised data embeddings such as PCA and supervised embeddings such as canonical correlation analysis (CCA). The resulting binary codes significantly outperform several other state-of-the-art methods. We also show that further performance improvements can result from transforming the data with a nonlinear kernel mapping prior to PCA or CCA. Finally, we demonstrate an application of ITQ to learning binary attributes or “classemes” on the ImageNet dataset.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	0162-8828	ISBN	978-1-4577-0394-2	Medium
Area		Expedition		Conference
Notes	DAG			Approved	no
Call Number	Admin @ si @ GLG 2012b			Serial	2008
Permanent link to this record



Author	Yainuvis Socarras; David Vazquez; Antonio Lopez; David Geronimo; Theo Gevers
Title	Improving HOG with Image Segmentation: Application to Human Detection			Type	Conference Article
Year	2012	Publication	11th International Conference on Advanced Concepts for Intelligent Vision Systems	Abbreviated Journal
Volume	7517	Issue		Pages	178-189
Keywords	Segmentation; Pedestrian Detection
Abstract	In this paper we improve the histogram of oriented gradients (HOG), a core descriptor of state-of-the-art object detection, by the use of higher-level information coming from image segmentation. The idea is to re-weight the descriptor while computing it without increasing its size. The benefits of the proposal are two-fold: (i) to improve the performance of the detector by enriching the descriptor information and (ii) take advantage of the information of image segmentation, which in fact is likely to be used in other stages of the detection system such as candidate generation or refinement. We test our technique in the INRIA person dataset, which was originally developed to test HOG, embedding it in a human detection system. The well-known segmentation method, mean-shift (from smaller to larger super-pixels), and different methods to re-weight the original descriptor (constant, region-luminance, color or texture-dependent) has been evaluated. We achieve performance improvements of 4:47% in detection rate through the use of differences of color between contour pixel neighborhoods as re-weighting function.
Address	Brno, Czech Republic
Corporate Author				Thesis
Publisher	Springer Berlin Heidelberg	Place of Publication		Editor	J. Blanc-Talon et al.
Language	English	Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN	0302-9743	ISBN	978-3-642-33139-8	Medium
Area		Expedition		Conference	ACIVS
Notes	ADAS;ISE			Approved	no
Call Number	ADAS @ adas @ SLV2012			Serial	1980
Permanent link to this record



Author	Xu Hu
Title	Real-Time Part Based Models for Object Detection			Type	Report
Year	2012	Publication	CVC Technical Report	Abbreviated Journal
Volume	171	Issue		Pages
Keywords
Abstract
Address
Corporate Author				Thesis	Master's thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS;ISE			Approved	no
Call Number	Admin @ si @ Hu2012			Serial	2415
Permanent link to this record



Author	Xavier Perez Sala; Laura Igual; Sergio Escalera; Cecilio Angulo
Title	Uniform Sampling of Rotations for Discrete and Continuous Learning of 2D Shape Models			Type	Book Chapter
Year	2012	Publication	Vision Robotics: Technologies for Machine Learning and Vision Applications	Abbreviated Journal
Volume		Issue	2	Pages	23-42
Keywords
Abstract	Different methodologies of uniform sampling over the rotation group, SO(3), for building unbiased 2D shape models from 3D objects are introduced and reviewed in this chapter. State-of-the-art non uniform sampling approaches are discussed, and uniform sampling methods using Euler angles and quaternions are introduced. Moreover, since presented work is oriented to model building applications, it is not limited to general discrete methods to obtain uniform 3D rotations, but also from a continuous point of view in the case of Procrustes Analysis.
Address
Corporate Author				Thesis
Publisher	IGI-Global	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MILAB;HuPBA			Approved	no
Call Number	Admin @ si @ PIE2012			Serial	2064
Permanent link to this record



Author	Xavier Otazu; Olivier Penacchio; Laura Dempere-Marco
Title	An investigation into plausible neural mechanisms related to the the CIWaM computational model for brightness induction			Type	Conference Article
Year	2012	Publication	2nd Joint AVA / BMVA Meeting on Biological and Machine Vision	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Brightness induction is the modulation of the perceived intensity of an area by the luminance of surrounding areas. From a purely computational perspective, we built a low-level computational model (CIWaM) of early sensory processing based on multi-resolution wavelets with the aim of replicating brightness and colour (Otazu et al., 2010, Journal of Vision, 10(12):5) induction effects. Furthermore, we successfully used the CIWaM architecture to define a computational saliency model (Murray et al, 2011, CVPR, 433-440; Vanrell et al, submitted to AVA/BMVA'12). From a biological perspective, neurophysiological evidence suggests that perceived brightness information may be explicitly represented in V1. In this work we investigate possible neural mechanisms that offer a plausible explanation for such effects. To this end, we consider the model by Z.Li (Li, 1999, Network:Comput. Neural Syst., 10, 187-212) which is based on biological data and focuses on the part of V1 responsible for contextual influences, namely, layer 2-3 pyramidal cells, interneurons, and horizontal intracortical connections. This model has proven to account for phenomena such as visual saliency, which share with brightness induction the relevant effect of contextual influences (the ones modelled by CIWaM). In the proposed model, the input to the network is derived from a complete multiscale and multiorientation wavelet decomposition taken from the computational model (CIWaM). This model successfully accounts for well known pyschophysical effects (among them: the White's and modied White's effects, the Todorovic, Chevreul, achromatic ring patterns, and grating induction effects) for static contexts and also for brigthness induction in dynamic contexts defined by modulating the luminance of surrounding areas. From a methodological point of view, we conclude that the results obtained by the computational model (CIWaM) are compatible with the ones obtained by the neurodynamical model proposed here.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	AV A
Notes	CIC			Approved	no
Call Number	Admin @ si @ OPD2012a			Serial	2132
Permanent link to this record



Author	Xavier Otazu; Olivier Penacchio; Laura Dempere-Marco
Title	Brightness induction by contextual influences in V1: a neurodynamical account			Type	Abstract
Year	2012	Publication	Journal of Vision	Abbreviated Journal	VSS
Volume	12	Issue	9	Pages
Keywords
Abstract	Brightness induction is the modulation of the perceived intensity of an area by the luminance of surrounding areas and reveals fundamental properties of neural organization in the visual system. Several phenomenological models have been proposed that successfully account for psychophysical data (Pessoa et al. 1995, Blakeslee and McCourt 2004, Barkan et al. 2008, Otazu et al. 2008). Neurophysiological evidence suggests that brightness information is explicitly represented in V1 and neuronal response modulations have been observed followingluminance changes outside their receptive fields (Rossi and Paradiso, 1999). In this work we investigate possible neural mechanisms that offer a plausible explanation for such effects. To this end, we consider the model by Z.Li (1999) which is based on biological data and focuses on the part of V1 responsible for contextual influences, namely, layer 2–3 pyramidal cells, interneurons, and horizontal intracortical connections. This model has proven to account for phenomena such as contour detection and preattentive segmentation, which share with brightness induction the relevant effect of contextual influences. In our model, the input to the network is derived from a complete multiscale and multiorientation wavelet decomposition which makes it possible to recover an image reflecting the perceived intensity. The proposed model successfully accounts for well known pyschophysical effects (among them: the White's and modified White's effects, the Todorović, Chevreul, achromatic ring patterns, and grating induction effects). Our work suggests that intra-cortical interactions in the primary visual cortex could partially explain perceptual brightness induction effects and reveals how a common general architecture may account for several different fundamental processes emerging early in the visual pathway.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	CIC			Approved	no
Call Number	Admin @ si @ OPD2012b			Serial	2178
Permanent link to this record



Author	Xavier Otazu
Title	Perceptual tone-mapping operator based on multiresolution contrast decomposition			Type	Abstract
Year	2012	Publication	Perception	Abbreviated Journal	PER
Volume	41	Issue		Pages	86
Keywords
Abstract	Tone-mapping operators (TMO) are used to display high dynamic range(HDR) images in low dynamic range (LDR) displays. Many computational and biologically inspired approaches have been used in the literature, being many of them based on multiresolution decompositions. In this work, a simple two stage model for TMO is presented. The first stage is a novel multiresolution contrast decomposition, which is inspired in a pyramidal contrast decomposition (Peli, 1990 Journal of the Optical Society of America7(10), 2032-2040). This novel multiresolution decomposition represents the Michelson contrast of the image at different spatial scales. This multiresolution contrast representation, applied on the intensity channel of an opponent colour decomposition, is processed by a non-linear saturating model of V1 neurons (Albrecht et al, 2002 Journal ofNeurophysiology 88(2) 888-913). This saturation model depends on the visual frequency, and it has been modified in order to include information from the extended Contrast Sensitivity Function (e-CSF) (Otazu et al, 2010 Journal ofVision10(12) 5). A set of HDR images in Radiance RGBE format (from CIS HDR Photographic Survey and Greg Ward database) have been used to test the model, obtaining a set of LDR images. The resulting LDR images do not show the usual halo or color modification artifacts.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	0301-0066	ISBN		Medium
Area		Expedition		Conference
Notes	CIC			Approved	no
Call Number	Admin @ si @ Ota2012			Serial	2179
Permanent link to this record



Author	Xavier Boix; Josep M. Gonfaus; Joost Van de Weijer; Andrew Bagdanov; Joan Serrat; Jordi Gonzalez
Title	Harmony Potentials: Fusing Global and Local Scale for Semantic Image Segmentation			Type	Journal Article
Year	2012	Publication	International Journal of Computer Vision	Abbreviated Journal	IJCV
Volume	96	Issue	1	Pages	83-102
Keywords
Abstract	The Hierarchical Conditional Random Field(HCRF) model have been successfully applied to a number of image labeling problems, including image segmentation. However, existing HCRF models of image segmentation do not allow multiple classes to be assigned to a single region, which limits their ability to incorporate contextual information across multiple scales. At higher scales in the image, this representation yields an oversimplied model since multiple classes can be reasonably expected to appear within large regions. This simplied model particularly limits the impact of information at higher scales. Since class-label information at these scales is usually more reliable than at lower, noisier scales, neglecting this information is undesirable. To address these issues, we propose a new consistency potential for image labeling problems, which we call the harmony potential. It can encode any possible combi- nation of labels, penalizing only unlikely combinations of classes. We also propose an eective sampling strategy over this expanded label set that renders tractable the underlying optimization problem. Our approach obtains state-of-the-art results on two challenging, standard benchmark datasets for semantic image segmentation: PASCAL VOC 2010, and MSRC-21.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	0920-5691	ISBN		Medium
Area		Expedition		Conference
Notes	ISE;CIC;ADAS			Approved	no
Call Number	Admin @ si @ BGW2012			Serial	1718
Permanent link to this record



Author	Wenjuan Gong; Jordi Gonzalez; Xavier Roca
Title	Human Action Recognition based on Estimated Weak Poses			Type	Journal Article
Year	2012	Publication	EURASIP Journal on Advances in Signal Processing	Abbreviated Journal	EURASIPJ
Volume		Issue		Pages
Keywords
Abstract	We present a novel method for human action recognition (HAR) based on estimated poses from image sequences. We use 3D human pose data as additional information and propose a compact human pose representation, called a weak pose, in a low-dimensional space while still keeping the most discriminative information for a given pose. With predicted poses from image features, we map the problem from image feature space to pose space, where a Bag of Poses (BOP) model is learned for the final goal of HAR. The BOP model is a modified version of the classical bag of words pipeline by building the vocabulary based on the most representative weak poses for a given action. Compared with the standard k-means clustering, our vocabulary selection criteria is proven to be more efficient and robust against the inherent challenges of action recognition. Moreover, since for action recognition the ordering of the poses is discriminative, the BOP model incorporates temporal information: in essence, groups of consecutive poses are considered together when computing the vocabulary and assignment. We tested our method on two well-known datasets: HumanEva and IXMAS, to demonstrate that weak poses aid to improve action recognition accuracies. The proposed method is scene-independent and is comparable with the state-of-art method.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ISE			Approved	no
Call Number	Admin @ si @ GGR2012			Serial	2003
Permanent link to this record



Author	Wenjuan Gong; Jordi Gonzalez; Joao Manuel R. S. Taveres; Xavier Roca
Title	A New Image Dataset on Human Interactions			Type	Conference Article
Year	2012	Publication	7th Conference on Articulated Motion and Deformable Objects	Abbreviated Journal
Volume	7378	Issue		Pages	204-209
Keywords
Abstract	This article describes a new collection of still image dataset which are dedicated to interactions between people. Human action recognition from still images have been a hot topic recently, but most of them are actions performed by a single person, like running, walking, riding bikes, phoning and so on and there is no interactions between people in one image. The dataset collected in this paper are concentrating on human interaction between two people aiming to explore this new topic in the research area of action recognition from still images.
Address	Mallorca
Corporate Author				Thesis
Publisher	Springer Berlin Heidelberg	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	0302-9743	ISBN	978-3-642-31566-4	Medium
Area		Expedition		Conference	AMDO
Notes	ISE			Approved	no
Call Number	Admin @ si @ GGT2012			Serial	2030
Permanent link to this record



Author	Volkmar Frinken; Markus Baumgartner; Andreas Fischer; Horst Bunke
Title	Semi-Supervised Learning for Cursive Handwriting Recognition using Keyword Spotting			Type	Conference Article
Year	2012	Publication	13th International Conference on Frontiers in Handwriting Recognition	Abbreviated Journal
Volume		Issue		Pages	49-54
Keywords
Abstract	State-of-the-art handwriting recognition systems are learning-based systems that require large sets of training data. The creation of training data, and consequently the creation of a well-performing recognition system, requires therefore a substantial amount of human work. This can be reduced with semi-supervised learning, which uses unlabeled text lines for training as well. Current approaches estimate the correct transcription of the unlabeled data via handwriting recognition which is not only extremely demanding as far as computational costs are concerned but also requires a good model of the target language. In this paper, we propose a different approach that makes use of keyword spotting, which is significantly faster and does not need any language model. In a set of experiments we demonstrate its superiority over existing approaches.
Address	Bari, Italy
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	10.1109/ICFHR.2012.268	ISBN	978-1-4673-2262-1	Medium
Area		Expedition		Conference	ICFHR
Notes	DAG			Approved	no
Call Number	Admin @ si @ FBF2012			Serial	2055
Permanent link to this record



Author	Volkmar Frinken; Francisco Zamora; Salvador España; Maria Jose Castro; Andreas Fischer; Horst Bunke
Title	Long-Short Term Memory Neural Networks Language Modeling for Handwriting Recognition			Type	Conference Article
Year	2012	Publication	21st International Conference on Pattern Recognition	Abbreviated Journal
Volume		Issue		Pages	701-704
Keywords
Abstract	Unconstrained handwritten text recognition systems maximize the combination of two separate probability scores. The first one is the observation probability that indicates how well the returned word sequence matches the input image. The second score is the probability that reflects how likely a word sequence is according to a language model. Current state-of-the-art recognition systems use statistical language models in form of bigram word probabilities. This paper proposes to model the target language by means of a recurrent neural network with long-short term memory cells. Because the network is recurrent, the considered context is not limited to a fixed size especially as the memory cells are designed to deal with long-term dependencies. In a set of experiments conducted on the IAM off-line database we show the superiority of the proposed language model over statistical n-gram models.
Address	Tsukuba Science City, Japan
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1051-4651	ISBN	978-1-4673-2216-4	Medium
Area		Expedition		Conference	ICPR
Notes	DAG			Approved	no
Call Number	Admin @ si @ FZE2012			Serial	2052
Permanent link to this record



Author	Volkmar Frinken; Alicia Fornes; Josep Llados; Jean-Marc Ogier
Title	Bidirectional Language Model for Handwriting Recognition			Type	Conference Article
Year	2012	Publication	Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshop	Abbreviated Journal
Volume	7626	Issue		Pages	611-619
Keywords
Abstract	In order to improve the results of automatically recognized handwritten text, information about the language is commonly included in the recognition process. A common approach is to represent a text line as a sequence. It is processed in one direction and the language information via n-grams is directly included in the decoding. This approach, however, only uses context on one side to estimate a word’s probability. Therefore, we propose a bidirectional recognition in this paper, using distinct forward and a backward language models. By combining decoding hypotheses from both directions, we achieve a significant increase in recognition accuracy for the off-line writer independent handwriting recognition task. Both language models are of the same type and can be estimated on the same corpus. Hence, the increase in recognition accuracy comes without any additional need for training data or language modeling complexity.
Address	Japan
Corporate Author				Thesis
Publisher	Springer Berlin Heidelberg	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN	0302-9743	ISBN	978-3-642-34165-6	Medium
Area		Expedition		Conference	SSPR&SPR
Notes	DAG			Approved	no
Call Number	Admin @ si @ FFL2012			Serial	2057
Permanent link to this record



Author	Theo Gevers; Arjan Gijsenij; Joost Van de Weijer; J.M. Geusebroek
Title	Color in Computer Vision: Fundamentals and Applications			Type	Book Whole
Year	2012	Publication	Color in Computer Vision: Fundamentals and Applications	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher	The Wiley-IS&T Series in Imaging Science and Technology	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-0-470-89084-4	Medium
Area		Expedition		Conference
Notes	ALTRES;ISE			Approved	no
Call Number	Admin @ si @ GGG2012a			Serial	2068
Permanent link to this record



Author	Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades
Title	Text/graphic separation using a sparse representation with multi-learned dictionaries			Type	Conference Article
Year	2012	Publication	21st International Conference on Pattern Recognition	Abbreviated Journal
Volume		Issue		Pages
Keywords	Graphics Recognition; Layout Analysis; Document Understandin
Abstract	In this paper, we propose a new approach to extract text regions from graphical documents. In our method, we first empirically construct two sequences of learned dictionaries for the text and graphical parts respectively. Then, we compute the sparse representations of all different sizes and non-overlapped document patches in these learned dictionaries. Based on these representations, each patch can be classified into the text or graphic category by comparing its reconstruction errors. Same-sized patches in one category are then merged together to define the corresponding text or graphic layers which are combined to createfinal text/graphic layer. Finally, in a post-processing step, text regions are further filtered out by using some learned thresholds.
Address	Tsukuba
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICPR
Notes	DAG			Approved	no
Call Number	Admin @ si @ DTR2012a			Serial	2135
Permanent link to this record