Records
Author (down) Yunchao Gong; Svetlana Lazebnik; Albert Gordo; Florent Perronnin
Title Iterative quantization: A procrustean approach to learning binary codes for Large-Scale Image Retrieval Type Journal Article
Year 2012 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI
Volume 35 Issue 12 Pages 2916-2929
Keywords
Abstract This paper addresses the problem of learning similarity-preserving binary codes for efficient similarity search in large-scale image collections. We formulate this problem in terms of finding a rotation of zero-centered data so as to minimize the quantization error of mapping this data to the vertices of a zero-centered binary hypercube, and propose a simple and efficient alternating minimization algorithm to accomplish this task. This algorithm, dubbed iterative quantization (ITQ), has connections to multi-class spectral clustering and to the orthogonal Procrustes problem, and it can be used both with unsupervised data embeddings such as PCA and supervised embeddings such as canonical correlation analysis (CCA). The resulting binary codes significantly outperform several other state-of-the-art methods. We also show that further performance improvements can result from transforming the data with a nonlinear kernel mapping prior to PCA or CCA. Finally, we demonstrate an application of ITQ to learning binary attributes or “classemes” on the ImageNet dataset.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0162-8828 ISBN 978-1-4577-0394-2 Medium
Area Expedition Conference
Notes DAG Approved no
Call Number Admin @ si @ GLG 2012b Serial 2008
 

 
Author (down) Yainuvis Socarras; David Vazquez; Antonio Lopez; David Geronimo; Theo Gevers
Title Improving HOG with Image Segmentation: Application to Human Detection Type Conference Article
Year 2012 Publication 11th International Conference on Advanced Concepts for Intelligent Vision Systems Abbreviated Journal
Volume 7517 Issue Pages 178-189
Keywords Segmentation; Pedestrian Detection
Abstract In this paper we improve the histogram of oriented gradients (HOG), a core descriptor of state-of-the-art object detection, by the use of higher-level information coming from image segmentation. The idea is to re-weight the descriptor while computing it without increasing its size. The benefits of the proposal are two-fold: (i) to improve the performance of the detector by enriching the descriptor information and (ii) to take advantage of the information of image segmentation, which in fact is likely to be used in other stages of the detection system such as candidate generation or refinement.
We test our technique on the INRIA person dataset, which was originally developed to test HOG, embedding it in a human detection system. The well-known mean-shift segmentation method (from smaller to larger super-pixels) and different methods to re-weight the original descriptor (constant, region-luminance, color or texture-dependent) have been evaluated. We achieve performance improvements of 4.47% in detection rate through the use of differences of color between contour pixel neighborhoods as re-weighting function.
Address Brno, Czech Republic
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor J. Blanc-Talon et al.
Language English Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-33139-8 Medium
Area Expedition Conference ACIVS
Notes ADAS;ISE Approved no
Call Number ADAS @ adas @ SLV2012 Serial 1980
 

 
Author (down) Xu Hu
Title Real-Time Part Based Models for Object Detection Type Report
Year 2012 Publication CVC Technical Report Abbreviated Journal
Volume 171 Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis Master's thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS;ISE Approved no
Call Number Admin @ si @ Hu2012 Serial 2415
 

 
Author (down) Xavier Perez Sala; Laura Igual; Sergio Escalera; Cecilio Angulo
Title Uniform Sampling of Rotations for Discrete and Continuous Learning of 2D Shape Models Type Book Chapter
Year 2012 Publication Vision Robotics: Technologies for Machine Learning and Vision Applications Abbreviated Journal
Volume Issue 2 Pages 23-42
Keywords
Abstract Different methodologies of uniform sampling over the rotation group, SO(3), for building unbiased 2D shape models from 3D objects are introduced and reviewed in this chapter. State-of-the-art non-uniform sampling approaches are discussed, and uniform sampling methods using Euler angles and quaternions are introduced. Moreover, since the presented work is oriented to model building applications, it is not limited to general discrete methods for obtaining uniform 3D rotations, but also considers the problem from a continuous point of view in the case of Procrustes Analysis.
Address
Corporate Author Thesis
Publisher IGI-Global Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB;HuPBA Approved no
Call Number Admin @ si @ PIE2012 Serial 2064
 

 
Author (down) Xavier Otazu; Olivier Penacchio; Laura Dempere-Marco
Title An investigation into plausible neural mechanisms related to the CIWaM computational model for brightness induction Type Conference Article
Year 2012 Publication 2nd Joint AVA / BMVA Meeting on Biological and Machine Vision Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Brightness induction is the modulation of the perceived intensity of an area by the luminance of surrounding areas. From a purely computational perspective, we built a low-level computational model (CIWaM) of early sensory processing based on multi-resolution wavelets with the aim of replicating brightness and colour (Otazu et al., 2010, Journal of Vision, 10(12):5) induction effects. Furthermore, we successfully used the CIWaM architecture to define a computational saliency model (Murray et al, 2011, CVPR, 433-440; Vanrell et al, submitted to AVA/BMVA'12). From a biological perspective, neurophysiological evidence suggests that perceived brightness information may be explicitly represented in V1. In this work we investigate possible neural mechanisms that offer a plausible explanation for such effects. To this end, we consider the model by Z.Li (Li, 1999, Network:Comput. Neural Syst., 10, 187-212) which is based on biological data and focuses on the part of V1 responsible for contextual influences, namely, layer 2-3 pyramidal cells, interneurons, and horizontal intracortical connections. This model has proven to account for phenomena such as visual saliency, which share with brightness induction the relevant effect of contextual influences (the ones modelled by CIWaM). In the proposed model, the input to the network is derived from a complete multiscale and multiorientation wavelet decomposition taken from the computational model (CIWaM).
This model successfully accounts for well known psychophysical effects (among them: the White's and modified White's effects, the Todorovic, Chevreul, achromatic ring patterns, and grating induction effects) for static contexts and also for brightness induction in dynamic contexts defined by modulating the luminance of surrounding areas. From a methodological point of view, we conclude that the results obtained by the computational model (CIWaM) are compatible with the ones obtained by the neurodynamical model proposed here.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference AVA
Notes CIC Approved no
Call Number Admin @ si @ OPD2012a Serial 2132
 

 
Author (down) Xavier Otazu; Olivier Penacchio; Laura Dempere-Marco
Title Brightness induction by contextual influences in V1: a neurodynamical account Type Abstract
Year 2012 Publication Journal of Vision Abbreviated Journal VSS
Volume 12 Issue 9 Pages
Keywords
Abstract Brightness induction is the modulation of the perceived intensity of an area by the luminance of surrounding areas and reveals fundamental properties of neural organization in the visual system. Several phenomenological models have been proposed that successfully account for psychophysical data (Pessoa et al. 1995, Blakeslee and McCourt 2004, Barkan et al. 2008, Otazu et al. 2008).
Neurophysiological evidence suggests that brightness information is explicitly represented in V1 and neuronal response modulations have been observed following luminance changes outside their receptive fields (Rossi and Paradiso, 1999).
In this work we investigate possible neural mechanisms that offer a plausible explanation for such effects. To this end, we consider the model by Z.Li (1999) which is based on biological data and focuses on the part of V1 responsible for contextual influences, namely, layer 2–3 pyramidal cells, interneurons, and horizontal intracortical connections. This model has proven to account for phenomena such as contour detection and preattentive segmentation, which share with brightness induction the relevant effect of contextual influences. In our model, the input to the network is derived from a complete multiscale and multiorientation wavelet decomposition which makes it possible to recover an image reflecting the perceived intensity. The proposed model successfully accounts for well known psychophysical effects (among them: the White's and modified White's effects, the Todorović, Chevreul, achromatic ring patterns, and grating induction effects). Our work suggests that intra-cortical interactions in the primary visual cortex could partially explain perceptual brightness induction effects and reveals how a common general architecture may account for several different fundamental processes emerging early in the visual pathway.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes CIC Approved no
Call Number Admin @ si @ OPD2012b Serial 2178
 

 
Author (down) Xavier Otazu
Title Perceptual tone-mapping operator based on multiresolution contrast decomposition Type Abstract
Year 2012 Publication Perception Abbreviated Journal PER
Volume 41 Issue Pages 86
Keywords
Abstract Tone-mapping operators (TMO) are used to display high dynamic range (HDR) images on low dynamic range (LDR) displays. Many computational and biologically inspired approaches have been used in the literature, many of them based on multiresolution decompositions. In this work, a simple two stage model for TMO is presented. The first stage is a novel multiresolution contrast decomposition, which is inspired by a pyramidal contrast decomposition (Peli, 1990 Journal of the Optical Society of America 7(10), 2032-2040).
This novel multiresolution decomposition represents the Michelson contrast of the image at different spatial scales. This multiresolution contrast representation, applied on the intensity channel of an opponent colour decomposition, is processed by a non-linear saturating model of V1 neurons (Albrecht et al, 2002 Journal of Neurophysiology 88(2) 888-913). This saturation model depends on the visual frequency, and it has been modified in order to include information from the extended Contrast Sensitivity Function (e-CSF) (Otazu et al, 2010 Journal of Vision 10(12) 5).
A set of HDR images in Radiance RGBE format (from CIS HDR Photographic Survey and Greg Ward database) have been used to test the model, obtaining a set of LDR images. The resulting LDR images do not show the usual halo or color modification artifacts.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0301-0066 ISBN Medium
Area Expedition Conference
Notes CIC Approved no
Call Number Admin @ si @ Ota2012 Serial 2179
 

 
Author (down) Xavier Boix; Josep M. Gonfaus; Joost Van de Weijer; Andrew Bagdanov; Joan Serrat; Jordi Gonzalez
Title Harmony Potentials: Fusing Global and Local Scale for Semantic Image Segmentation Type Journal Article
Year 2012 Publication International Journal of Computer Vision Abbreviated Journal IJCV
Volume 96 Issue 1 Pages 83-102
Keywords
Abstract The Hierarchical Conditional Random Field (HCRF) model has been successfully applied to a number of image labeling problems, including image segmentation. However, existing HCRF models of image segmentation do not allow multiple classes to be assigned to a single region, which limits their ability to incorporate contextual information across multiple scales.
At higher scales in the image, this representation yields an oversimplified model since multiple classes can be reasonably expected to appear within large regions. This simplified model particularly limits the impact of information at higher scales. Since class-label information at these scales is usually more reliable than at lower, noisier scales, neglecting this information is undesirable. To address these issues, we propose a new consistency potential for image labeling problems, which we call the harmony potential. It can encode any possible combination of labels, penalizing only unlikely combinations of classes. We also propose an effective sampling strategy over this expanded label set that renders tractable the underlying optimization problem. Our approach obtains state-of-the-art results on two challenging, standard benchmark datasets for semantic image segmentation: PASCAL VOC 2010, and MSRC-21.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0920-5691 ISBN Medium
Area Expedition Conference
Notes ISE;CIC;ADAS Approved no
Call Number Admin @ si @ BGW2012 Serial 1718
 

 
Author (down) Wenjuan Gong; Jordi Gonzalez; Xavier Roca
Title Human Action Recognition based on Estimated Weak Poses Type Journal Article
Year 2012 Publication EURASIP Journal on Advances in Signal Processing Abbreviated Journal EURASIPJ
Volume Issue Pages
Keywords
Abstract We present a novel method for human action recognition (HAR) based on estimated poses from image sequences. We use 3D human pose data as additional information and propose a compact human pose representation, called a weak pose, in a low-dimensional space while still keeping the most discriminative information for a given pose. With predicted poses from image features, we map the problem from image feature space to pose space, where a Bag of Poses (BOP) model is learned for the final goal of HAR. The BOP model is a modified version of the classical bag of words pipeline, building the vocabulary from the most representative weak poses for a given action. Compared with standard k-means clustering, our vocabulary selection criterion is proven to be more efficient and robust against the inherent challenges of action recognition. Moreover, since for action recognition the ordering of the poses is discriminative, the BOP model incorporates temporal information: in essence, groups of consecutive poses are considered together when computing the vocabulary and assignment. We tested our method on two well-known datasets, HumanEva and IXMAS, to demonstrate that weak poses help to improve action recognition accuracies. The proposed method is scene-independent and is comparable with state-of-the-art methods.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ISE Approved no
Call Number Admin @ si @ GGR2012 Serial 2003
 

 
Author (down) Wenjuan Gong; Jordi Gonzalez; Joao Manuel R. S. Tavares; Xavier Roca
Title A New Image Dataset on Human Interactions Type Conference Article
Year 2012 Publication 7th Conference on Articulated Motion and Deformable Objects Abbreviated Journal
Volume 7378 Issue Pages 204-209
Keywords
Abstract This article describes a new dataset of still images dedicated to interactions between people. Human action recognition from still images has been a hot topic recently, but most existing work addresses actions performed by a single person, such as running, walking, riding bikes, phoning and so on, with no interaction between people in an image. The dataset collected in this paper concentrates on human interaction between two people, aiming to explore this new topic in the research area of action recognition from still images.
Address Mallorca
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-31566-4 Medium
Area Expedition Conference AMDO
Notes ISE Approved no
Call Number Admin @ si @ GGT2012 Serial 2030
 

 
Author (down) Volkmar Frinken; Markus Baumgartner; Andreas Fischer; Horst Bunke
Title Semi-Supervised Learning for Cursive Handwriting Recognition using Keyword Spotting Type Conference Article
Year 2012 Publication 13th International Conference on Frontiers in Handwriting Recognition Abbreviated Journal
Volume Issue Pages 49-54
Keywords
Abstract State-of-the-art handwriting recognition systems are learning-based systems that require large sets of training data. The creation of training data, and consequently the creation of a well-performing recognition system, therefore requires a substantial amount of human work. This can be reduced with semi-supervised learning, which uses unlabeled text lines for training as well. Current approaches estimate the correct transcription of the unlabeled data via handwriting recognition, which is not only extremely demanding in computational cost but also requires a good model of the target language. In this paper, we propose a different approach that makes use of keyword spotting, which is significantly faster and does not need any language model. In a set of experiments we demonstrate its superiority over existing approaches.
Address Bari, Italy
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 10.1109/ICFHR.2012.268 ISBN 978-1-4673-2262-1 Medium
Area Expedition Conference ICFHR
Notes DAG Approved no
Call Number Admin @ si @ FBF2012 Serial 2055
 

 
Author (down) Volkmar Frinken; Francisco Zamora; Salvador España; Maria Jose Castro; Andreas Fischer; Horst Bunke
Title Long-Short Term Memory Neural Networks Language Modeling for Handwriting Recognition Type Conference Article
Year 2012 Publication 21st International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 701-704
Keywords
Abstract Unconstrained handwritten text recognition systems maximize the combination of two separate probability scores. The first one is the observation probability that indicates how well the returned word sequence matches the input image. The second score is the probability that reflects how likely a word sequence is according to a language model. Current state-of-the-art recognition systems use statistical language models in the form of bigram word probabilities. This paper proposes to model the target language by means of a recurrent neural network with long-short term memory cells. Because the network is recurrent, the considered context is not limited to a fixed size, especially as the memory cells are designed to deal with long-term dependencies. In a set of experiments conducted on the IAM off-line database we show the superiority of the proposed language model over statistical n-gram models.
Address Tsukuba Science City, Japan
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1051-4651 ISBN 978-1-4673-2216-4 Medium
Area Expedition Conference ICPR
Notes DAG Approved no
Call Number Admin @ si @ FZE2012 Serial 2052
 

 
Author (down) Volkmar Frinken; Alicia Fornes; Josep Llados; Jean-Marc Ogier
Title Bidirectional Language Model for Handwriting Recognition Type Conference Article
Year 2012 Publication Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshop Abbreviated Journal
Volume 7626 Issue Pages 611-619
Keywords
Abstract In order to improve the results of automatically recognized handwritten text, information about the language is commonly included in the recognition process. A common approach is to represent a text line as a sequence. It is processed in one direction and the language information via n-grams is directly included in the decoding. This approach, however, only uses context on one side to estimate a word’s probability. Therefore, we propose a bidirectional recognition in this paper, using distinct forward and backward language models. By combining decoding hypotheses from both directions, we achieve a significant increase in recognition accuracy for the off-line writer independent handwriting recognition task. Both language models are of the same type and can be estimated on the same corpus. Hence, the increase in recognition accuracy comes without any additional need for training data or language modeling complexity.
Address Japan
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-34165-6 Medium
Area Expedition Conference SSPR&SPR
Notes DAG Approved no
Call Number Admin @ si @ FFL2012 Serial 2057
 

 
Author (down) Theo Gevers; Arjan Gijsenij; Joost Van de Weijer; J.M. Geusebroek
Title Color in Computer Vision: Fundamentals and Applications Type Book Whole
Year 2012 Publication Color in Computer Vision: Fundamentals and Applications Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher The Wiley-IS&T Series in Imaging Science and Technology Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-0-470-89084-4 Medium
Area Expedition Conference
Notes ALTRES;ISE Approved no
Call Number Admin @ si @ GGG2012a Serial 2068
 

 
Author (down) Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades
Title Text/graphic separation using a sparse representation with multi-learned dictionaries Type Conference Article
Year 2012 Publication 21st International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages
Keywords Graphics Recognition; Layout Analysis; Document Understanding
Abstract In this paper, we propose a new approach to extract text regions from graphical documents. In our method, we first empirically construct two sequences of learned dictionaries for the text and graphical parts respectively. Then, we compute the sparse representations of non-overlapping document patches of different sizes in these learned dictionaries. Based on these representations, each patch can be classified into the text or graphic category by comparing its reconstruction errors. Same-sized patches in one category are then merged together to define the corresponding text or graphic layers, which are combined to create the final text/graphic layer. Finally, in a post-processing step, text regions are further filtered out by using some learned thresholds.
Address Tsukuba
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPR
Notes DAG Approved no
Call Number Admin @ si @ DTR2012a Serial 2135