toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author Jose Antonio Rodriguez; Florent Perronnin edit  doi
openurl 
  Title Handwritten word-spotting using hidden Markov models and universal vocabularies Type Journal Article
  Year 2009 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 42 Issue 9 Pages 2103-2116  
  Keywords (down) Word-spotting; Hidden Markov model; Score normalization; Universal vocabulary; Handwriting recognition  
  Abstract Handwritten word-spotting is traditionally viewed as an image matching task between one or multiple query word-images and a set of candidate word-images in a database. This is a typical instance of the query-by-example paradigm. In this article, we introduce a statistical framework for the word-spotting problem which employs hidden Markov models (HMMs) to model keywords and a Gaussian mixture model (GMM) for score normalization. We explore the use of two types of HMMs for the word modeling part: continuous HMMs (C-HMMs) and semi-continuous HMMs (SC-HMMs), i.e. HMMs with a shared set of Gaussians. We show on a challenging multi-writer corpus that the proposed statistical framework is always superior to a traditional matching system which uses dynamic time warping (DTW) for word-image distance computation. A very important finding is that the SC-HMM is superior when labeled training data is scarce—as low as one sample per keyword—thanks to the prior information which can be incorporated in the shared set of Gaussians.  
  Address  
  Corporate Author Thesis  
  Publisher Elsevier Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 0031-3203 ISBN Medium  
  Area Expedition Conference  
  Notes Approved no  
  Call Number Admin @ si @ RoP2009 Serial 1053  
Permanent link to this record
 

 
Author Jose Antonio Rodriguez; Florent Perronnin; Gemma Sanchez; Josep Llados edit  url
doi  openurl
  Title Unsupervised writer adaptation of whole-word HMMs with application to word-spotting Type Journal Article
  Year 2010 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 31 Issue 8 Pages 742–749  
  Keywords (down) Word-spotting; Handwriting recognition; Writer adaptation; Hidden Markov model; Document analysis  
  Abstract In this paper we propose a novel approach for writer adaptation in a handwritten word-spotting task. The method exploits the fact that the semi-continuous hidden Markov model separates the word model parameters into (i) a codebook of shapes and (ii) a set of word-specific parameters.

Our main contribution is to employ this property to derive writer-specific word models by statistically adapting an initial universal codebook to each document. This process is unsupervised and does not even require the appearance of the keyword(s) in the searched document. Experimental results show an increase in performance when this adaptation technique is applied. To the best of our knowledge, this is the first work dealing with adaptation for word-spotting. The preliminary version of this paper obtained an IBM Best Student Paper Award at the 19th International Conference on Pattern Recognition.
 
  Address  
  Corporate Author Thesis  
  Publisher Elsevier Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number DAG @ dag @ RPS2010 Serial 1290  
Permanent link to this record
 

 
Author Jon Almazan; Albert Gordo; Alicia Fornes; Ernest Valveny edit  doi
openurl 
  Title Segmentation-free Word Spotting with Exemplar SVMs Type Journal Article
  Year 2014 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 47 Issue 12 Pages 3967–3978  
  Keywords (down) Word spotting; Segmentation-free; Unsupervised learning; Reranking; Query expansion; Compression  
  Abstract In this paper we propose an unsupervised segmentation-free method for word spotting in document images. Documents are represented with a grid of HOG descriptors, and a sliding-window approach is used to locate the document regions that are most similar to the query. We use the Exemplar SVM framework to produce a better representation of the query in an unsupervised way. Then, we use a more discriminative representation based on Fisher Vector to rerank the best regions retrieved, and the most promising ones are used to expand the Exemplar SVM training set and improve the query representation. Finally, the document descriptors are precomputed and compressed with Product Quantization. This offers two advantages: first, a large number of documents can be kept in RAM memory at the same time. Second, the sliding window becomes significantly faster since distances between quantized HOG descriptors can be precomputed. Our results significantly outperform other segmentation-free methods in the literature, both in accuracy and in speed and memory usage.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.045; 600.056; 600.061; 602.006; 600.077 Approved no  
  Call Number Admin @ si @ AGF2014b Serial 2485  
Permanent link to this record
 

 
Author Santiago Segui; Michal Drozdzal; Ekaterina Zaytseva; Fernando Azpiroz; Petia Radeva; Jordi Vitria edit   pdf
doi  openurl
  Title Detection of wrinkle frames in endoluminal videos using betweenness centrality measures for images Type Journal Article
  Year 2014 Publication IEEE Transactions on Information Technology in Biomedicine Abbreviated Journal TITB  
  Volume 18 Issue 6 Pages 1831-1838  
  Keywords (down) Wireless Capsule Endoscopy; Small Bowel Motility Dysfunction; Contraction Detection; Structured Prediction; Betweenness Centrality  
  Abstract Intestinal contractions are one of the most important events to diagnose motility pathologies of the small intestine. When visualized by wireless capsule endoscopy (WCE), the sequence of frames that represents a contraction is characterized by a clear wrinkle structure in the central frames that corresponds to the folding of the intestinal wall. In this paper we present a new method to robustly detect wrinkle frames in full WCE videos by using a new mid-level image descriptor that is based on a centrality measure proposed for graphs. We present an extended validation, carried out in a very large database, that shows that the proposed method achieves state of the art performance for this task.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes OR; MILAB; 600.046;MV Approved no  
  Call Number Admin @ si @ SDZ2014 Serial 2385  
Permanent link to this record
 

 
Author Santiago Segui; Michal Drozdzal; Guillem Pascual; Petia Radeva; Carolina Malagelada; Fernando Azpiroz; Jordi Vitria edit   pdf
url  openurl
  Title Generic Feature Learning for Wireless Capsule Endoscopy Analysis Type Journal Article
  Year 2016 Publication Computers in Biology and Medicine Abbreviated Journal CBM  
  Volume 79 Issue Pages 163-172  
  Keywords (down) Wireless capsule endoscopy; Deep learning; Feature learning; Motility analysis  
  Abstract The interpretation and analysis of wireless capsule endoscopy (WCE) recordings is a complex task which requires sophisticated computer aided decision (CAD) systems to help physicians with video screening and, finally, with the diagnosis. Most CAD systems used in capsule endoscopy share a common system design, but use very different image and video representations. As a result, each time a new clinical application of WCE appears, a new CAD system has to be designed from the scratch. This makes the design of new CAD systems very time consuming. Therefore, in this paper we introduce a system for small intestine motility characterization, based on Deep Convolutional Neural Networks, which circumvents the laborious step of designing specific features for individual motility events. Experimental results show the superiority of the learned features over alternative classifiers constructed using state-of-the-art handcrafted features. In particular, it reaches a mean classification accuracy of 96% for six intestinal motility events, outperforming the other classifiers by a large margin (a 14% relative performance increase).  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes OR; MILAB;MV; Approved no  
  Call Number Admin @ si @ SDP2016 Serial 2836  
Permanent link to this record
 

 
Author Saad Minhas; Zeba Khanam; Shoaib Ehsan; Klaus McDonald Maier; Aura Hernandez-Sabate edit  doi
openurl 
  Title Weather Classification by Utilizing Synthetic Data Type Journal Article
  Year 2022 Publication Sensors Abbreviated Journal SENS  
  Volume 22 Issue 9 Pages 3193  
  Keywords (down) Weather classification; synthetic data; dataset; autonomous car; computer vision; advanced driver assistance systems; deep learning; intelligent transportation systems  
  Abstract Weather prediction from real-world images can be termed a complex task when targeting classification using neural networks. Moreover, the number of images throughout the available datasets can contain a huge amount of variance when comparing locations with the weather those images are representing. In this article, the capabilities of a custom built driver simulator are explored specifically to simulate a wide range of weather conditions. Moreover, the performance of a new synthetic dataset generated by the above simulator is also assessed. The results indicate that the use of synthetic datasets in conjunction with real-world datasets can increase the training efficiency of the CNNs by as much as 74%. The article paves a way forward to tackle the persistent problem of bias in vision-based datasets.  
  Address 21 April 2022  
  Corporate Author Thesis  
  Publisher MDPI Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes IAM; 600.139; 600.159; 600.166; 600.145; Approved no  
  Call Number Admin @ si @ MKE2022 Serial 3761  
Permanent link to this record
 

 
Author Xavier Otazu; C. Alejandro Parraga; Maria Vanrell edit  url
doi  openurl
  Title Towards a unified chromatic inducction model Type Journal Article
  Year 2010 Publication Journal of Vision Abbreviated Journal VSS  
  Volume 10 Issue 12:5 Pages 1-24  
  Keywords (down) Visual system; Color induction; Wavelet transform  
  Abstract In a previous work (X. Otazu, M. Vanrell, & C. A. Párraga, 2008b), we showed how several brightness induction effects can be predicted using a simple multiresolution wavelet model (BIWaM). Here we present a new model for chromatic induction processes (termed Chromatic Induction Wavelet Model or CIWaM), which is also implemented on a multiresolution framework and based on similar assumptions related to the spatial frequency and the contrast surround energy of the stimulus. The CIWaM can be interpreted as a very simple extension of the BIWaM to the chromatic channels, which in our case are defined in the MacLeod-Boynton (lsY) color space. This new model allows us to unify both chromatic assimilation and chromatic contrast effects in a single mathematical formulation. The predictions of the CIWaM were tested by means of several color and brightness induction experiments, which showed an acceptable agreement between model predictions and psychophysical data.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes CIC Approved no  
  Call Number CAT @ cat @ OPV2010 Serial 1450  
Permanent link to this record
 

 
Author Fadi Dornaika; Abdelmalik Moujahid; Bogdan Raducanu edit   pdf
url  doi
openurl 
  Title Facial expression recognition using tracked facial actions: Classifier performance analysis Type Journal Article
  Year 2013 Publication Engineering Applications of Artificial Intelligence Abbreviated Journal EAAI  
  Volume 26 Issue 1 Pages 467-477  
  Keywords (down) Visual face tracking; 3D deformable models; Facial actions; Dynamic facial expression recognition; Human–computer interaction  
  Abstract In this paper, we address the analysis and recognition of facial expressions in continuous videos. More precisely, we study classifiers performance that exploit head pose independent temporal facial action parameters. These are provided by an appearance-based 3D face tracker that simultaneously provides the 3D head pose and facial actions. The use of such tracker makes the recognition pose- and texture-independent. Two different schemes are studied. The first scheme adopts a dynamic time warping technique for recognizing expressions where training data are given by temporal signatures associated with different universal facial expressions. The second scheme models temporal signatures associated with facial actions with fixed length feature vectors (observations), and uses some machine learning algorithms in order to recognize the displayed expression. Experiments quantified the performance of different schemes. These were carried out on CMU video sequences and home-made video sequences. The results show that the use of dimension reduction techniques on the extracted time series can improve the classification performance. Moreover, these experiments show that the best recognition rate can be above 90%.  
  Address  
  Corporate Author Thesis  
  Publisher Elsevier Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes OR; 600.046;MV Approved no  
  Call Number Admin @ si @ DMR2013 Serial 2185  
Permanent link to this record
 

 
Author Albert Gordo; Florent Perronnin; Ernest Valveny edit   pdf
url  doi
openurl 
  Title Large-scale document image retrieval and classification with runlength histograms and binary embeddings Type Journal Article
  Year 2013 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 46 Issue 7 Pages 1898-1905  
  Keywords (down) visual document descriptor; compression; large-scale; retrieval; classification  
  Abstract We present a new document image descriptor based on multi-scale runlength
histograms. This descriptor does not rely on layout analysis and can be
computed efficiently. We show how this descriptor can achieve state-of-theart
results on two very different public datasets in classification and retrieval
tasks. Moreover, we show how we can compress and binarize these descriptors
to make them suitable for large-scale applications. We can achieve state-ofthe-
art results in classification using binary descriptors of as few as 16 to 64
bits.
 
  Address  
  Corporate Author Thesis  
  Publisher Elsevier Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 0031-3203 ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.042; 600.045; 605.203 Approved no  
  Call Number Admin @ si @ GPV2013 Serial 2306  
Permanent link to this record
 

 
Author David Berga; Xose R. Fernandez-Vidal; Xavier Otazu; V. Leboran; Xose M. Pardo edit   pdf
url  openurl
  Title Psychophysical evaluation of individual low-level feature influences on visual attention Type Journal Article
  Year 2019 Publication Vision Research Abbreviated Journal VR  
  Volume 154 Issue Pages 60-79  
  Keywords (down) Visual attention; Psychophysics; Saliency; Task; Context; Contrast; Center bias; Low-level; Synthetic; Dataset  
  Abstract In this study we provide the analysis of eye movement behavior elicited by low-level feature distinctiveness with a dataset of synthetically-generated image patterns. Design of visual stimuli was inspired by the ones used in previous psychophysical experiments, namely in free-viewing and visual searching tasks, to provide a total of 15 types of stimuli, divided according to the task and feature to be analyzed. Our interest is to analyze the influences of low-level feature contrast between a salient region and the rest of distractors, providing fixation localization characteristics and reaction time of landing inside the salient region. Eye-tracking data was collected from 34 participants during the viewing of a 230 images dataset. Results show that saliency is predominantly and distinctively influenced by: 1. feature type, 2. feature contrast, 3. temporality of fixations, 4. task difficulty and 5. center bias. This experimentation proposes a new psychophysical basis for saliency model evaluation using synthetic images.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes NEUROBIT; 600.128; 600.120 Approved no  
  Call Number Admin @ si @ BFO2019a Serial 3274  
Permanent link to this record
 

 
Author Antonio Clavelli; Dimosthenis Karatzas; Josep Llados; Mario Ferraro; Giuseppe Boccignone edit   pdf
doi  openurl
  Title Modelling task-dependent eye guidance to objects in pictures Type Journal Article
  Year 2014 Publication Cognitive Computation Abbreviated Journal CoCom  
  Volume 6 Issue 3 Pages 558-584  
  Keywords (down) Visual attention; Gaze guidance; Value; Payoff; Stochastic fixation prediction  
  Abstract 5Y Impact Factor: 1.14 / 3rd (Computer Science, Artificial Intelligence)
We introduce a model of attentional eye guidance based on the rationale that the deployment of gaze is to be considered in the context of a general action-perception loop relying on two strictly intertwined processes: sensory processing, depending on current gaze position, identifies sources of information that are most valuable under the given task; motor processing links such information with the oculomotor act by sampling the next gaze position and thus performing the gaze shift. In such a framework, the choice of where to look next is task-dependent and oriented to classes of objects embedded within pictures of complex scenes. The dependence on task is taken into account by exploiting the value and the payoff of gazing at certain image patches or proto-objects that provide a sparse representation of the scene objects. The different levels of the action-perception loop are represented in probabilistic form and eventually give rise to a stochastic process that generates the gaze sequence. This way the model also accounts for statistical properties of gaze shifts such as individual scan path variability. Results of the simulations are compared either with experimental data derived from publicly available datasets and from our own experiments.
 
  Address  
  Corporate Author Thesis  
  Publisher Springer US Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1866-9956 ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.056; 600.045; 605.203; 601.212; 600.077 Approved no  
  Call Number Admin @ si @ CKL2014 Serial 2419  
Permanent link to this record
 

 
Author Juan Borrego-Carazo; Carles Sanchez; David Castells; Jordi Carrabina; Debora Gil edit   pdf
doi  openurl
  Title BronchoPose: an analysis of data and model configuration for vision-based bronchoscopy pose estimation Type Journal Article
  Year 2023 Publication Computer Methods and Programs in Biomedicine Abbreviated Journal CMPB  
  Volume 228 Issue Pages 107241  
  Keywords (down) Videobronchoscopy guiding; Deep learning; Architecture optimization; Datasets; Standardized evaluation framework; Pose estimation  
  Abstract Vision-based bronchoscopy (VB) models require the registration of the virtual lung model with the frames from the video bronchoscopy to provide effective guidance during the biopsy. The registration can be achieved by either tracking the position and orientation of the bronchoscopy camera or by calibrating its deviation from the pose (position and orientation) simulated in the virtual lung model. Recent advances in neural networks and temporal image processing have provided new opportunities for guided bronchoscopy. However, such progress has been hindered by the lack of comparative experimental conditions.
In the present paper, we share a novel synthetic dataset allowing for a fair comparison of methods. Moreover, this paper investigates several neural network architectures for the learning of temporal information at different levels of subject personalization. In order to improve orientation measurement, we also present a standardized comparison framework and a novel metric for camera orientation learning. Results on the dataset show that the proposed metric and architectures, as well as the standardized conditions, provide notable improvements to current state-of-the-art camera pose estimation in video bronchoscopy.
 
  Address  
  Corporate Author Thesis  
  Publisher Elsevier Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes IAM; Approved no  
  Call Number Admin @ si @ BSC2023 Serial 3702  
Permanent link to this record
 

 
Author Pejman Rasti; Salma Samiei; Mary Agoyi; Sergio Escalera; Gholamreza Anbarjafari edit   pdf
doi  openurl
  Title Robust non-blind color video watermarking using QR decomposition and entropy analysis Type Journal Article
  Year 2016 Publication Journal of Visual Communication and Image Representation Abbreviated Journal JVCIR  
  Volume 38 Issue Pages 838-847  
  Keywords (down) Video watermarking; QR decomposition; Discrete Wavelet Transformation; Chirp Z-transform; Singular value decomposition; Orthogonal–triangular decomposition  
  Abstract Issues such as content identification, document and image security, audience measurement, ownership and copyright among others can be settled by the use of digital watermarking. Many recent video watermarking methods show drops in visual quality of the sequences. The present work addresses the aforementioned issue by introducing a robust and imperceptible non-blind color video frame watermarking algorithm. The method divides frames into moving and non-moving parts. The non-moving part of each color channel is processed separately using a block-based watermarking scheme. Blocks with an entropy lower than the average entropy of all blocks are subject to a further process for embedding the watermark image. Finally a watermarked frame is generated by adding moving parts to it. Several signal processing attacks are applied to each watermarked frame in order to perform experiments and are compared with some recent algorithms. Experimental results show that the proposed scheme is imperceptible and robust against common signal processing attacks.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA;MILAB; Approved no  
  Call Number Admin @ si @RSA2016 Serial 2766  
Permanent link to this record
 

 
Author Joan Serrat; Ferran Diego; Felipe Lumbreras; Jose Manuel Alvarez; Antonio Lopez; C. Elvira edit   pdf
doi  openurl
  Title Dynamic Comparison of Headlights Type Journal Article
  Year 2008 Publication Journal of Automobile Engineering Abbreviated Journal  
  Volume 222 Issue 5 Pages 643–656  
  Keywords (down) video alignment  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number ADAS @ adas @ SDL2008a Serial 958  
Permanent link to this record
 

 
Author Ferran Diego; Daniel Ponsa; Joan Serrat; Antonio Lopez edit   pdf
openurl 
  Title Video Alignment for Change Detection Type Journal Article
  Year 2011 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume 20 Issue 7 Pages 1858-1869  
  Keywords (down) video alignment  
  Abstract In this work, we address the problem of aligning two video sequences. Such alignment refers to synchronization, i.e., the establishment of temporal correspondence between frames of the first and second video, followed by spatial registration of all the temporally corresponding frames. Video synchronization and alignment have been attempted before, but most often in the relatively simple cases of fixed or rigidly attached cameras and simultaneous acquisition. In addition, restrictive assumptions have been applied, including linear time correspondence or the knowledge of the complete trajectories of corresponding scene points; to some extent, these assumptions limit the practical applicability of any solutions developed. We intend to solve the more general problem of aligning video sequences recorded by independently moving cameras that follow similar trajectories, based only on the fusion of image intensity and GPS information. The novelty of our approach is to pose the synchronization as a MAP inference problem on a Bayesian network including the observations from these two sensor types, which have been proved complementary. Alignment results are presented in the context of videos recorded from vehicles driving along the same track at different times, for different road types. In addition, we explore two applications of the proposed video alignment method, both based on change detection between aligned videos. One is the detection of vehicles, which could be of use in ADAS. The other is online difference spotting videos of surveillance rounds.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; IF Approved no  
  Call Number DPS 2011; ADAS @ adas @ dps2011 Serial 1705  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: