toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author Marco Pedersoli edit  openurl
  Title (down) Hierarchical Multiresolution Models for fast Object Detection Type Book Whole
  Year 2012 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The ability to automatically detect and recognize objects in unconstrained images is becoming more and more critical: from security systems and autonomous robots, to smart phones and augmented reality, intelligent devices need to understand the meaning of images as a composition of semantic objects. This Thesis tackles the problem of fast object detection based on template models. Detection consists of searching for an object in an image by evaluating the similarity between a template model and an image region at each possible location and scale. In this work, we show that using a template model representation based on a multiple resolution hierarchy is an optimal choice that can lead to excellent detection accuracy and fast computation. We implement two different approaches that make use of a hierarchy of multiresolution models: a multiresolution cascade and a coarse-to-fine search. Also, we extend the coarse-to-fine search by introducing a deformable part-based model that achieves state-of-the-art results together with a very reduced computational cost. Finally, we specialize our approach to the challenging task of pedestrian detection from moving vehicles and show that the overall quality of the system outperforms previous works in terms of speed and accuracy.  
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Jordi Gonzalez;Xavier Roca  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number Admin @ si @ Ped2012 Serial 2203  
Permanent link to this record
 

 
Author Jean-Marc Ogier; Wenyin Liu; Josep Llados (eds) edit  isbn
openurl 
  Title (down) Graphics Recognition: Achievements, Challenges, and Evolution Type Book Whole
  Year 2010 Publication 8th International Workshop GREC 2009. Abbreviated Journal  
  Volume 6020 Issue Pages  
  Keywords  
  Abstract  
  Address La Rochelle  
  Corporate Author Thesis  
  Publisher Springer Link Place of Publication Editor Jean-Marc Ogier; Wenyin Liu; Josep Llados  
  Language Summary Language Original Title  
  Series Editor Series Title Lecture Notes in Computer Science Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-642-13727-3 Medium  
  Area Expedition Conference GREC  
  Notes DAG Approved no  
  Call Number Admin @ si @ OLL2010 Serial 1976  
Permanent link to this record
 

 
Author W. Liu; Josep Llados edit  openurl
  Title (down) Graphics Recognition. Ten Years Review and Future Perspectives Type Book Whole
  Year 2006 Publication 6th International Workshop Abbreviated Journal  
  Volume 3926 Issue Pages  
  Keywords  
  Abstract  
  Address Hong Kong (China)  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference GREC  
  Notes DAG Approved no  
  Call Number DAG @ dag @ LiL2006 Serial 800  
Permanent link to this record
 

 
Author Liu Wenyin; Josep Llados; Jean-Marc Ogier edit  isbn
openurl 
  Title (down) Graphics Recognition. Recent Advances and New Opportunities. Type Book Whole
  Year 2008 Publication 7th International Workshop, Selected Papers, Abbreviated Journal  
  Volume 5046 Issue Pages  
  Keywords  
  Abstract  
  Address Curitiba (Brazil)  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-540-88184-1 Medium  
  Area Expedition Conference GREC  
  Notes DAG Approved no  
  Call Number DAG @ dag @ WLO2008 Serial 1012  
Permanent link to this record
 

 
Author Alicia Fornes; Bart Lamiroy edit  url
isbn  openurl
  Title (down) Graphics Recognition, Current Trends and Evolutions Type Book Whole
  Year 2018 Publication Graphics Recognition, Current Trends and Evolutions Abbreviated Journal  
  Volume 11009 Issue Pages  
  Keywords  
  Abstract This book constitutes the thoroughly refereed post-conference proceedings of the 12th International Workshop on Graphics Recognition, GREC 2017, held in Kyoto, Japan, in November 2017.
The 10 revised full papers presented were carefully reviewed and selected from 14 initial submissions. They contain both classical and emerging topics of graphics rcognition, namely analysis and detection of diagrams, search and classification, optical music recognition, interpretation of engineering drawings and maps.
 
  Address  
  Corporate Author Thesis  
  Publisher Springer International Publishing Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-030-02283-9 Medium  
  Area Expedition Conference  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ FoL2018 Serial 3171  
Permanent link to this record
 

 
Author Chenshen Wu edit  isbn
openurl 
  Title (down) Going beyond Classification Problems for the Continual Learning of Deep Neural Networks Type Book Whole
  Year 2023 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Deep learning has made tremendous progress in the last decade due to the explosion of training data and computational power. Through end-to-end training on a
large dataset, image representations are more discriminative than the previously
used hand-crafted features. However, for many real-world applications, training
and testing on a single dataset is not realistic, as the test distribution may change over time. Continuous learning takes this situation into account, where the learner must adapt to a sequence of tasks, each with a different distribution. If you would naively continue training the model with a new task, the performance of the model would drop dramatically for the previously learned data. This phenomenon is known as catastrophic forgetting.
Many approaches have been proposed to address this problem, which can be divided into three main categories: regularization-based approaches, rehearsal-based
approaches, and parameter isolation-based approaches. However, most of the existing works focus on image classification tasks and many other computer vision tasks
have not been well-explored in the continual learning setting. Therefore, in this
thesis, we study continual learning for image generation, object re-identification,
and object counting.
For the image generation problem, since the model can generate images from the previously learned task, it is free to apply rehearsal without any limitation. We developed two methods based on generative replay. The first one uses the generated image for joint training together with the new data. The second one is based on
output pixel-wise alignment. We extensively evaluate these methods on several
benchmarks.
Next, we study continual learning for object Re-Identification (ReID). Although
most state-of-the-art methods of ReID and continual ReID use softmax-triplet loss,
we found that it is better to solve the ReID problem from a meta-learning perspective because continual learning of reID can benefit a lot from the generalization of metalearning. We also propose a distillation loss and found that the removal of the positive pairs before the distillation loss is critical.
Finally, we study continual learning for the counting problem. We study the mainstream method based on density maps and propose a new approach for density
map distillation. We found that fixing the counter head is crucial for the continual learning of object counting. To further improve results, we propose an adaptor to adapt the changing feature extractor for the fixed counter head. Extensive evaluation shows that this results in improved continual learning performance.
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher IMPRIMA Place of Publication Editor Joost Van de Weijer;Bogdan Raducanu  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-126409-0-8 Medium  
  Area Expedition Conference  
  Notes LAMP Approved no  
  Call Number Admin @ si @ Wu2023 Serial 3960  
Permanent link to this record
 

 
Author Debora Gil edit   pdf
isbn  openurl
  Title (down) Geometric Differential Operators for Shape Modelling Type Book Whole
  Year 2004 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Medical imaging feeds research in many computer vision and image processing fields: image filtering, segmentation, shape recovery, registration, retrieval and pattern matching. Because of their low contrast changes and large variety of artifacts and noise, medical imaging processing techniques relying on an analysis of the geometry of image level sets rather than on intensity values result in more robust treatment. From the starting point of treatment of intravascular images, this PhD thesis ad- dresses the design of differential image operators based on geometric principles for a robust shape modelling and restoration. Among all fields applying shape recovery, we approach filtering and segmentation of image objects. For a successful use in real images, the segmentation process should go through three stages: noise removing, shape modelling and shape recovery. This PhD addresses all three topics, but for the sake of algorithms as automated as possible, techniques for image processing will be designed to satisfy three main principles: a) convergence of the iterative schemes to non-trivial states avoiding image degeneration to a constant image and representing smooth models of the originals; b) smooth asymptotic behav- ior ensuring stabilization of the iterative process; c) fixed parameter values ensuring equal (domain free) performance of the algorithms whatever initial images/shapes. Our geometric approach to the generic equations that model the different processes approached enables defining techniques satisfying all the former requirements. First, we introduce a new curvature-based geometric flow for image filtering achieving a good compromise between noise removing and resemblance to original images. Sec- ond, we describe a new family of diffusion operators that restrict their scope to image level curves and serve to restore smooth closed models from unconnected sets of points. Finally, we design a regularization of snake (distance) maps that ensures its smooth convergence towards any closed shape. Experiments show that performance of the techniques proposed overpasses that of state-of-the-art algorithms.  
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Barcelona (Spain) Editor Jordi Saludes i Closa;Petia Radeva  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 84-933652-0-3 Medium prit  
  Area Expedition Conference  
  Notes IAM; Approved no  
  Call Number IAM @ iam @ GIL2004 Serial 1517  
Permanent link to this record
 

 
Author Edgar Riba edit  openurl
  Title (down) Geometric Computer Vision Techniques for Scene Reconstruction Type Book Whole
  Year 2021 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract From the early stages of Computer Vision, scene reconstruction has been one of the most studied topics leading to a wide variety of new discoveries and applications. Object grasping and manipulation, localization and mapping, or even visual effect generation are different examples of applications in which scene reconstruction has taken an important role for industries such as robotics, factory automation, or audio visual production. However, scene reconstruction is an extensive topic that can be approached in many different ways with already existing solutions that effectively work in controlled environments. Formally, the problem of scene reconstruction can be formulated as a sequence of independent processes which compose a pipeline. In this thesis, we analyse some parts of the reconstruction pipeline from which we contribute with novel methods using Convolutional Neural Networks (CNN) proposing innovative solutions that consider the optimisation of the methods in an end-to-end fashion. First, we review the state of the art of classical local features detectors and descriptors and contribute with two novel methods that inherently improve pre-existing solutions in the scene reconstruction pipeline.

It is a fact that computer science and software engineering are two fields that usually go hand in hand and evolve according to mutual needs making easier the design of complex and efficient algorithms. For this reason, we contribute with Kornia, a library specifically designed to work with classical computer vision techniques along with deep neural networks. In essence, we created a framework that eases the design of complex pipelines for computer vision algorithms so that can be included within neural networks and be used to backpropagate gradients throw a common optimisation framework. Finally, in the last chapter of this thesis we develop the aforementioned concept of designing end-to-end systems with classical projective geometry. Thus, we contribute with a solution to the problem of synthetic view generation by hallucinating novel views from high deformable cloths objects using a geometry aware end-to-end system. To summarize, in this thesis we demonstrate that with a proper design that combine classical geometric computer vision methods with deep learning techniques can lead to improve pre-existing solutions for the problem of scene reconstruction.
 
  Address February 2021  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Place of Publication Editor Daniel Ponsa  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MSIAU Approved no  
  Call Number Admin @ si @ Rib2021 Serial 3610  
Permanent link to this record
 

 
Author Marçal Rusiñol edit  openurl
  Title (down) Geometric and Structural-based Symbol Spotting. Application to Focused Retrieval in Graphic Document Collections Type Book Whole
  Year 2009 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Usually, pattern recognition systems consist of two main parts. On the one hand, the data acquisition and, on the other hand, the classification of this data on a certain category. In order to recognize which category a certain query element belongs to, a set of pattern models must be provided beforehand. An off-line learning stage is needed to train the classifier and to offer a robust classification of the patterns. Within the pattern recognition field, we are interested in the recognition of graphics and, in particular, on the analysis of documents rich in graphical information. In this context, one of the main concerns is to see if the proposed systems remain scalable with respect to the data volume so as it can handle growing amounts of symbol models. In order to avoid to work with a database of reference symbols, symbol spotting and on-the-fly symbol recognition methods have been introduced in the past years.

Generally speaking, the symbol spotting problem can be defined as the identification of a set of regions of interest from a document image which are likely to contain an instance of a certain queriedn symbol without explicitly applying the whole pattern recognition scheme. Our application framework consists on indexing a collection of graphic-rich document images. This collection is
queried by example with a single instance of the symbol to look for and, by means of symbol spotting methods we retrieve the regions of interest where the symbol is likely to appear within the documents. This kind of applications are known as focused retrieval methods.

In order that the focused retrieval application can handle large collections of documents there is a need to provide an efficient access to the large volume of information that might be stored. We use indexing strategies in order to efficiently retrieve by similarity the locations where a certain part of the symbol appears. In that scenario, graphical patterns should be used as indices for accessing and navigating the collection of documents.
These indexing mechanism allow the user to search for similar elements using graphical information rather than textual queries.

Along this thesis we present a spotting architecture and different methods aiming to build a complete focused retrieval application dealing with a graphic-rich document collections. In addition, a protocol to evaluate the performance of symbol
spotting systems in terms of recognition abilities, location accuracy and scalability is proposed.
 
  Address Barcelona (Spain)  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Josep Llados  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number DAG @ dag @ Rus2009 Serial 1264  
Permanent link to this record
 

 
Author Dena Bazazian edit  isbn
openurl 
  Title (down) Fully Convolutional Networks for Text Understanding in Scene Images Type Book Whole
  Year 2018 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Text understanding in scene images has gained plenty of attention in the computer vision community and it is an important task in many applications as text carries semantically rich information about scene content and context. For instance, reading text in a scene can be applied to autonomous driving, scene understanding or assisting visually impaired people. The general aim of scene text understanding is to localize and recognize text in scene images. Text regions are first localized in the original image by a trained detector model and afterwards fed into a recognition module. The tasks of localization and recognition are highly correlated since an inaccurate localization can affect the recognition task.
The main purpose of this thesis is to devise efficient methods for scene text understanding. We investigate how the latest results on deep learning can advance text understanding pipelines. Recently, Fully Convolutional Networks (FCNs) and derived methods have achieved a significant performance on semantic segmentation and pixel level classification tasks. Therefore, we took benefit of the strengths of FCN approaches in order to detect text in natural scenes. In this thesis we have focused on two challenging tasks of scene text understanding which are Text Detection and Word Spotting. For the task of text detection, we have proposed an efficient text proposal technique in scene images. We have considered the Text Proposals method as the baseline which is an approach to reduce the search space of possible text regions in an image. In order to improve the Text Proposals method we combined it with Fully Convolutional Networks to efficiently reduce the number of proposals while maintaining the same level of accuracy and thus gaining a significant speed up. Our experiments demonstrate that this text proposal approach yields significantly higher recall rates than the line based text localization techniques, while also producing better-quality localization. We have also applied this technique on compressed images such as videos from wearable egocentric cameras. For the task of word spotting, we have introduced a novel mid-level word representation method. We have proposed a technique to create and exploit an intermediate representation of images based on text attributes which roughly correspond to character probability maps. Our representation extends the concept of Pyramidal Histogram Of Characters (PHOC) by exploiting Fully Convolutional Networks to derive a pixel-wise mapping of the character distribution within candidate word regions. We call this representation the Soft-PHOC. Furthermore, we show how to use Soft-PHOC descriptors for word spotting tasks through an efficient text line proposal algorithm. To evaluate the detected text, we propose a novel line based evaluation along with the classic bounding box based approach. We test our method on incidental scene text images which comprises real-life scenarios such as urban scenes. The importance of incidental scene text images is due to the complexity of backgrounds, perspective, variety of script and language, short text and little linguistic context. All of these factors together makes the incidental scene text images challenging.
 
  Address November 2018  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Dimosthenis Karatzas;Andrew Bagdanov  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-948531-1-1 Medium  
  Area Expedition Conference  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ Baz2018 Serial 3220  
Permanent link to this record
 

 
Author Utkarsh Porwal; Alicia Fornes; Faisal Shafait (eds) edit  doi
isbn  openurl
  Title (down) Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition. 18th International Conference, ICFHR 2022 Type Book Whole
  Year 2022 Publication Frontiers in Handwriting Recognition. Abbreviated Journal  
  Volume 13639 Issue Pages  
  Keywords  
  Abstract  
  Address ICFHR 2022, Hyderabad, India, December 4–7, 2022  
  Corporate Author Thesis  
  Publisher Springer Place of Publication Editor Utkarsh Porwal; Alicia Fornes; Faisal Shafait  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-031-21648-0 Medium  
  Area Expedition Conference ICFHR  
  Notes DAG Approved no  
  Call Number Admin @ si @ PFS2022 Serial 3809  
Permanent link to this record
 

 
Author Antonio Hernandez edit  isbn
openurl 
  Title (down) From pixels to gestures: learning visual representations for human analysis in color and depth data sequences Type Book Whole
  Year 2015 Publication PhD Thesis, Universitat de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The visual analysis of humans from images is an important topic of interest due to its relevance to many computer vision applications like pedestrian detection, monitoring and surveillance, human-computer interaction, e-health or content-based image retrieval, among others.

In this dissertation we are interested in learning different visual representations of the human body that are helpful for the visual analysis of humans in images and video sequences. To that end, we analyze both RGB and depth image modalities and address the problem from three different research lines, at different levels of abstraction; from pixels to gestures: human segmentation, human pose estimation and gesture recognition.

First, we show how binary segmentation (object vs. background) of the human body in image sequences is helpful to remove all the background clutter present in the scene. The presented method, based on Graph cuts optimization, enforces spatio-temporal consistency of the produced segmentation masks among consecutive frames. Secondly, we present a framework for multi-label segmentation for obtaining much more detailed segmentation masks: instead of just obtaining a binary representation separating the human body from the background, finer segmentation masks can be obtained separating the different body parts.

At a higher level of abstraction, we aim for a simpler yet descriptive representation of the human body. Human pose estimation methods usually rely on skeletal models of the human body, formed by segments (or rectangles) that represent the body limbs, appropriately connected following the kinematic constraints of the human body. In practice, such skeletal models must fulfill some constraints in order to allow for efficient inference, while actually limiting the expressiveness of the model. In order to cope with this, we introduce a top-down approach for predicting the position of the body parts in the model, using a mid-level part representation based on Poselets.

Finally, we propose a framework for gesture recognition based on the bag of visual words framework. We leverage the benefits of RGB and depth image modalities by combining modality-specific visual vocabularies in a late fusion fashion. A new rotation-variant depth descriptor is presented, yielding better results than other state-of-the-art descriptors. Moreover, spatio-temporal pyramids are used to encode rough spatial and temporal structure. In addition, we present a probabilistic reformulation of Dynamic Time Warping for gesture segmentation in video sequences. A Gaussian-based probabilistic model of a gesture is learnt, implicitly encoding possible deformations in both spatial and time domains.
 
  Address January 2015  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Sergio Escalera;Stan Sclaroff  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-940902-0-2 Medium  
  Area Expedition Conference  
  Notes HuPBA;MILAB Approved no  
  Call Number Admin @ si @ Her2015 Serial 2576  
Permanent link to this record
 

 
Author Ivan Huerta edit  isbn
openurl 
  Title (down) Foreground Object Segmentation and Shadow Detection for Video Sequences in Uncontrolled Environments Type Book Whole
  Year 2010 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract This Thesis is mainly divided in two parts. The first one presents a study of motion
segmentation problems. Based on this study, a novel algorithm for mobile-object
segmentation from a static background scene is also presented. This approach is
demonstrated robust and accurate under most of the common problems in motion
segmentation. The second one tackles the problem of shadows in depth. Firstly, a
bottom-up approach based on a chromatic shadow detector is presented to deal with
umbra shadows. Secondly, a top-down approach based on a tracking system has been
developed in order to enhance the chromatic shadow detection.
In our first contribution, a case analysis of motion segmentation problems is presented by taking into account the problems associated with different cues, namely
colour, edge and intensity. Our second contribution is a hybrid architecture which
handles the main problems observed in such a case analysis, by fusing (i) the knowledge from these three cues and (ii) a temporal difference algorithm. On the one hand,
we enhance the colour and edge models to solve both global/local illumination changes
(shadows and highlights) and camouflage in intensity. In addition, local information is
exploited to cope with a very challenging problem such as the camouflage in chroma.
On the other hand, the intensity cue is also applied when colour and edge cues are not
available, such as when beyond the dynamic range. Additionally, temporal difference
is included to segment motion when these three cues are not available, such as that
background not visible during the training period. Lastly, the approach is enhanced
for allowing ghost detection. As a result, our approach obtains very accurate and robust motion segmentation in both indoor and outdoor scenarios, as quantitatively and
qualitatively demonstrated in the experimental results, by comparing our approach
with most best-known state-of-the-art approaches.
Motion Segmentation has to deal with shadows to avoid distortions when detecting
moving objects. Most segmentation approaches dealing with shadow detection are
typically restricted to penumbra shadows. Therefore, such techniques cannot cope
well with umbra shadows. Consequently, umbra shadows are usually detected as part
of moving objects.
Firstly, a bottom-up approach for detection and removal of chromatic moving
shadows in surveillance scenarios is proposed. Secondly, a top-down approach based
on kalman filters to detect and track shadows has been developed in order to enhance
the chromatic shadow detection. In the Bottom-up part, the shadow detection approach applies a novel technique based on gradient and colour models for separating
chromatic moving shadows from moving objects.
Well-known colour and gradient models are extended and improved into an invariant colour cone model and an invariant gradient model, respectively, to perform
automatic segmentation while detecting potential shadows. Hereafter, the regions corresponding to potential shadows are grouped by considering ”a bluish effect” and an
edge partitioning. Lastly, (i) temporal similarities between local gradient structures
and (ii) spatial similarities between chrominance angle and brightness distortions are
analysed for all potential shadow regions in order to finally identify umbra shadows.
In the top-down process, after detection of objects and shadows both are tracked
using Kalman filters, in order to enhance the chromatic shadow detection, when it
fails to detect a shadow. Firstly, this implies a data association between the blobs
(foreground and shadow) and Kalman filters. Secondly, an event analysis of the different data association cases is performed, and occlusion handling is managed by a
Probabilistic Appearance Model (PAM). Based on this association, temporal consistency is looked for the association between foregrounds and shadows and their
respective Kalman Filters. From this association several cases are studied, as a result
lost chromatic shadows are correctly detected. Finally, the tracking results are used
as feedback to improve the shadow and object detection.
Unlike other approaches, our method does not make any a-priori assumptions
about camera location, surface geometries, surface textures, shapes and types of
shadows, objects, and background. Experimental results show the performance and
accuracy of our approach in different shadowed materials and illumination conditions.
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Jordi Gonzalez;Xavier Roca  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-937261-3-3 Medium  
  Area Expedition Conference  
  Notes Approved no  
  Call Number ISE @ ise @ Hue2010 Serial 1332  
Permanent link to this record
 

 
Author Hongxing Gao edit  isbn
openurl 
  Title (down) Focused Structural Document Image Retrieval in Digital Mailroom Applications Type Book Whole
  Year 2015 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract In this work, we develop a generic framework that is able to handle the document retrieval problem in various scenarios such as searching for full page matches or retrieving the counterparts for specific document areas, focusing on their structural similarity or letting their visual resemblance to play a dominant role. Based on the spatial indexing technique, we propose to search for matches of local key-region pairs carrying both structural and visual information from the collection while a scheme allowing to adjust the relative contribution of structural and visual similarity is presented.
Based on the fact that the structure of documents is tightly linked with the distance among their elements, we firstly introduce an efficient detector named Distance Transform based Maximally Stable Extremal Regions (DTMSER). We illustrate that this detector is able to efficiently extract the structure of a document image as a dendrogram (hierarchical tree) of multi-scale key-regions that roughly correspond to letters, words and paragraphs. We demonstrate that, without benefiting from the structure information, the key-regions extracted by the DTMSER algorithm achieve better results comparing with state-of-the-art methods while much less amount of key-regions are employed.
We subsequently propose a pair-wise Bag of Words (BoW) framework to efficiently embed the explicit structure extracted by the DTMSER algorithm. We represent each document as a list of key-region pairs that correspond to the edges in the dendrogram where inclusion relationship is encoded. By employing those structural key-region pairs as the pooling elements for generating the histogram of features, the proposed method is able to encode the explicit inclusion relations into a BoW representation. The experimental results illustrate that the pair-wise BoW, powered by the embedded structural information, achieves remarkable improvement over the conventional BoW and spatial pyramidal BoW methods.
To handle various retrieval scenarios in one framework, we propose to directly query a series of key-region pairs, carrying both structure and visual information, from the collection. We introduce the spatial indexing techniques to the document retrieval community to speed up the structural relationship computation for key-region pairs. We firstly test the proposed framework in a full page retrieval scenario where structurally similar matches are expected. In this case, the pair-wise querying method achieves notable improvement over the BoW and spatial pyramidal BoW frameworks. Furthermore, we illustrate that the proposed method is also able to handle focused retrieval situations where the queries are defined as a specific interesting partial areas of the images. We examine our method on two types of focused queries: structure-focused and exact queries. The experimental results show that, the proposed generic framework obtains nearly perfect precision on both types of focused queries while it is the first framework able to tackle structure-focused queries, setting a new state of the art in the field.
Besides, we introduce a line verification method to check the spatial consistency among the matched key-region pairs. We propose a computationally efficient version of line verification through a two step implementation. We first compute tentative localizations of the query and subsequently employ them to divide the matched key-region pairs into several groups, then line verification is performed within each group while more precise bounding boxes are computed. We demonstrate that, comparing with the standard approach (based on RANSAC), the line verification proposed generally achieves much higher recall with slight loss on precision on specific queries.
 
  Address January 2015  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Josep Llados;Dimosthenis Karatzas;Marçal Rusiñol  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-943427-0-7 Medium  
  Area Expedition Conference  
  Notes DAG; 600.077 Approved no  
  Call Number Admin @ si @ Gao2015 Serial 2577  
Permanent link to this record
 

 
Author David Masip edit  isbn
openurl 
  Title (down) Face Classification Using Discriminative Features and Classifier Combination Type Book Whole
  Year 2005 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address CVC (UAB)  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Place of Publication Editor Jordi Vitria  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 84-933652-3-8 Medium  
  Area Expedition Conference  
  Notes OR;MV Approved no  
  Call Number Admin @ si @ Mas2005b Serial 602  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: