|   | 
Details
   web
Records
Author Angel Sappa (ed)
Title Computer Graphics and Imaging Type Book Whole
Year 2010 Publication Computer Graphics and Imaging Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor Angel Sappa
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (down) 978–0–88986–836–6 Medium
Area Expedition Conference CGIM
Notes ADAS Approved no
Call Number ADAS @ adas @ Sap2010 Serial 1468
Permanent link to this record
 

 
Author David Augusto Rojas; Joost Van de Weijer; Theo Gevers
Title Color Edge Saliency Boosting using Natural Image Statistics Type Conference Article
Year 2010 Publication 5th European Conference on Colour in Graphics, Imaging and Vision and 12th International Symposium on Multispectral Colour Science Abbreviated Journal
Volume Issue Pages 228–234
Keywords
Abstract State of the art methods for image matching, content-based retrieval and recognition use local features. Most of these still exploit only the luminance information for detection. The color saliency boosting algorithm has provided an efficient method to exploit the saliency of color edges based on information theory. However, during the design of this algorithm, some issues were not addressed in depth: (1) The method has ignored the underlying distribution of derivatives in natural images. (2) The dependence of information content in color-boosted edges on its spatial derivatives has not been quantitatively established. (3) To evaluate luminance and color contributions to saliency of edges, a parameter gradually balancing both contributions is required.
We introduce a novel algorithm, based on the principles of independent component analysis, which models the first order derivatives of color natural images by a generalized Gaussian distribution. Furthermore, using this probability model we show that for images with a Laplacian distribution, which is a particular case of generalized Gaussian distribution, the magnitudes of color-boosted edges reflect their corresponding information content. In order to evaluate the impact of color edge saliency in real world applications, we introduce an extension of the Laplacian-of-Gaussian detector to color, and the performance for image matching is evaluated. Our experiments show that our approach provides more discriminative regions in comparison with the original detector.
Address Joensuu, Finland
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (down) 9781617388897 Medium
Area Expedition Conference CGIV/MCS
Notes ISE Approved no
Call Number CAT @ cat @ RWG2010 Serial 1306
Permanent link to this record
 

 
Author Jaime Moreno; Xavier Otazu; Maria Vanrell
Title Local Perceptual Weighting in JPEG2000 for Color Images Type Conference Article
Year 2010 Publication 5th European Conference on Colour in Graphics, Imaging and Vision and 12th International Symposium on Multispectral Colour Science Abbreviated Journal
Volume Issue Pages 255–260
Keywords
Abstract The aim of this work is to explain how to apply perceptual concepts to define a perceptual pre-quantizer and to improve JPEG2000 compressor. The approach consists in quantizing wavelet transform coefficients using some of the human visual system behavior properties. Noise is fatal to image compression performance, because it can be both annoying for the observer and consumes excessive bandwidth when the imagery is transmitted. Perceptual pre-quantization reduces unperceivable details and thus improve both visual impression and transmission properties. The comparison between JPEG2000 without and with perceptual pre-quantization shows that the latter is not favorable in PSNR, but the recovered image is more compressed at the same or even better visual quality measured with a weighted PSNR. Perceptual criteria were taken from the CIWaM (Chromatic Induction Wavelet Model).
Address Joensuu, Finland
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (down) 9781617388897 Medium
Area Expedition Conference CGIV/MCS
Notes CIC Approved no
Call Number CAT @ cat @ MOV2010a Serial 1307
Permanent link to this record
 

 
Author C. Alejandro Parraga; Ramon Baldrich; Maria Vanrell
Title Accurate Mapping of Natural Scenes Radiance to Cone Activation Space: A New Image Dataset Type Conference Article
Year 2010 Publication 5th European Conference on Colour in Graphics, Imaging and Vision and 12th International Symposium on Multispectral Colour Science Abbreviated Journal
Volume Issue Pages 50–57
Keywords
Abstract The characterization of trichromatic cameras is usually done in terms of a device-independent color space, such as the CIE 1931 XYZ space. This is indeed convenient since it allows the testing of results against colorimetric measures. We have characterized our camera to represent human cone activation by mapping the camera sensor's (RGB) responses to human (LMS) through a polynomial transformation, which can be “customized” according to the types of scenes we want to represent. Here we present a method to test the accuracy of the camera measures and a study on how the choice of training reflectances for the polynomial may alter the results.
Address Joensuu, Finland
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (down) 9781617388897 Medium
Area Expedition Conference CGIV/MCS
Notes CIC Approved no
Call Number CAT @ cat @ PBV2010a Serial 1322
Permanent link to this record
 

 
Author Javier Vazquez; G. D. Finlayson; Maria Vanrell
Title A compact singularity function to predict WCS data and unique hues Type Conference Article
Year 2010 Publication 5th European Conference on Colour in Graphics, Imaging and Vision and 12th International Symposium on Multispectral Colour Science Abbreviated Journal
Volume Issue Pages 33–38
Keywords
Abstract Understanding how colour is used by the human vision system is a widely studied research field. The field, though quite advanced, still faces important unanswered questions. One of them is the explanation of the unique hues and the assignment of color names. This problem addresses the fact of different perceptual status for different colors.
Recently, Philipona and O'Regan have proposed a biological model that allows to extract the reflection properties of any surface independently of the lighting conditions. These invariant properties are the basis to compute a singularity index that predicts the asymmetries presented in unique hues and basic color categories psychophysical data, therefore is giving a further step in their explanation.

In this paper we build on their formulation and propose a new singularity index. This new formulation equally accounts for the location of the 4 peaks of the World colour survey and has two main advantages. First, it is a simple elegant numerical measure (the Philipona measurement is a rather cumbersome formula). Second, we develop a colour-based explanation for the measure.
Address Joensuu, Finland
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (down) 9781617388897 Medium
Area Expedition Conference CGIV/MCS
Notes CIC Approved no
Call Number CAT @ cat @ VFV2010 Serial 1324
Permanent link to this record
 

 
Author David Aldavert; Arnau Ramisa; Ramon Lopez de Mantaras; Ricardo Toledo
Title Real-time Object Segmentation using a Bag of Features Approach Type Conference Article
Year 2010 Publication 13th International Conference of the Catalan Association for Artificial Intelligence Abbreviated Journal
Volume 220 Issue Pages 321–329
Keywords Object Segmentation; Bag Of Features; Feature Quantization; Densely sampled descriptors
Abstract In this paper, we propose an object segmentation framework, based on the popular bag of features (BoF), which can process several images per second while achieving a good segmentation accuracy assigning an object category to every pixel of the image. We propose an efficient color descriptor to complement the information obtained by a typical gradient-based local descriptor. Results show that color proves to be a useful cue to increase the segmentation accuracy, specially in large homogeneous regions. Then, we extend the Hierarchical K-Means codebook using the recently proposed Vector of Locally Aggregated Descriptors method. Finally, we show that the BoF method can be easily parallelized since it is applied locally, thus the time necessary to process an image is further reduced. The performance of the proposed method is evaluated in the standard PASCAL 2007 Segmentation Challenge object segmentation dataset.
Address
Corporate Author Thesis
Publisher IOS Press Amsterdam, Place of Publication Editor In R.Alquezar, A.Moreno, J.Aguilar.
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (down) 9781607506423 Medium
Area Expedition Conference CCIA
Notes ADAS Approved no
Call Number Admin @ si @ ARL2010b Serial 1417
Permanent link to this record
 

 
Author N. Serrano; L. Tarazon; D. Perez; Oriol Ramos Terrades; S. Juan
Title The GIDOC Prototype Type Conference Article
Year 2010 Publication 10th International Workshop on Pattern Recognition in Information Systems Abbreviated Journal
Volume Issue Pages 82-89
Keywords
Abstract Transcription of handwritten text in (old) documents is an important, time-consuming task for digital libraries. It might be carried out by first processing all document images off-line, and then manually supervising system transcriptions to edit incorrect parts. However, current techniques for automatic page layout analysis, text line detection and handwriting recognition are still far from perfect, and thus post-editing system output is not clearly better than simply ignoring it.

A more effective approach to transcribe old text documents is to follow an interactive- predictive paradigm in which both, the system is guided by the user, and the user is assisted by the system to complete the transcription task as efficiently as possible. Following this approach, a system prototype called GIDOC (Gimp-based Interactive transcription of old text DOCuments) has been developed to provide user-friendly, integrated support for interactive-predictive layout analysis, line detection and handwriting transcription.

GIDOC is designed to work with (large) collections of homogeneous documents, that is, of similar structure and writing styles. They are annotated sequentially, by (par- tially) supervising hypotheses drawn from statistical models that are constantly updated with an increasing number of available annotated documents. And this is done at different annotation levels. For instance, at the level of page layout analysis, GIDOC uses a novel text block detection method in which conventional, memoryless techniques are improved with a “history” model of text block positions. Similarly, at the level of text line image transcription, GIDOC includes a handwriting recognizer which is steadily improved with a growing number of (partially) supervised transcriptions.
Address Funchal, Portugal
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (down) 978-989-8425-14-0 Medium
Area Expedition Conference PRIS
Notes DAG Approved no
Call Number Admin @ si @ STP2010 Serial 1868
Permanent link to this record
 

 
Author Thierry Brouard; A. Delaplace; Muhammad Muzzamil Luqman; H. Cardot; Jean-Yves Ramel
Title Design of Evolutionary Methods Applied to the Learning of Bayesian Nerwork Structures Type Book Chapter
Year 2010 Publication Bayesian Network Abbreviated Journal
Volume Issue Pages 13-37
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Sciyo Place of Publication Editor Ahmed Rebai
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (down) 978-953-307-124-4 Medium
Area Expedition Conference
Notes Approved no
Call Number Admin @ si @ BDL2010 Serial 1461
Permanent link to this record
 

 
Author Robert Benavente; C. Alejandro Parraga; Maria Vanrell
Title La influencia del contexto en la definicion de las fronteras entre las categorias cromaticas Type Conference Article
Year 2010 Publication 9th Congreso Nacional del Color Abbreviated Journal
Volume Issue Pages 92–95
Keywords Categorización del color; Apariencia del color; Influencia del contexto; Patrones de Mondrian; Modelos paramétricos
Abstract En este artículo presentamos los resultados de un experimento de categorización de color en el que las muestras se presentaron sobre un fondo multicolor (Mondrian) para simular los efectos del contexto. Los resultados se comparan con los de un experimento previo que, utilizando un paradigma diferente, determinó las fronteras sin tener en cuenta el contexto. El análisis de los resultados muestra que las fronteras obtenidas con el experimento en contexto presentan menos confusión que las obtenidas en el experimento sin contexto.
Address Alicante (Spain)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (down) 978-84-9717-144-1 Medium
Area Expedition Conference CNC
Notes CIC Approved no
Call Number CAT @ cat @ BPV2010 Serial 1327
Permanent link to this record
 

 
Author Ignasi Rius
Title Motion Priors for Efficient Bayesian Tracking in Human Sequence Evaluation Type Book Whole
Year 2010 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Recovering human motion by visual analysis is a challenging computer vision research
area with a lot of potential applications. Model-based tracking approaches, and in
particular particle lters, formulate the problem as a Bayesian inference task whose
aim is to sequentially estimate the distribution of the parameters of a human body
model over time. These approaches strongly rely on good dynamical and observation
models to predict and update congurations of the human body according to measurements from the image data. However, it is very dicult to design observation
models which extract useful and reliable information from image sequences robustly.
This results specially challenging in monocular tracking given that only one viewpoint
from the scene is available. Therefore, to overcome these limitations strong motion
priors are needed to guide the exploration of the state space.
The work presented in this Thesis is aimed to retrieve the 3D motion parameters
of a human body model from incomplete and noisy measurements of a monocular
image sequence. These measurements consist of the 2D positions of a reduced set of
joints in the image plane. Towards this end, we present a novel action-specic model
of human motion which is trained from several databases of real motion-captured
performances of an action, and is used as a priori knowledge within a particle ltering
scheme.
Body postures are represented by means of a simple and compact stick gure
model which uses direction cosines to represent the direction of body limbs in the 3D
Cartesian space. Then, for a given action, Principal Component Analysis is applied to
the training data to perform dimensionality reduction over the highly correlated input
data. Before the learning stage of the action model, the input motion performances
are synchronized by means of a novel dense matching algorithm based on Dynamic
Programming. The algorithm synchronizes all the motion sequences of the same
action class, nding an optimal solution in real-time.
Then, a probabilistic action model is learnt, based on the synchronized motion
examples, which captures the variability and temporal evolution of full-body motion
within a specic action. In particular, for each action, the parameters learnt are: a
representative manifold for the action consisting of its mean performance, the standard deviation from the mean performance, the mean observed direction vectors from
each motion subsequence of a given length and the expected error at a given time
instant.
Subsequently, the action-specic model is used as a priori knowledge on human
motion which improves the eciency and robustness of the overall particle filtering tracking framework. First, the dynamic model guides the particles according to similar
situations previously learnt. Then, the state space is constrained so only feasible
human postures are accepted as valid solutions at each time step. As a result, the
state space is explored more eciently as the particle set covers the most probable
body postures.
Finally, experiments are carried out using test sequences from several motion
databases. Results point out that our tracker scheme is able to estimate the rough
3D conguration of a full-body model providing only the 2D positions of a reduced
set of joints. Separate tests on the sequence synchronization method and the subsequence probabilistic matching technique are also provided.
Address
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Jordi Gonzalez;Xavier Roca
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (down) 978-84-937261-9-5 Medium
Area Expedition Conference
Notes Approved no
Call Number Admin @ si @ Riu2010 Serial 1331
Permanent link to this record
 

 
Author Jose Manuel Alvarez
Title Combining Context and Appearance for Road Detection Type Book Whole
Year 2010 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Road traffic crashes have become a major cause of death and injury throughout the world.
Hence, in order to improve road safety, the automobile manufacture is moving towards the
development of vehicles with autonomous functionalities such as keeping in the right lane, safe distance keeping between vehicles or regulating the speed of the vehicle according to the traffic conditions. A key component of these systems is vision–based road detection that aims to detect the free road surface ahead the moving vehicle. Detecting the road using a monocular vision system is very challenging since the road is an outdoor scenario imaged from a mobile platform. Hence, the detection algorithm must be able to deal with continuously changing imaging conditions such as the presence ofdifferent objects (vehicles, pedestrians), different environments (urban, highways, off–road), different road types (shape, color), and different imaging conditions (varying illumination, different viewpoints and changing weather conditions). Therefore, in this thesis, we focus on vision–based road detection using a single color camera. More precisely, we first focus on analyzing and grouping pixels according to their low–level properties. In this way, two different approaches are presented to exploit
color and photometric invariance. Then, we focus the research of the thesis on exploiting context information. This information provides relevant knowledge about the road not using pixel features from road regions but semantic information from the analysis of the scene.
In this way, we present two different approaches to infer the geometry of the road ahead
the moving vehicle. Finally, we focus on combining these context and appearance (color)
approaches to improve the overall performance of road detection algorithms. The qualitative and quantitative results presented in this thesis on real–world driving sequences show that the proposed method is robust to varying imaging conditions, road types and scenarios going beyond the state–of–the–art.
Address
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Antonio Lopez;Theo Gevers
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (down) 978-84-937261-8-8 Medium
Area Expedition Conference
Notes ADAS Approved no
Call Number Admin @ si @ Alv2010 Serial 1454
Permanent link to this record
 

 
Author Partha Pratim Roy
Title Multi-Oriented and Multi-Scaled Text Character Analysis and Recognition in Graphical Documents and their Applications to Document Image Retrieval Type Book Whole
Year 2010 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract With the advent research of Document Image Analysis and Recognition (DIAR), an
important line of research is explored on indexing and retrieval of graphics rich documents. It aims at finding relevant documents relying on segmentation and recognition
of text and graphics components underlying in non-standard layout where commercial
OCRs can not be applied due to complexity. This thesis is focused towards text information extraction approaches in graphical documents and retrieval of such documents
using text information.
Automatic text recognition in graphical documents (map, engineering drawing,
etc.) involves many challenges because text characters are usually printed in multioriented and multi-scale way along with different graphical objects. Text characters
are used to annotate the graphical curve lines and hence, many times they follow
curvi-linear paths too. For OCR of such documents, individual text lines and their
corresponding words/characters need to be extracted.
For recognition of multi-font, multi-scale and multi-oriented characters, we have
proposed a feature descriptor for character shape using angular information from contour pixels to take care of the invariance nature. To improve the efficiency of OCR, an
approach towards the segmentation of multi-oriented touching strings into individual
characters is also discussed. Convex hull based background information is used to
segment a touching string into possible primitive segments and later these primitive
segments are merged to get optimum segmentation using dynamic programming. To
overcome the touching/overlapping problem of text with graphical lines, a character
spotting approach using SIFT and skeleton information is included. Afterwards, we
propose a novel method to extract individual curvi-linear text lines using the foreground and background information of the characters of the text and a water reservoir
concept is used to utilize the background information.
We have also formulated the methodologies for graphical document retrieval applications using query words and seals. The retrieval approaches are performed using
recognition results of individual components in the document. Given a query text,
the system extracts positional knowledge from the query word and uses the same to
generate hypothetical locations in the document. Indexing of documents is also performed based on automatic detection of seals from documents containing cluttered
background. A seal is characterized by scale and rotation invariant spatial feature
descriptors computed from labelled text characters and a concept based on the Generalized Hough Transform is used to locate the seal in documents.
Address
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Josep Llados;Umapada Pal
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (down) 978-84-937261-7-1 Medium
Area Expedition Conference
Notes Approved no
Call Number Admin @ si @ Roy2010 Serial 1455
Permanent link to this record
 

 
Author Joan Mas
Title A Syntactic Pattern Recognition Approach based on a Distribution Tolerant Adjacency Grammar and a Spatial Indexed Parser. Application to Sketched Document Recognition Type Book Whole
Year 2010 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Sketch recognition is a discipline which has gained an increasing interest in the last
20 years. This is due to the appearance of new devices such as PDA, Tablet PC’s
or digital pen & paper protocols. From the wide range of sketched documents we
focus on those that represent structured documents such as: architectural floor-plans,
engineering drawing, UML diagrams, etc. To recognize and understand these kinds
of documents, first we have to recognize the different compounding symbols and then
we have to identify the relations between these elements. From the way that a sketch
is captured, there are two categories: on-line and off-line. On-line input modes refer
to draw directly on a PDA or a Tablet PC’s while off-line input modes refer to scan
a previously drawn sketch.
This thesis is an overlapping of three different areas on Computer Science: Pattern
Recognition, Document Analysis and Human-Computer Interaction. The aim of this
thesis is to interpret sketched documents independently on whether they are captured
on-line or off-line. For this reason, the proposed approach should contain the following
features. First, as we are working with sketches the elements present in our input
contain distortions. Second, as we would work in on-line or off-line input modes, the
order in the input of the primitives is indifferent. Finally, the proposed method should
be applied in real scenarios, its response time must be slow.
To interpret a sketched document we propose a syntactic approach. A syntactic
approach is composed of two correlated components: a grammar and a parser. The
grammar allows describing the different elements on the document as well as their
relations. The parser, given a document checks whether it belongs to the language
generated by the grammar or not. Thus, the grammar should be able to cope with
the distortions appearing on the instances of the elements. Moreover, it would be
necessary to define a symbol independently of the order of their primitives. Concerning to the parser when analyzing 2D sentences, it does not assume an order in the
primitives. Then, at each new primitive in the input, the parser searches among the
previous analyzed symbols candidates to produce a valid reduction.
Taking into account these features, we have proposed a grammar based on Adjacency Grammars. This kind of grammars defines their productions as a multiset
of symbols rather than a list. This allows describing a symbol without an order in
their components. To cope with distortion we have proposed a distortion model.
This distortion model is an attributed estimated over the constraints of the grammar and passed through the productions. This measure gives an idea on how far is the
symbol from its ideal model. In addition to the distortion on the constraints other
distortions appear when working with sketches. These distortions are: overtracing,
overlapping, gaps or spurious strokes. Some grammatical productions have been defined to cope with these errors. Concerning the recognition, we have proposed an
incremental parser with an indexation mechanism. Incremental parsers analyze the
input symbol by symbol given a response to the user when a primitive is analyzed.
This makes incremental parser suitable to work in on-line as well as off-line input
modes. The parser has been adapted with an indexation mechanism based on a spatial division. This indexation mechanism allows setting the primitives in the space
and reducing the search to a neighbourhood.
A third contribution is a grammatical inference algorithm. This method given a
set of symbols captures the production describing it. In the field of formal languages,
different approaches has been proposed but in the graphical domain not so much work
is done in this field. The proposed method is able to capture the production from
a set of symbol although they are drawn in different order. A matching step based
on the Haussdorff distance and the Hungarian method has been proposed to match
the primitives of the different symbols. In addition the proposed approach is able to
capture the variability in the parameters of the constraints.
From the experimental results, we may conclude that we have proposed a robust
approach to describe and recognize sketches. Moreover, the addition of new symbols
to the alphabet is not restricted to an expert. Finally, the proposed approach has
been used in two real scenarios obtaining a good performance.
Address
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Gemma Sanchez;Josep Llados
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (down) 978-84-937261-4-0 Medium
Area Expedition Conference
Notes DAG Approved no
Call Number DAG @ dag @ Mas2010 Serial 1334
Permanent link to this record
 

 
Author Ivan Huerta
Title Foreground Object Segmentation and Shadow Detection for Video Sequences in Uncontrolled Environments Type Book Whole
Year 2010 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract This Thesis is mainly divided in two parts. The first one presents a study of motion
segmentation problems. Based on this study, a novel algorithm for mobile-object
segmentation from a static background scene is also presented. This approach is
demonstrated robust and accurate under most of the common problems in motion
segmentation. The second one tackles the problem of shadows in depth. Firstly, a
bottom-up approach based on a chromatic shadow detector is presented to deal with
umbra shadows. Secondly, a top-down approach based on a tracking system has been
developed in order to enhance the chromatic shadow detection.
In our first contribution, a case analysis of motion segmentation problems is presented by taking into account the problems associated with different cues, namely
colour, edge and intensity. Our second contribution is a hybrid architecture which
handles the main problems observed in such a case analysis, by fusing (i) the knowledge from these three cues and (ii) a temporal difference algorithm. On the one hand,
we enhance the colour and edge models to solve both global/local illumination changes
(shadows and highlights) and camouflage in intensity. In addition, local information is
exploited to cope with a very challenging problem such as the camouflage in chroma.
On the other hand, the intensity cue is also applied when colour and edge cues are not
available, such as when beyond the dynamic range. Additionally, temporal difference
is included to segment motion when these three cues are not available, such as that
background not visible during the training period. Lastly, the approach is enhanced
for allowing ghost detection. As a result, our approach obtains very accurate and robust motion segmentation in both indoor and outdoor scenarios, as quantitatively and
qualitatively demonstrated in the experimental results, by comparing our approach
with most best-known state-of-the-art approaches.
Motion Segmentation has to deal with shadows to avoid distortions when detecting
moving objects. Most segmentation approaches dealing with shadow detection are
typically restricted to penumbra shadows. Therefore, such techniques cannot cope
well with umbra shadows. Consequently, umbra shadows are usually detected as part
of moving objects.
Firstly, a bottom-up approach for detection and removal of chromatic moving
shadows in surveillance scenarios is proposed. Secondly, a top-down approach based
on kalman filters to detect and track shadows has been developed in order to enhance
the chromatic shadow detection. In the Bottom-up part, the shadow detection approach applies a novel technique based on gradient and colour models for separating
chromatic moving shadows from moving objects.
Well-known colour and gradient models are extended and improved into an invariant colour cone model and an invariant gradient model, respectively, to perform
automatic segmentation while detecting potential shadows. Hereafter, the regions corresponding to potential shadows are grouped by considering ”a bluish effect” and an
edge partitioning. Lastly, (i) temporal similarities between local gradient structures
and (ii) spatial similarities between chrominance angle and brightness distortions are
analysed for all potential shadow regions in order to finally identify umbra shadows.
In the top-down process, after detection of objects and shadows both are tracked
using Kalman filters, in order to enhance the chromatic shadow detection, when it
fails to detect a shadow. Firstly, this implies a data association between the blobs
(foreground and shadow) and Kalman filters. Secondly, an event analysis of the different data association cases is performed, and occlusion handling is managed by a
Probabilistic Appearance Model (PAM). Based on this association, temporal consistency is looked for the association between foregrounds and shadows and their
respective Kalman Filters. From this association several cases are studied, as a result
lost chromatic shadows are correctly detected. Finally, the tracking results are used
as feedback to improve the shadow and object detection.
Unlike other approaches, our method does not make any a-priori assumptions
about camera location, surface geometries, surface textures, shapes and types of
shadows, objects, and background. Experimental results show the performance and
accuracy of our approach in different shadowed materials and illumination conditions.
Address
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Jordi Gonzalez;Xavier Roca
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (down) 978-84-937261-3-3 Medium
Area Expedition Conference
Notes Approved no
Call Number ISE @ ise @ Hue2010 Serial 1332
Permanent link to this record
 

 
Author Carles Fernandez
Title Understanding Image Sequences: the Role of Ontologies in Cognitive Vision Type Book Whole
Year 2010 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract The increasing ubiquitousness of digital information in our daily lives has positioned
video as a favored information vehicle, and given rise to an astonishing generation of
social media and surveillance footage. This raises a series of technological demands
for automatic video understanding and management, which together with the compromising attentional limitations of human operators, have motivated the research
community to guide its steps towards a better attainment of such capabilities. As
a result, current trends on cognitive vision promise to recognize complex events and
self-adapt to different environments, while managing and integrating several types of
knowledge. Future directions suggest to reinforce the multi-modal fusion of information sources and the communication with end-users.
In this thesis we tackle the problem of recognizing and describing meaningful
events in video sequences from different domains, and communicating the resulting
knowledge to end-users by means of advanced interfaces for human–computer interaction. This problem is addressed by designing the high-level modules of a cognitive
vision framework exploiting ontological knowledge. Ontologies allow us to define the
relevant concepts in a domain and the relationships among them; we prove that the
use of ontologies to organize, centralize, link, and reuse different types of knowledge
is a key factor in the materialization of our objectives.
The proposed framework contributes to: (i) automatically learn the characteristics
of different scenarios in a domain; (ii) reason about uncertain, incomplete, or vague
information from visual –camera’s– or linguistic –end-user’s– inputs; (iii) derive plausible interpretations of complex events from basic spatiotemporal developments; (iv)
facilitate natural interfaces that adapt to the needs of end-users, and allow them to
communicate efficiently with the system at different levels of interaction; and finally,
(v) find mechanisms to guide modeling processes, maintain and extend the resulting
models, and to exploit multimodal resources synergically to enhance the former tasks.
We describe a holistic methodology to achieve these goals. First, the use of prior
taxonomical knowledge is proved useful to guide MAP-MRF inference processes in
the automatic identification of semantic regions, with independence of a particular scenario. Towards the recognition of complex video events, we combine fuzzy
metric-temporal reasoning with SGTs, thus assessing high-level interpretations from
spatiotemporal data. Here, ontological resources like T–Boxes, onomasticons, or factual databases become useful to derive video indexing and retrieval capabilities, and
also to forward highlighted content to smart user interfaces. There, we explore the
application of ontologies to discourse analysis and cognitive linguistic principles, or scene augmentation techniques towards advanced communication by means of natural language dialogs and synthetic visualizations. Ontologies become fundamental to
coordinate, adapt, and reuse the different modules in the system.
The suitability of our ontological framework is demonstrated by a series of applications that especially benefit the field of smart video surveillance, viz. automatic generation of linguistic reports about the content of video sequences in multiple natural
languages; content-based filtering and summarization of these reports; dialogue-based
interfaces to query and browse video contents; automatic learning of semantic regions
in a scenario; and tools to evaluate the performance of components and models in the
system, via simulation and augmented reality.
Address
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Jordi Gonzalez;Xavier Roca
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN (down) 978-84-937261-2-6 Medium
Area Expedition Conference
Notes Approved no
Call Number Admin @ si @ Fer2010a Serial 1333
Permanent link to this record