Publicacions CVC -- Query Results

	31–45 of 199 records found matching your query:


	Records
	Author	Jaume Garcia
	Title	Statistical Models of the Architecture and Function of the Left Ventricle			Type	Book Whole
	Year	2009	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Cardiovascular diseases, especially those affecting the Left Ventricle (LV), are the leading cause of death in developed countries, accounting for approximately 30% of all global deaths. In order to address this public health concern, physicians focus on diagnosis and therapy planning. On one hand, early and accurate detection of Regional Wall Motion Abnormalities (RWMA) significantly contributes to a quick diagnosis and prevents the patient from reaching more severe stages. On the other hand, a thorough knowledge of the normal gross anatomy of the LV, as well as the distribution of its muscular fibers, is crucial for designing specific interventions and therapies (such as pacemaker implantation). Statistical models obtained from the analysis of different imaging modalities allow the computation of the normal ranges of variation within a given population. Normality models are a valuable tool for the definition of objective criteria quantifying the degree of (anomalous) deviation of the LV function and anatomy for a given subject. The creation of statistical models involves addressing three main issues: extraction of data from images, definition of a common domain for comparison of data across patients, and design of appropriate statistical analysis schemes. In this PhD thesis we present generic image processing tools for the creation of statistical models of the LV anatomy and function. On one hand, we use differential geometry concepts to define a computational framework (the Normalized Parametric Domain, NPD) suitable for the comparison and fusion of several clinical scores obtained over the LV. On the other hand, we present a variational approach (the Harmonic Phase Flow, HPF) for the estimation of myocardial motion that provides dense and continuous vector fields without overestimating motion at injured areas. These tools are used for the creation of statistical models. 
Regarding anatomy, we obtain an atlas jointly modelling both LV gross anatomy and fiber architecture. Regarding function, we compute normality patterns of scores characterizing the (global and local) LV function and explore, for the first time, the configuration of local scores better suited for RWMA detection.
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Debora Gil
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	IAM			Approved	no
	Call Number	IAM @ iam @ Gar2009a			Serial	1499



	Author	Javier Vazquez
	Title	Colour Constancy in Natural Images Through Colour Naming and Sensor Sharpening			Type	Book Whole
	Year	2011	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Colour is derived from three physical properties: incident light, object reflectance and sensor sensitivities. Incident light varies under natural conditions; hence, recovering the scene illuminant is an important issue in computational colour. One way to deal with this problem under calibrated conditions is by following three steps: 1) building a narrow-band sensor basis to accomplish the diagonal model, 2) building a feasible set of illuminants, and 3) defining criteria to select the best illuminant. In this work we focus on colour constancy for natural images by introducing perceptual criteria in the first and third stages. To deal with the illuminant selection step, we hypothesise that basic colour categories can be used as anchor categories to recover the best illuminant. These colour names are related to the way that the human visual system has evolved to encode relevant natural colour statistics. Therefore the recovered image provides the best representation of the scene labelled with the basic colour terms. We demonstrate with several experiments how this selection criterion achieves current state-of-the-art results in computational colour constancy. In addition to this result, we psychophysically prove that the usual angular error used in colour constancy does not correlate with human preferences, and we propose a new perceptual colour constancy evaluation. The implementation of this selection criterion strongly relies on the use of a diagonal model for illuminant change. Consequently, the second contribution focuses on building an appropriate narrow-band sensor basis to represent natural images. We propose to use the spectral sharpening technique to compute a unique narrow-band basis optimised to represent a large set of natural reflectances under natural illuminants and given in the basis of human cones. 
The proposed sensors allow predicting unique hues and the World Color Survey data independently of the illuminant by using a compact singularity function. Additionally, we studied different families of sharp sensors to minimise different perceptual measures. This study led us to extend the spherical sampling procedure from 3D to 6D. Several research lines still remain open. One natural extension would be to measure the effects of using the computed sharp sensors on the category hypothesis, while another might be to insert spatial contextual information to improve the category hypothesis. Finally, much work still needs to be done to explore how individual sensors can be adjusted to the colours in a scene.
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Maria Vanrell;Graham D. Finlayson
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	CIC			Approved	no
	Call Number	Admin @ si @ Vaz2011a			Serial	1785



	Author	Ferran Diego
	Title	Probabilistic Alignment of Video Sequences Recorded by Moving Cameras			Type	Book Whole
	Year	2011	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Video alignment consists of integrating multiple video sequences recorded independently into a single video sequence. This means registering them both in time (synchronizing frames) and in space (image registration) so that the two video sequences can be fused or compared pixel-wise. Although video alignment is relatively unknown, many applications today may benefit from the availability of robust and efficient video alignment methods. For instance, video surveillance requires integrating video sequences recorded of the same scene at different times in order to detect changes. The problem of aligning videos has been addressed before, but in the relatively simple cases of fixed or rigidly attached cameras and simultaneous acquisition. In addition, most works rely on restrictive assumptions which reduce its difficulty, such as linear time correspondence or the knowledge of the complete trajectories of corresponding scene points on the images; to some extent, these assumptions limit the practical applicability of the solutions developed until now. In this thesis, we focus on the challenging problem of aligning sequences recorded at different times from independent moving cameras following similar but not coincident trajectories. More precisely, this thesis covers four studies that advance the state-of-the-art in video alignment. First, we focus on analyzing and developing a probabilistic framework for video alignment, that is, a principled way to integrate multiple observations and prior information. In this way, two different approaches are presented to exploit the combination of several purely visual features (image intensities, visual words and dense motion field descriptor), and global positioning system (GPS) information. Second, we focus on reformulating the problem into a single alignment framework, since previous works on video alignment adopt a divide-and-conquer strategy, i.e., first solve the synchronization, and then register corresponding frames. 
This also generalizes the 'classic' case of fixed geometric transform and linear time mapping. Third, we focus on exploiting directly the time domain of the video sequences in order to avoid exhaustive cross-frame search. This provides relevant information used for learning the temporal mapping between pairs of video sequences. Finally, we focus on adapting these methods to the on-line setting for road detection and vehicle geolocation. The qualitative and quantitative results presented in this thesis on a variety of real-world pairs of video sequences show that the proposed method is robust to varying imaging conditions, different image content (e.g., incoming and outgoing vehicles), variations in camera velocity, and different scenarios (indoor and outdoor), going beyond the state-of-the-art. Moreover, the on-line video alignment has been successfully applied for road detection and vehicle geolocation, achieving promising results.
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Joan Serrat
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	ADAS			Approved	no
	Call Number	Admin @ si @ Die2011			Serial	1787



	Author	Eduard Vazquez
	Title	Unsupervised image segmentation based on material reflectance description and saliency			Type	Book Whole
	Year	2011	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Image segmentation aims to partition an image into a set of non-overlapping regions, called segments. Despite the simplicity of the definition, image segmentation arises as a very complex problem in all its stages. The definition of segment is still unclear. When a human is asked to perform a segmentation, this person segments at different levels of abstraction. Some segments might be a single, well-defined texture whereas others correspond to an object in the scene which might include multiple textures and colors. For this reason, segmentation is divided into bottom-up segmentation and top-down segmentation. Bottom-up segmentation is problem-independent, that is, focused on general properties of the images such as textures or illumination. Top-down segmentation is a problem-dependent approach which looks for specific entities in the scene, such as known objects. This work is focused on bottom-up segmentation. Beginning from the analysis of the limitations of current methods, we propose an approach called RAD. Our approach overcomes the main shortcomings of those methods which use the physics of light to perform the segmentation. RAD is a topological approach which describes a single-material reflectance. Afterwards, we cope with one of the main problems in image segmentation: non-supervised adaptability to image content. To yield a non-supervised method, we use a model of saliency also presented in this thesis. It computes the saliency of the chromatic transitions of an image by means of a statistical analysis of the image derivatives. This method of saliency is used to build our final approach of segmentation: spRAD. This method is a non-supervised segmentation approach. Our saliency approach has been validated with a psychophysical experiment as well as computationally, outperforming a state-of-the-art saliency method. spRAD also outperforms state-of-the-art segmentation techniques, as results obtained with a widely-used segmentation dataset show.
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher		Place of Publication		Editor	Ramon Baldrich
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	CIC			Approved	no
	Call Number	Admin @ si @ Vaz2011b			Serial	1835



	Author	Santiago Segui
	Title	Contributions to the Diagnosis of Intestinal Motility by Automatic Image Analysis			Type	Book Whole
	Year	2011	Publication	PhD Thesis, Universitat de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	In the early twenty-first century Given Imaging Ltd. presented wireless capsule endoscopy (WCE) as a new technological breakthrough that allowed the visualization of the intestine by using a small, swallowed camera. This small device was received with great enthusiasm within the medical community, and it is still one of the medical devices with the highest use growth rate. WCE can be used as a novel diagnostic tool that presents several clinical advantages, since it is non-invasive and at the same time it provides, for the first time, a full picture of the small bowel morphology, contents and dynamics. Since its appearance, WCE has been used to detect several intestinal dysfunctions such as polyps, ulcers and bleeding. However, the visual analysis of WCE videos presents an important drawback: the long time required by the physicians for proper video visualization. To address this limitation, the development of computer-aided systems is required for the extensive use of WCE in the medical community. The work presented in this thesis is a set of contributions for the automatic image analysis and computer-aided diagnosis of intestinal motility disorders using WCE. Until now, the diagnosis of small bowel motility dysfunctions was basically performed by invasive techniques such as the manometry test, which can only be conducted at some referral centers around the world owing to the complexity of the procedure and the medical expertise required in the interpretation of the results. Our contributions are divided into three main blocks: 1. Image analysis by computer vision techniques to detect events in the endoluminal WCE scene. Several methods have been proposed to detect visual events such as intestinal contractions, intestinal content, tunnel and wrinkles; 2. Machine learning techniques for the analysis and the manipulation of the data from WCE. 
These methods have been proposed in order to overcome the problems that the analysis of WCE presents, such as video acquisition cost, unlabeled data and large amounts of data; 3. Two different systems for the computer-aided diagnosis of intestinal motility disorders using WCE. The first system presents a fully automatic method that aids in discriminating healthy subjects from patients with severe intestinal motor disorders like pseudo-obstruction or food intolerance. The second system presents another automatic method that models healthy subjects and discriminates them from patients with mild intestinal motility disorders.
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Jordi Vitria
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	MILAB			Approved	no
	Call Number	Admin @ si @ Seg2011			Serial	1836



	Author	Pierluigi Casale
	Title	Approximate Ensemble Methods for Physical Activity Recognition Applications			Type	Book Whole
	Year	2011	Publication	PhD Thesis, Universitat de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	The main interest of this thesis focuses on computational methodologies able to reduce the degree of complexity of learning algorithms and their application to physical activity recognition. Random Projections will be used to reduce the computational complexity in Multiple Classifier Systems. A new boosting algorithm and a new one-class classification methodology have been developed. In both cases, random projections are used for reducing the dimensionality of the problem and for generating diversity, exploiting in this way the benefits that ensembles of classifiers provide in terms of performance and stability. Moreover, the new one-class classification methodology, based on an ensemble strategy able to approximate a multidimensional convex hull, has been shown to outperform state-of-the-art one-class classification methodologies. The practical focus of the thesis is towards Physical Activity Recognition. A new hardware platform for wearable computing applications has been developed and used for collecting data of activities of daily living, allowing the study of the optimal feature set for successfully classifying activities. Based on the classification methodologies developed and the study conducted on physical activity classification, a machine learning architecture capable of providing a continuous authentication mechanism for mobile-device users has been worked out as the last part of the thesis. The system, based on a personalized classifier, relies on the analysis of the characteristic gait patterns typical of each individual, ensuring an unobtrusive and continuous authentication mechanism.
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Oriol Pujol;Petia Radeva
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	MILAB			Approved	no
	Call Number	Admin @ si @ Cas2011			Serial	1837



	Author	Fahad Shahbaz Khan
	Title	Coloring bag-of-words based image representations			Type	Book Whole
	Year	2011	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Put succinctly, the bag-of-words based image representation is the most successful approach for object and scene recognition. Within the bag-of-words framework the optimal fusion of multiple cues, such as shape, texture and color, still remains an active research domain. There exist two main approaches to combine color and shape information within the bag-of-words framework. The first approach, called early fusion, fuses color and shape at the feature level, as a result of which a joint color-shape vocabulary is produced. The second approach, called late fusion, concatenates histogram representations of both color and shape, obtained independently. In the first part of this thesis, we analyze the theoretical implications of both early and late feature fusion. We demonstrate that both these approaches are suboptimal for a subset of object categories. Consequently, we propose a novel method for recognizing object categories when using multiple cues by separately processing the shape and color cues and combining them by modulating the shape features by category-specific color attention. Color is used to compute bottom-up and top-down attention maps. Subsequently, the color attention maps are used to modulate the weights of the shape features. Shape features are given more weight in regions with higher attention and vice versa. The approach is tested on several benchmark object recognition data sets and the results clearly demonstrate the effectiveness of our proposed method. In the second part of the thesis, we investigate the problem of obtaining compact spatial pyramid representations for object and scene recognition. Spatial pyramids have been successfully applied to incorporate spatial information into bag-of-words based image representation. However, a major drawback of spatial pyramids is that they lead to high dimensional image representations. We present a novel framework for obtaining compact pyramid representation. 
The approach reduces the size of a high dimensional pyramid representation by up to an order of magnitude without any significant reduction in accuracy. Moreover, we also investigate the optimal combination of multiple features such as color and shape within the context of our compact pyramid representation. Finally, we describe a novel technique to build discriminative visual words from multiple cues learned independently from training images. To this end, we use an information theoretic vocabulary compression technique to find discriminative combinations of visual cues; the resulting visual vocabulary is compact, has the cue binding property, and supports individual weighting of cues in the final image representation. The approach is tested on standard object recognition data sets. The results obtained clearly demonstrate the effectiveness of our approach.
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher		Place of Publication		Editor	Joost Van de Weijer;Maria Vanrell
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	CIC			Approved	no
	Call Number	Admin @ si @ Kha2011			Serial	1838



	Author	Muhammad Anwer Rao
	Title	Color for Object Detection and Action Recognition			Type	Book Whole
	Year	2013	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Recognizing object categories in real world images is a challenging problem in computer vision. The deformable part based framework is currently the most successful approach for object detection. Generally, HOG features are used for image representation within the part-based framework. For action recognition, the bag-of-words framework has been shown to provide promising results. Within the bag-of-words framework, local image patches are described by the SIFT descriptor. Contrary to object detection and action recognition, combining color and shape has been shown to provide the best performance for object and scene recognition. In the first part of this thesis, we analyze the problem of person detection in still images. Standard person detection approaches rely on intensity based features for image representation while ignoring color. Channel based descriptors are one of the most commonly used approaches in object recognition. This inspires us to evaluate incorporating color information using the channel based fusion approach for the task of person detection. In the second part of the thesis, we investigate the problem of object detection in still images. Due to high dimensionality, channel based fusion increases the computational cost. Moreover, channel based fusion has been found to obtain inferior results for object categories where one of the visual cues varies significantly. On the other hand, late fusion is known to provide improved results for a wide range of object categories. A consequence of the late fusion strategy is the need for a pure color descriptor. Therefore, we propose to use color attributes as an explicit color representation for object detection. Color attributes are compact and computationally efficient. Consequently, color attributes are combined with traditional shape features, providing excellent results for the object detection task. Finally, we focus on the problem of action detection and classification in still images. 
We investigate the potential of color for action classification and detection in still images. We also evaluate different fusion approaches for combining color and shape information for action recognition. Additionally, an analysis is performed to validate the contribution of color for action recognition. Our results clearly demonstrate that combining color and shape information significantly improves the performance of both action classification and detection in still images.
	Address	Barcelona
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Antonio Lopez;Joost Van de Weijer
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	ADAS			Approved	no
	Call Number	Admin @ si @ Rao2013			Serial	2281



	Author	Javier Marin
	Title	Pedestrian Detection Based on Local Experts			Type	Book Whole
	Year	2013	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	During the last decade vision-based human detection systems have started to play a key role in multiple applications linked to driver assistance, surveillance, robot sensing and home automation. Detecting humans is by far one of the most challenging tasks in Computer Vision. This is mainly due to the high degree of variability in the human appearance associated with clothing, pose, shape and size. Besides, other factors such as cluttered scenarios, partial occlusions, or environmental conditions can make the detection task even harder. The most promising state-of-the-art methods rely on discriminative learning paradigms which are fed with positive and negative examples. The training data is one of the most relevant elements in order to build a robust detector, as it has to cope with the large variability of the target. In order to create this dataset human supervision is required. The drawback at this point is the arduous effort of annotating as well as looking for such claimed variability. In this PhD thesis we address two recurrent problems in the literature. In the first stage, we aim to reduce the time-consuming task of annotating, namely, by using computer graphics. More concretely, we develop a virtual urban scenario for later generating a pedestrian dataset. Then, we train a detector using this dataset, and finally we assess if this detector can be successfully applied in a real scenario. In the second stage, we focus on increasing the robustness of our pedestrian detectors under partial occlusions. In particular, we present a novel occlusion handling approach to increase the performance of block-based holistic methods under partial occlusions. For this purpose, we make use of local experts via a Random Subspace Method (RSM) to handle these cases. If the method infers a possible partial occlusion, then the RSM, based on performance statistics obtained from partially occluded data, is applied. 
The last objective of this thesis is to propose a robust pedestrian detector based on an ensemble of local experts. To achieve this goal, we use the random forest paradigm, where the trees act as ensembles and their nodes are the local experts. In particular, each expert focuses on performing a robust classification of a pedestrian body patch. This approach offers computational efficiency and far less design complexity when compared to other state-of-the-art methods, while reaching better accuracy.
	Address	Barcelona
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Antonio Lopez;Jaume Amores
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	ADAS			Approved	no
	Call Number	Admin @ si @ Mar2013			Serial	2280



	Author	Wenjuan Gong
	Title	3D Motion Data aided Human Action Recognition and Pose Estimation			Type	Book Whole
	Year	2013	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	In this work, we explore human action recognition and pose estimation problems. Different from traditional works of learning from 2D images or video sequences and their annotated output, we seek to solve the problems with additional 3D motion capture information, which helps to fill the gap between 2D image features and human interpretations. We first compare two different schools of approaches commonly used for 3D pose estimation from 2D pose configuration: modeling and learning methods. By looking into experimental results and considering our problems, we settled on a learning method for pose estimation in the approaches that follow. We then establish a framework by adding a module for detecting 2D pose configuration from images with varied background, which widely extends the application of the approach. We also seek to directly estimate 3D poses from image features, instead of estimating 2D poses as an intermediate module. We explore a robust input feature which, combined with the proposed distance measure, provides a solution for noisy or corrupted inputs. We further utilize the above method to estimate weak poses, which are a concise representation of the original poses obtained by using dimensionality reduction techniques, from image features. Weak pose space is where we calculate vocabulary and label action types using a bag-of-words pipeline. Temporal information of an action is taken into consideration by considering several consecutive frames as a single unit for computing vocabulary and histogram assignments.
	Address	Barcelona
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Jordi Gonzalez;Xavier Roca
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	ISE			Approved	no
	Call Number	Admin @ si @ Gon2013			Serial	2279



	Author	Murad Al Haj
	Title	Looking at Faces: Detection, Tracking and Pose Estimation			Type	Book Whole
	Year	2013	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Humans can effortlessly perceive faces, follow them over space and time, and decode their rich content, such as pose, identity and expression. However, despite many decades of research on automatic facial perception in areas like face detection, expression recognition, pose estimation and face recognition, and despite many successes, a complete solution remains elusive. This thesis is dedicated to three problems in automatic face perception, namely face detection, face tracking and pose estimation. In face detection, an initial simple model is presented that uses pixel-based heuristics to segment skin locations and hand-crafted rules to determine the locations of the faces present in an image. Different colorspaces are studied to judge whether a colorspace transformation can aid skin color detection. The output of this study is used in the design of a more complex face detector that is able to successfully generalize to different scenarios. In face tracking, a framework that combines estimation and control in a joint scheme is presented to track a face with a single pan-tilt-zoom camera. While this work is mainly motivated by tracking faces, it can be easily applied atop any detector to track different objects. The applicability of this method is demonstrated on simulated as well as real-life scenarios. The last and most important part of this thesis is dedicated to monocular head pose estimation. In this part, a method based on partial least squares (PLS) regression is proposed to estimate pose and solve the alignment problem simultaneously. The contributions of this work are two-fold: 1) demonstrating that the proposed method achieves better than state-of-the-art results on the estimation problem and 2) developing a technique to reduce misalignment based on the learned PLS factors that outperforms multiple instance learning (MIL) without the need for any re-training or the inclusion of misaligned samples in the training process, as normally done in MIL.
	Address	Barcelona
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Jordi Gonzalez;Xavier Roca
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	ISE			Approved	no
	Call Number	Admin @ si @ Haj2013			Serial	2278



	Author	Albert Gordo
	Title	Document Image Representation, Classification and Retrieval in Large-Scale Domains			Type	Book Whole
	Year	2013	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Despite the “paperless office” ideal that emerged in the 1970s, businesses still struggle with an increasing amount of paper documentation. Companies still receive huge amounts of paper documentation that need to be analyzed and processed, mostly in a manual way. A solution to this task consists of, first, automatically scanning the incoming documents. Then, document images can be analyzed and information can be extracted from the data. Documents can also be automatically dispatched to the appropriate workflows, used to retrieve similar documents in the dataset to transfer information, etc. Due to the nature of this “digital mailroom”, we need document representation methods to be general, i.e., able to cope with very different types of documents. We need the methods to be sound, i.e., able to cope with unexpected types of documents, noise, etc. And we need the methods to be scalable, i.e., able to cope with thousands or millions of documents that need to be processed, stored, and consulted. Unfortunately, current techniques of document representation, classification and retrieval are not suited to this digital mailroom framework, since they do not fulfill some or all of these requirements. In this thesis we focus on the problem of document representation aimed at classification and retrieval tasks under this digital mailroom framework. We first propose a novel document representation based on runlength histograms, and extend it to cope with more complex documents such as multiple-page documents, or documents that contain more sources of information such as extracted OCR text. Then we focus on the scalability requirements and propose a novel binarization method which we dubbed PCAE, as well as two general asymmetric distances between binary embeddings that can significantly improve the retrieval results at a minimal extra computational cost. Finally, we note the importance of supervised learning when performing large-scale retrieval, and study several approaches that can significantly boost the results at no extra cost at query time.
	Address	Barcelona
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Ernest Valveny;Florent Perronnin
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ Gor2013			Serial	2277



	Author	Mario Hernandez; Joao Sanchez; Jordi Vitria
	Title	Selected papers from Iberian Conference on Pattern Recognition and Image Analysis			Type	Book Whole
	Year	2012	Publication	Pattern Recognition	Abbreviated Journal
	Volume	45	Issue	9	Pages	3047-3582
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	0031-3203	ISBN		Medium
	Area		Expedition		Conference
	Notes	OR;MV			Approved	no
	Call Number	Admin @ si @ HSV2012			Serial	2069



	Author	Shida Beigpour
	Title	Illumination and object reflectance modeling			Type	Book Whole
	Year	2013	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	More realistic and accurate models of the scene illumination and object reflectance can greatly improve the quality of many computer vision and computer graphics tasks. Using such models, a more profound knowledge about the interaction of light with object surfaces can be established, which proves crucial to a variety of computer vision applications. In the current work, we investigate the various existing approaches to illumination and reflectance modeling and analyze their shortcomings in capturing the complexity of real-world scenes. Based on this analysis we propose improvements to different aspects of reflectance and illumination estimation in order to more realistically model real-world scenes in the presence of complex lighting phenomena (i.e., multiple illuminants, interreflections and shadows). Moreover, we captured our own multi-illuminant dataset, which consists of complex scenes and illumination conditions both outdoors and in laboratory conditions. In addition, we investigate the use of synthetic data to facilitate the construction of datasets and improve the process of obtaining ground-truth information.
	Address	Barcelona
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Joost Van de Weijer;Ernest Valveny
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	CIC			Approved	no
	Call Number	Admin @ si @ Bei2013			Serial	2267



	Author	Francesco Ciompi
	Title	Multi-Class Learning for Vessel Characterization in Intravascular Ultrasound			Type	Book Whole
	Year	2012	Publication	PhD Thesis, Universitat de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	In this thesis we tackle the problem of automatic characterization of human coronary vessels in the Intravascular Ultrasound (IVUS) imaging modality. The basis for the whole characterization process is machine learning applied to multi-class problems. In all the presented approaches, the Error-Correcting Output Codes (ECOC) framework is used as the central element for the design of multi-class classifiers. Two main topics are tackled in this thesis. First, the automatic detection of the vessel borders is presented. For this purpose, a novel context-aware classifier for multi-class classification of the vessel morphology is presented, namely ECOC-DRF. Based on ECOC-DRF, the lumen border and the media-adventitia border in IVUS are robustly detected by means of a novel holistic approach, achieving an error comparable with inter-observer variability and with state-of-the-art methods. The two vessel borders define the atheroma area of the vessel. In this area, tissue characterization is required. For this purpose, we present a framework for automatic plaque characterization by processing both texture in IVUS images and spectral information in raw Radio Frequency data. Furthermore, a novel method for fusing in-vivo and in-vitro IVUS data for plaque characterization is presented, namely pSFFS. The method is shown to fuse data effectively, generating a classifier that improves the tissue characterization in both in-vitro and in-vivo datasets. A novel method for automatic video summarization in IVUS sequences is also presented. The method aims to detect the key frames of the sequence, i.e., the frames representative of morphological changes. This novel method represents the basis for video summarization in IVUS as well as the markers for the partition of the vessel into morphological and clinically interesting events. Finally, multi-class learning based on ECOC is applied to lung tissue characterization in Computed Tomography. The novel proposed approach, based on supervised and unsupervised learning, achieves accurate tissue classification on a large and heterogeneous dataset.
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Petia Radeva;Oriol Pujol
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	MILAB			Approved	no
	Call Number	Admin @ si @ Cio2012			Serial	2146