Author Jose Luis Gomez
  Title Synth-to-real semi-supervised learning for visual tasks Type Book Whole
  Year 2023 Publication Going beyond Classification Problems for the Continual Learning of Deep Neural Networks Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The curse of data labeling is a costly bottleneck in supervised deep learning, where large amounts of labeled data are needed to train intelligent systems. In onboard perception for autonomous driving, this cost corresponds to the labeling of raw data from sensors such as cameras, LiDARs, RADARs, etc. Therefore, synthetic data with automatically generated ground truth (labels) has arisen as a reliable alternative for training onboard perception models.
However, synthetic data commonly suffers from synth-to-real domain shift, i.e., models trained on the synthetic domain do not show their achievable accuracy when performing in the real world. This shift needs to be addressed by techniques falling in the realm of domain adaptation (DA).
The semi-supervised learning (SSL) paradigm can be followed to address DA. In this case, a model is trained using source data with labels (here synthetic) and leverages minimal knowledge from target data (here the real world) to generate pseudo-labels. These pseudo-labels help the training process to reduce the gap between the source and the target domains. In general, we can assume access to both pseudo-labels and a small amount of human-provided labels for the target-domain data. However, the most interesting and challenging setting consists of assuming that we do not have human-provided labels at all. This setting is known as unsupervised domain adaptation (UDA). This PhD focuses on applying SSL to the UDA setting, for onboard visual tasks related to autonomous driving. We start by addressing the synth-to-real UDA problem on onboard vision-based object detection (pedestrians and cars), a critical task for autonomous driving and driving assistance. In particular, we propose to apply an SSL technique known as co-training, which we adapt to work with deep models that process a multi-modal input. The multi-modality consists of the visual appearance of the images (RGB) and their monocular depth estimation. The synthetic data we use as the source domain contains both object bounding boxes and depth information. This prior knowledge is the
starting point for the co-training technique, which iteratively labels unlabeled real-world data and uses such pseudo-labels (here bounding boxes with an assigned object class) to progressively improve the labeling results. Throughout this
process, two models collaborate to automatically label the images, in a way that one model compensates for the errors of the other, thus avoiding error drift. While this automatic labeling process is done offline, the resulting pseudo-labels can be used to train object detection models that must perform in real time onboard a vehicle. We show that multi-modal co-training improves the labeling results compared to single-modal co-training, remaining competitive compared to human labeling.
Given the success of co-training in the context of object detection, we have also adapted this technique to a more crucial and challenging visual task, namely, onboard semantic segmentation. In fact, providing labels for a single image
can take from 30 to 90 minutes for a human labeler, depending on the content of the image. Thus, developing automatic labeling techniques for this visual task is of great interest to the automotive industry. In particular, the new co-training framework addresses synth-to-real UDA by an initial stage of self-training. Intermediate models arising from this stage are used to start the co-training procedure, for which we have elaborated an accurate collaboration policy between the two models performing the automatic labeling. Moreover, our co-training seamlessly leverages datasets from different synthetic domains. In addition, the co-training procedure is agnostic to the loss function used to train the semantic segmentation models which perform the automatic labeling. We achieve state-of-the-art results on publicly available benchmark datasets, again, remaining competitive compared to human labeling.
Finally, on the ground of our previous experience, we have designed and implemented a new SSL technique for UDA in the context of visual semantic segmentation. In this case, we mimic the labeling methodology followed by human labelers. In particular, rather than labeling full images at a time, categories of semantic classes are defined and only those are labeled in a labeling pass. In fact, different human labelers can become specialists in labeling different categories. Afterward, these per-category-labeled layers are combined to provide fully labeled images. Our technique is inspired by this methodology since we perform synth-to-real UDA per category, using the self-training stage previously developed as part of our co-training framework. The pseudo-labels obtained for each category are finally
fused to obtain fully automatically labeled images. In this context, we have also contributed to the development of a new photo-realistic synthetic dataset based on path-tracing rendering. Our new SSL technique seamlessly leverages publicly available synthetic datasets as well as this new one to obtain state-of-the-art results on synth-to-real UDA for semantic segmentation. We show that the new dataset allows us to reach better labeling accuracy than previously existing datasets, while also complementing them well when combined. Moreover, we also show that the new human-inspired SSL technique outperforms co-training.
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher IMPRIMA Place of Publication Editor Antonio Lopez  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ Gom2023 Serial 3961  
Permanent link to this record
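
The record above describes a multi-modal co-training loop in which two detectors, one per modality (RGB appearance and monocular depth), alternately pseudo-label unlabeled real-world images and are retrained on each other's confident outputs. The following is only a minimal sketch of such a loop; the `Detector` interface (`fit`, `predict`), the confidence threshold, and the label-exchange policy are illustrative assumptions, not the thesis implementation.

```python
# Minimal co-training sketch (illustrative only; not the thesis implementation).
# Two detectors, each consuming a different modality (RGB vs. depth), exchange
# confident pseudo-labels over several cycles to adapt from synthetic to real data.

def co_train(det_rgb, det_depth, labeled_synth, unlabeled_real, cycles=5, thr=0.8):
    """det_*         : objects with fit(inputs, labels) and predict(x) -> [(box, cls, score)]
       labeled_synth : list of (rgb, depth, boxes) synthetic samples with ground truth
       unlabeled_real: list of (rgb, depth) real-world samples without labels"""
    # 1. Warm up both models on the labeled synthetic source domain.
    det_rgb.fit([s[0] for s in labeled_synth], [s[2] for s in labeled_synth])
    det_depth.fit([s[1] for s in labeled_synth], [s[2] for s in labeled_synth])

    pseudo_rgb, pseudo_depth = [], []
    for _ in range(cycles):
        pseudo_rgb.clear()
        pseudo_depth.clear()
        for rgb, depth in unlabeled_real:
            # 2. Each model labels data for the *other* one, so a model's errors
            #    are not fed straight back to itself (this limits error drift).
            from_depth = [d for d in det_depth.predict(depth) if d[2] >= thr]
            from_rgb = [d for d in det_rgb.predict(rgb) if d[2] >= thr]
            if from_depth:
                pseudo_rgb.append((rgb, [(b, c) for b, c, _ in from_depth]))
            if from_rgb:
                pseudo_depth.append((depth, [(b, c) for b, c, _ in from_rgb]))
        # 3. Retrain each model on the synthetic source data plus the pseudo-labels
        #    produced by its partner.
        det_rgb.fit([s[0] for s in labeled_synth] + [p[0] for p in pseudo_rgb],
                    [s[2] for s in labeled_synth] + [p[1] for p in pseudo_rgb])
        det_depth.fit([s[1] for s in labeled_synth] + [p[0] for p in pseudo_depth],
                      [s[2] for s in labeled_synth] + [p[1] for p in pseudo_depth])
    return pseudo_rgb, pseudo_depth
```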
 

 
Author Jose Luis Alba; A. Pujol; Juan J. Villanueva
  Title Novel SOM-PCA Network for Face Identification. Type Miscellaneous
  Year 2001 Publication Applications and Science of Neural Networks, Fuzzy Systems and Evolutionary Computation IV, SPIE's International Symposium on Optical Science and Technology. Abbreviated Journal
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes Approved no  
  Call Number ISE @ ise @ APV2001a Serial 69  
Permanent link to this record
 

 
Author Jose Luis Alba; A. Pujol; Juan J. Villanueva
  Title ST-SOM: A Shape+Texture Self Organizing Map. Type Miscellaneous
  Year 2001 Publication Proceedings of the IX Spanish Symposium on Pattern Recognition and Image Analysis, 1:55–60 Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes Approved no  
  Call Number ISE @ ise @ APV2001b Serial 70  
Permanent link to this record
 

 
Author Jose Luis Alba; A. Pujol; Juan J. Villanueva
  Title Separating Geometry from Texture to Improve Face Analysis. Type Miscellaneous
  Year 2001 Publication Proceedings of ICIP 2001, IEEE International Conference on Image Processing, 2:673–676 Abbreviated Journal
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address Greece
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes Approved no  
  Call Number ISE @ ise @ APV2001c Serial 71  
Permanent link to this record
 

 
Author Jose Garcia-Rodriguez; Isabelle Guyon; Sergio Escalera; Alexandra Psarrou; Andrew Lewis; Miguel Cazorla
  Title Editorial: Special Issue on Computational Intelligence for Vision and Robotics Type Journal Article
  Year 2017 Publication Neural Computing and Applications Abbreviated Journal Neural Computing and Applications  
  Volume 28 Issue 5 Pages 853–854  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA; MILAB; no mention Approved no
  Call Number Admin @ si @ GGE2017 Serial 2845  
Permanent link to this record
 

 
Author Jose Elias Yauri; M. Lagos; H. Vega-Huerta; P. de-la-Cruz; G.L.E Maquen-Niño; E. Condor-Tinoco
  Title Detection of Epileptic Seizures Based-on Channel Fusion and Transformer Network in EEG Recordings Type Journal Article
  Year 2023 Publication International Journal of Advanced Computer Science and Applications Abbreviated Journal IJACSA  
  Volume 14 Issue 5 Pages 1067-1074  
  Keywords Epilepsy; epilepsy detection; EEG; EEG channel fusion; convolutional neural network; self-attention  
  Abstract According to the World Health Organization, epilepsy affects more than 50 million people in the world, and 80% of them live in developing countries. Epilepsy has therefore become a major public health issue for many governments and deserves attention. Epilepsy is characterized by uncontrollable seizures due to a sudden abnormal functioning of the brain. Recurrent epileptic seizures change people's lives and interfere with their daily activities. Although epilepsy has no cure, it can be mitigated with appropriate diagnosis and medication. Usually, epilepsy diagnosis is based on the analysis of an electroencephalogram (EEG) of the patient. However, searching for seizure patterns in a multichannel EEG recording is a visually demanding and time-consuming task, even for experienced neurologists. Despite recent progress in automatic epilepsy recognition, the multichannel nature of EEG recordings still challenges current methods. In this work, a new method to detect epilepsy in multichannel EEG recordings is proposed. First, the method uses convolutions to perform channel fusion; next, a self-attention network extracts temporal features to classify between interictal and ictal epilepsy states. The method was validated on the public CHB-MIT dataset using k-fold cross-validation and achieved 99.74% specificity and 99.15% sensitivity, surpassing current approaches.
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes IAM Approved no  
  Call Number Admin @ si @ Serial 3856  
Permanent link to this record
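
The abstract above describes channel fusion via convolutions followed by a self-attention network that separates interictal from ictal EEG windows. The PyTorch sketch below is one plausible reading of that pipeline, not the published model; the 23-channel input (typical of CHB-MIT), the window length, and all layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class FusionTransformerEEG(nn.Module):
    """Illustrative sketch: 1-D convolutions fuse the EEG channels into one feature
    sequence, a transformer encoder models temporal structure, and a linear head
    classifies interictal (0) vs. ictal (1) windows."""
    def __init__(self, n_channels=23, d_model=64, n_heads=4, n_layers=2, n_classes=2):
        super().__init__()
        # Channel fusion: convolutions mix all channels of the raw window.
        self.fuse = nn.Sequential(
            nn.Conv1d(n_channels, d_model, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.Conv1d(d_model, d_model, kernel_size=7, stride=4, padding=3),
            nn.ReLU(),
        )
        enc_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                               batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):                     # x: (batch, n_channels, n_samples)
        z = self.fuse(x)                      # (batch, d_model, n_samples // 4)
        z = self.encoder(z.transpose(1, 2))   # (batch, seq_len, d_model)
        return self.head(z.mean(dim=1))       # average-pool over time, then classify

if __name__ == "__main__":
    model = FusionTransformerEEG()
    logits = model(torch.randn(8, 23, 512))   # 8 windows, 23 channels, 512 samples
    print(logits.shape)                       # torch.Size([8, 2])
```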
 

 
Author Jose Elias Yauri; Aura Hernandez-Sabate; Pau Folch; Debora Gil
  Title Mental Workload Detection Based on EEG Analysis Type Conference Article
  Year 2021 Publication Artificial Intelligence Research and Development. Proceedings of the 23rd International Conference of the Catalan Association for Artificial Intelligence. Abbreviated Journal
  Volume 339 Issue Pages 268-277  
  Keywords Cognitive states; Mental workload; EEG analysis; Neural Networks.  
  Abstract The study of mental workload is essential for human work efficiency, health, and the avoidance of accidents, since workload compromises both performance and awareness. Although workload has been widely studied using several physiological measures, minimising the sensor network as much as possible remains both a challenge and a requirement.
Electroencephalogram (EEG) signals have shown a high correlation to specific cognitive and mental states like workload. However, there is not enough evidence in the literature to validate how well models generalize to new subjects performing tasks with a workload similar to those used during model training.
In this paper we propose a binary neural network to classify EEG features across different mental workloads. Two workloads, low and medium, are induced using two variants of the N-Back Test. The proposed model was validated on a dataset collected from 16 subjects and showed a high level of generalization capability: the model reported an average recall of 81.81% in a leave-one-out subject evaluation.
 
  Address Virtual; October 20-22 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CCIA  
  Notes IAM; 600.139; 600.118; 600.145 Approved no  
  Call Number Admin @ si @ Serial 3723  
Permanent link to this record
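
The abstract above reports an average recall of 81.81% under a leave-one-out subject evaluation. The sketch below shows how such a leave-one-subject-out protocol can be wired up with scikit-learn; the feature matrix and the small MLP classifier are placeholders for illustration, not the paper's network or data.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import recall_score

def loso_recall(X, y, subjects):
    """X: (n_windows, n_features) EEG features; y: binary workload labels;
    subjects: subject id per window. Returns one recall score per held-out subject."""
    scores = []
    for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=subjects):
        clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500)
        clf.fit(X[train_idx], y[train_idx])
        scores.append(recall_score(y[test_idx], clf.predict(X[test_idx])))
    return scores

# Toy run with random data standing in for the 16-subject dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(320, 40))
y = rng.integers(0, 2, size=320)
subjects = np.repeat(np.arange(16), 20)        # 20 windows per subject
print(np.mean(loso_recall(X, y, subjects)))    # average recall across subjects
```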
 

 
Author Jose Elias Yauri
  Title Deep Learning Based Data Fusion Approaches for the Assessment of Cognitive States on EEG Signals Type Book Whole
  Year 2023 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract For millennia, the study of the brain-mind relationship has fascinated humanity in its quest to understand the complex nature of cognitive states. A cognitive state is the state of the mind at a specific time and involves cognitive activities to acquire and process information for making a decision, solving a problem, or achieving a goal.
While normal cognitive states assist in the successful accomplishment of tasks, abnormal states of the mind can lead to task failures due to a reduced cognitive capability. In this thesis, we focus on the assessment of cognitive states by means of the analysis of ElectroEncephaloGram (EEG) signals using deep learning methods. EEG records the electrical activity of the brain using a set of electrodes placed on the scalp that output a set of spatiotemporal signals expected to be correlated with a specific mental process.
From the point of view of artificial intelligence, any method for the assessment of cognitive states using EEG signals as input should face several challenges. On the one hand, one should determine which is the most suitable approach for the optimal combination of the multiple signals recorded by EEG electrodes. On the other hand, one should have a protocol for the collection of good quality unambiguous annotated data, and an experimental design for the assessment of the generalization and transfer of models. In order to tackle them, first, we propose several convolutional neural architectures to perform data fusion of the signals recorded by EEG electrodes, at raw signal and feature levels. Four channel fusion methods, easy to incorporate into any neural network architecture, are proposed and assessed. Second, we present a method to create an unambiguous dataset for the prediction of cognitive mental workload using serious games and an Airbus-320 flight simulator. Third, we present a validation protocol that takes into account the levels of generalization of models based on the source and amount of test data.
Finally, the approaches for the assessment of cognitive states are applied to two use cases of high social impact: the assessment of mental workload for personalized support systems in the cockpit and the detection of epileptic seizures. The results obtained from the first use case show the feasibility of task transfer of models trained to detect workload in serious games to real flight scenarios. The results from the second use case show the generalization capability of our EEG channel fusion methods at k-fold cross-validation, patient-specific, and population levels.
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher IMPRIMA Place of Publication Editor Aura Hernandez;Debora Gil  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes IAM Approved no  
  Call Number Admin @ si @ Yau2023 Serial 3962  
Permanent link to this record
 

 
Author Jose Carlos Rubio; Joan Serrat; Antonio Lopez; N. Paragios
  Title Image Contextual Representation and Matching through Hierarchies and Higher Order Graphs Type Conference Article
  Year 2012 Publication 21st International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 2664 - 2667  
  Keywords  
  Abstract We present a region matching algorithm which establishes correspondences between regions from two segmented images. An abstract graph-based representation encodes the image as a hierarchical graph, exploiting the scene properties at two levels. First, the similarity and spatial consistency of the image semantic objects is encoded in a graph of commute times. Second, the cluttered regions of the semantic objects are represented with a shape descriptor. Many-to-many matching of regions is especially challenging due to the instability of the segmentation under slight image changes, and we explicitly handle it through high-order potentials. We demonstrate the matching approach applied to images of world-famous buildings, captured under different conditions, showing the robustness of our method to large variations in illumination and viewpoint.
  Address Tsukuba Science City, Japan  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1051-4651 ISBN 978-1-4673-2216-4 Medium  
  Area Expedition Conference ICPR  
  Notes ADAS Approved no  
  Call Number Admin @ si @ RSL2012a; Serial 2032  
Permanent link to this record
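
The abstract above encodes the similarity and spatial consistency of semantic objects in a graph of commute times. A standard way to compute commute times from a weighted adjacency matrix, via the pseudoinverse of the graph Laplacian, is sketched below; it shows only the generic formula, not the paper's graph construction or matching pipeline.

```python
import numpy as np

def commute_times(W):
    """W: symmetric non-negative adjacency matrix (n x n). Returns the n x n matrix
    of commute times C[i, j] = vol(G) * (L+[i, i] + L+[j, j] - 2 * L+[i, j]),
    where L+ is the Moore-Penrose pseudoinverse of the graph Laplacian."""
    d = W.sum(axis=1)                 # node degrees
    L = np.diag(d) - W                # unnormalized graph Laplacian
    Lp = np.linalg.pinv(L)            # pseudoinverse handles the zero eigenvalue
    diag = np.diag(Lp)
    return d.sum() * (diag[:, None] + diag[None, :] - 2 * Lp)

# Small check on a 3-node path graph: the commute time between the two end
# nodes is vol(G) * effective resistance = 4 * 2 = 8.
W = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
print(commute_times(W))
```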
 

 
Author Jose Carlos Rubio; Joan Serrat; Antonio Lopez; Daniel Ponsa
  Title Multiple-target tracking for the intelligent headlights control Type Conference Article
  Year 2010 Publication 13th Annual International Conference on Intelligent Transportation Systems Abbreviated Journal  
  Volume Issue Pages 903–910  
  Keywords Intelligent Headlights  
  Abstract Intelligent vehicle lighting systems aim at automatically regulating the headlights' beam to illuminate as much of the road ahead as possible while avoiding dazzling other drivers. A key component of such a system is computer vision software that is able to distinguish blobs due to vehicles' headlights and rear lights from those due to road lamps and reflective elements such as poles and traffic signs. In a previous work, we have devised a set of specialized supervised classifiers to make such decisions based on blob features related to their intensity and shape. Despite the overall good performance, there remain challenging cases that have yet to be solved: notably, faint and tiny blobs corresponding to quite distant vehicles. In fact, for such distant blobs, classification decisions can be taken after observing them during a few frames. Hence, incorporating tracking could improve the overall lighting system performance by enforcing the temporal consistency of the classifier decision. Accordingly, this paper focuses on the problem of constructing blob tracks, which is actually one of multiple-target tracking (MTT), but under two special conditions: we have to deal with frequent occlusions, as well as blob splits and merges. We approach it in a novel way by formulating the problem as a maximum a posteriori inference on a Markov random field. The qualitative (in video form) and quantitative evaluation of our new MTT method shows good tracking results. In addition, we will also see that the classification performance of the problematic blobs improves due to the proposed MTT algorithm.
 
  Address Madeira Island (Portugal)  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ITSC  
  Notes ADAS Approved no  
  Call Number ADAS @ adas @ RSL2010 Serial 1422  
Permanent link to this record
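
The paper above formulates blob tracking as maximum a posteriori inference on a Markov random field in order to handle occlusions, splits and merges. To make the underlying data-association setting concrete, the sketch below shows a much simpler frame-to-frame assignment baseline using the Hungarian algorithm; it is not the paper's MRF formulation, and the distance gate is an arbitrary assumption.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(tracks, detections, max_dist=30.0):
    """Simple frame-to-frame association of blob centroids (illustrative baseline).
    tracks, detections: (n, 2) arrays of (x, y) positions in consecutive frames.
    Returns (matches, unmatched_track_ids, unmatched_detection_ids)."""
    if len(tracks) == 0 or len(detections) == 0:
        return [], list(range(len(tracks))), list(range(len(detections)))
    # Pairwise Euclidean distances between existing tracks and new detections.
    cost = np.linalg.norm(tracks[:, None, :] - detections[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)          # optimal one-to-one assignment
    matches = [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_dist]
    matched_r = {r for r, _ in matches}
    matched_c = {c for _, c in matches}
    unmatched_tracks = [r for r in range(len(tracks)) if r not in matched_r]
    unmatched_dets = [c for c in range(len(detections)) if c not in matched_c]
    return matches, unmatched_tracks, unmatched_dets

# Two tracked blobs and three detections in the next frame (one far-away newcomer).
tracks = np.array([[100.0, 50.0], [200.0, 80.0]])
dets = np.array([[102.0, 51.0], [198.0, 79.0], [400.0, 10.0]])
print(associate(tracks, dets))   # ([(0, 0), (1, 1)], [], [2])
```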
 

 
Author Jose Carlos Rubio; Joan Serrat; Antonio Lopez
  Title Unsupervised co-segmentation through region matching Type Conference Article
  Year 2012 Publication 25th IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 749-756  
  Keywords  
  Abstract Co-segmentation is defined as jointly partitioning multiple images depicting the same or similar object, into foreground and background. Our method consists of a multiple-scale multiple-image generative model, which jointly estimates the foreground and background appearance distributions from several images, in an unsupervised manner. In contrast to other co-segmentation methods, our approach does not require the images to have similar foregrounds and different backgrounds to function properly. Region matching is applied to exploit inter-image information by establishing correspondences between the common objects that appear in the scene. Moreover, computing many-to-many associations of regions allows further applications, like recognition of object parts across images. We report results on iCoseg, a challenging dataset that presents extreme variability in camera viewpoint, illumination and object deformations and poses. We also show that our method is robust against large intra-class variability in the MSRC database.
  Address Providence, Rhode Island  
  Corporate Author Thesis  
  Publisher IEEE Xplore Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1063-6919 ISBN 978-1-4673-1226-4 Medium  
  Area Expedition Conference CVPR  
  Notes ADAS Approved no  
  Call Number Admin @ si @ RSL2012b; ADAS @ adas @ Serial 2033  
Permanent link to this record
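
The abstract above describes jointly estimating foreground and background appearance distributions from several images. As a rough illustration of that idea only, and not the paper's multiple-scale generative model or region matching, the sketch below pools the pixels of all input images, fits a two-component Gaussian mixture, and calls the smaller component "foreground".

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def naive_cosegment(images):
    """Very naive co-segmentation baseline: jointly model the RGB pixels of all
    images with a 2-component Gaussian mixture and treat the smaller component as
    the common foreground. images: list of (H, W, 3) float arrays in [0, 1]."""
    pixels = np.concatenate([im.reshape(-1, 3) for im in images], axis=0)
    gmm = GaussianMixture(n_components=2, random_state=0).fit(pixels)
    labels = gmm.predict(pixels)
    fg = np.argmin(np.bincount(labels, minlength=2))   # smaller cluster = foreground
    masks, start = [], 0
    for im in images:
        n = im.shape[0] * im.shape[1]
        masks.append((labels[start:start + n] == fg).reshape(im.shape[:2]))
        start += n
    return masks

# Toy example: two images sharing a bright square over different dark backgrounds.
imgs = [np.zeros((32, 32, 3)), np.full((32, 32, 3), 0.1)]
for im in imgs:
    im[8:20, 8:20] = 0.9
print([int(m.sum()) for m in naive_cosegment(imgs)])   # ~144 foreground pixels each
```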
 

 
Author Jose Carlos Rubio; Joan Serrat; Antonio Lopez
  Title Multiple target tracking and identity linking under split, merge and occlusion of targets and observations Type Conference Article
  Year 2012 Publication 1st International Conference on Pattern Recognition Applications and Methods Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address Algarve, Portugal  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPRAM  
  Notes ADAS Approved no  
  Call Number Admin @ si @ RSL2012c; ADAS @ adas Serial 2034  
Permanent link to this record
 

 
Author Jose Carlos Rubio; Joan Serrat; Antonio Lopez
  Title Video Co-segmentation Type Conference Article
  Year 2012 Publication 11th Asian Conference on Computer Vision Abbreviated Journal  
  Volume 7725 Issue Pages 13-24  
  Keywords  
  Abstract Segmentation of a single image is in general a highly underconstrained problem. A frequent approach to solve it is to somehow provide prior knowledge or constraints on what the objects of interest look like (in terms of their shape, size, color, location or structure). Image co-segmentation trades the need for such knowledge for something much easier to obtain, namely, additional images showing the object from other viewpoints. Now the segmentation problem is posed as one of differentiating the similar object regions in all the images from the more varying background. In this paper, for the first time, we extend this approach to video segmentation: given two or more video sequences showing the same object (or objects belonging to the same class) moving in a similar manner, we aim to outline its region in all the frames. In addition, the method works in an unsupervised manner, by learning to segment at testing time. We compare favorably with two state-of-the-art methods on video segmentation and report results on benchmark videos.
  Address Daejeon, Korea  
  Corporate Author Thesis  
  Publisher Springer Berlin Heidelberg Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN 0302-9743 ISBN 978-3-642-37443-2 Medium  
  Area Expedition Conference ACCV  
  Notes ADAS Approved no  
  Call Number Admin @ si @ RSL2012d Serial 2153  
Permanent link to this record
 

 
Author Jose Carlos Rubio; Joan Serrat; Antonio Lopez; Daniel Ponsa
  Title Multiple target tracking for intelligent headlights control Type Journal Article
  Year 2012 Publication IEEE Transactions on Intelligent Transportation Systems Abbreviated Journal TITS  
  Volume 13 Issue 2 Pages 594-605  
  Keywords Intelligent Headlights  
  Abstract Intelligent vehicle lighting systems aim at automatically regulating the headlights' beam to illuminate as much of the road ahead as possible while avoiding dazzling other drivers. A key component of such a system is computer vision software that is able to distinguish blobs due to vehicles' headlights and rear lights from those due to road lamps and reflective elements such as poles and traffic signs. In a previous work, we have devised a set of specialized supervised classifiers to make such decisions based on blob features related to their intensity and shape. Despite the overall good performance, there remain challenging cases that have yet to be solved: notably, faint and tiny blobs corresponding to quite distant vehicles. In fact, for such distant blobs, classification decisions can be taken after observing them during a few frames. Hence, incorporating tracking could improve the overall lighting system performance by enforcing the temporal consistency of the classifier decision. Accordingly, this paper focuses on the problem of constructing blob tracks, which is actually one of multiple-target tracking (MTT), but under two special conditions: we have to deal with frequent occlusions, as well as blob splits and merges. We approach it in a novel way by formulating the problem as a maximum a posteriori inference on a Markov random field. The qualitative (in video form) and quantitative evaluation of our new MTT method shows good tracking results. In addition, we will also see that the classification performance of the problematic blobs improves due to the proposed MTT algorithm.
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1524-9050 ISBN Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ RLP2012; ADAS @ adas @ rsl2012g Serial 1877  
Permanent link to this record
 

 
Author Jose Carlos Rubio
  Title Many-to-Many High Order Matching. Applications to Tracking and Object Segmentation Type Book Whole
  Year 2012 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Feature matching is a fundamental problem in Computer Vision, having multiple applications such as tracking, image classification and retrieval, shape recognition and stereo fusion. In numerous domains, it is useful to represent the local structure of the matching features to increase the matching accuracy or to make the correspondence invariant to certain transformations (affine, homography, etc.). However, encoding this knowledge requires complicating the model by establishing high-order relationships between the model elements, and therefore increasing the complexity of the optimization problem.

The importance of many-to-many matching is sometimes dismissed in the literature. Most methods are restricted to performing one-to-one matching and are usually validated on synthetic or non-realistic datasets. In a real challenging environment, with scale, pose and illumination variations of the object of interest, as well as the presence of occlusions, clutter, and noisy observations, many-to-many matching is necessary to achieve satisfactory results. As a consequence, finding the most likely many-to-many correspondence often involves a challenging combinatorial optimization process.

In this work, we design and demonstrate matching algorithms that compute many-to-many correspondences, applied to several challenging problems. Our goal is to make use of high-order representations to improve the expressive power of the matching, at the same time that we make feasible the process of inference or optimization of such models. We effectively use graphical models as our preferred representation because they provide an elegant probabilistic framework to tackle structured prediction problems.

We introduce a matching-based tracking algorithm which performs matching between frames of a video sequence in order to solve the difficult problem of headlight tracking at night-time. We also generalise this algorithm to solve the problem of data association applied to various tracking scenarios. We demonstrate the effectiveness of such approach in real video sequences and we show that our tracking algorithm can be used to improve the accuracy of a headlight classification system.

In the second part of this work, we move from single (point) matching to dense (region) matching and we introduce a new hierarchical image representation. We make use of such model to develop a high-order many-to-many matching between pairs of images. We show that the use of high-order models in comparison to simpler models improves not only the accuracy of the results, but also the convergence speed of the inference algorithm.

Finally, we keep exploiting the idea of region matching to design a fully unsupervised image co-segmentation algorithm that performs competitively with state-of-the-art supervised methods. Our method also overcomes typical drawbacks of past works, such as the need for varied appearances in the image backgrounds. The region matching in this case is applied to effectively exploit inter-image information. We also extend this work to perform co-segmentation of videos, this being the first time that such a problem is addressed, as a way to perform video object segmentation.
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Joan Serrat  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ Rub2012 Serial 2206  
Permanent link to this record