|   | 
Details
   web
Records
Author Miguel Oliveira; Victor Santos; Angel Sappa
Title (up) Multimodal Inverse Perspective Mapping Type Journal Article
Year 2015 Publication Information Fusion Abbreviated Journal IF
Volume 24 Issue Pages 108–121
Keywords Inverse perspective mapping; Multimodal sensor fusion; Intelligent vehicles
Abstract Over the past years, inverse perspective mapping has been successfully applied to several problems in the field of Intelligent Transportation Systems. In brief, the method consists of mapping images to a new coordinate system where perspective effects are removed. The removal of perspective associated effects facilitates road and obstacle detection and also assists in free space estimation. There is, however, a significant limitation in the inverse perspective mapping: the presence of obstacles on the road disrupts the effectiveness of the mapping. The current paper proposes a robust solution based on the use of multimodal sensor fusion. Data from a laser range finder is fused with images from the cameras, so that the mapping is not computed in the regions where obstacles are present. As shown in the results, this considerably improves the effectiveness of the algorithm and reduces computation time when compared with the classical inverse perspective mapping. Furthermore, the proposed approach is also able to cope with several cameras with different lenses or image resolutions, as well as dynamic viewpoints.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.055; 600.076 Approved no
Call Number Admin @ si @ OSS2015c Serial 2532
Permanent link to this record
 

 
Author Sergio Escalera; Eloi Puertas; Petia Radeva; Oriol Pujol
Title (up) Multimodal laughter recognition in video conversations Type Conference Article
Year 2009 Publication 2nd IEEE Workshop on CVPR for Human communicative Behavior analysis Abbreviated Journal
Volume Issue Pages 110–115
Keywords
Abstract Laughter detection is an important area of interest in the Affective Computing and Human-computer Interaction fields. In this paper, we propose a multi-modal methodology based on the fusion of audio and visual cues to deal with the laughter recognition problem in face-to-face conversations. The audio features are extracted from the spectogram and the video features are obtained estimating the mouth movement degree and using a smile and laughter classifier. Finally, the multi-modal cues are included in a sequential classifier. Results over videos from the public discussion blog of the New York Times show that both types of features perform better when considered together by the classifier. Moreover, the sequential methodology shows to significantly outperform the results obtained by an Adaboost classifier.
Address Miami (USA)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 2160-7508 ISBN 978-1-4244-3994-2 Medium
Area Expedition Conference CVPR
Notes MILAB;HuPBA Approved no
Call Number BCNPCL @ bcnpcl @ EPR2009c Serial 1188
Permanent link to this record
 

 
Author Marçal Rusiñol; Volkmar Frinken; Dimosthenis Karatzas; Andrew Bagdanov; Josep Llados
Title (up) Multimodal page classification in administrative document image streams Type Journal Article
Year 2014 Publication International Journal on Document Analysis and Recognition Abbreviated Journal IJDAR
Volume 17 Issue 4 Pages 331-341
Keywords Digital mail room; Multimodal page classification; Visual and textual document description
Abstract In this paper, we present a page classification application in a banking workflow. The proposed architecture represents administrative document images by merging visual and textual descriptions. The visual description is based on a hierarchical representation of the pixel intensity distribution. The textual description uses latent semantic analysis to represent document content as a mixture of topics. Several off-the-shelf classifiers and different strategies for combining visual and textual cues have been evaluated. A final step uses an n-gram model of the page stream allowing a finer-grained classification of pages. The proposed method has been tested in a real large-scale environment and we report results on a dataset of 70,000 pages.
Address
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1433-2833 ISBN Medium
Area Expedition Conference
Notes DAG; LAMP; 600.056; 600.061; 601.240; 601.223; 600.077; 600.079 Approved no
Call Number Admin @ si @ RFK2014 Serial 2523
Permanent link to this record
 

 
Author David Rotger
Title (up) Multimodal Registration of Intravascular Ultrasound Images and Angiography Type Miscellaneous
Year 2002 Publication Director: P. Radeva, Master Thesis. Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes Approved no
Call Number Admin @ si @ Rot2002 Serial 324
Permanent link to this record
 

 
Author David Rotger; Petia Radeva; E Fernandez-Nofrerias; J. Mauri
Title (up) Multimodal Registration of Intravascular Ultrasound Images and Angiography. Type Miscellaneous
Year 2002 Publication XX Congreso Anual de la Sociedad Española de Ingenieria Biomedica CASEIB 2002, 1: 137–140. Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Zaragoza, Espanya
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB Approved no
Call Number BCNPCL @ bcnpcl @ RRF2002b Serial 317
Permanent link to this record
 

 
Author Adam Fodor; Rachid R. Saboundji; Julio C. S. Jacques Junior; Sergio Escalera; David Gallardo Pujol; Andras Lorincz
Title (up) Multimodal Sentiment and Personality Perception Under Speech: A Comparison of Transformer-based Architectures Type Conference Article
Year 2022 Publication Understanding Social Behavior in Dyadic and Small Group Interactions Abbreviated Journal
Volume 173 Issue Pages 218-241
Keywords
Abstract Human-machine, human-robot interaction, and collaboration appear in diverse fields, from homecare to Cyber-Physical Systems. Technological development is fast, whereas real-time methods for social communication analysis that can measure small changes in sentiment and personality states, including visual, acoustic and language modalities are lagging, particularly when the goal is to build robust, appearance invariant, and fair methods. We study and compare methods capable of fusing modalities while satisfying real-time and invariant appearance conditions. We compare state-of-the-art transformer architectures in sentiment estimation and introduce them in the much less explored field of personality perception. We show that the architectures perform differently on automatic sentiment and personality perception, suggesting that each task may be better captured/modeled by a particular method. Our work calls attention to the attractive properties of the linear versions of the transformer architectures. In particular, we show that the best results are achieved by fusing the different architectures{’} preprocessing methods. However, they pose extreme conditions in computation power and energy consumption for real-time computations for quadratic transformers due to their memory requirements. In turn, linear transformers pave the way for quantifying small changes in sentiment estimation and personality perception for real-time social communications for machines and robots.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference PMLR
Notes HuPBA; no menciona Approved no
Call Number Admin @ si @ FSJ2022 Serial 3769
Permanent link to this record
 

 
Author Fernando Barrera
Title (up) Multimodal Stereo from Thermal Infrared and Visible Spectrum Type Book Whole
Year 2012 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Recent advances in thermal infrared imaging (LWIR) has allowed its use in applications beyond of the military domain. Nowadays, this new family of sensors is included in different technical and scientific applications. They offer features that facilitate tasks, such as detection of pedestrians, hot spots, differences in temperature, among others, which can significantly improve the performance of a system where the persons are expected to play the principal role. For instance, video surveillance applications, monitoring, and pedestrian detection.
During this dissertation the next question is stated: Could a couple of sensors measuring different bands of the electromagnetic spectrum, as the visible and thermal infrared, be used to extract depth information? Although it is a complex question, we shows that a system of these characteristics is possible as well as their advantages, drawbacks, and potential opportunities.
The matching and fusion of data coming from different sensors, as the emissions registered at visible and infrared bands, represents a special challenge, because it has been showed that theses signals are weak correlated. Therefore, many traditional techniques of image processing and computer vision are not helpful, requiring adjustments for their correct performance in every modality.
In this research an experimental study that compares different cost functions and matching approaches is performed, in order to build a multimodal stereovision system. Furthermore, the common problems in infrared/visible stereo, specially in the outdoor scenes are identified. Our framework summarizes the architecture of a generic stereo algorithm, at different levels: computational, functional, and structural, which can be extended toward high-level fusion (semantic) and high-order (prior).The proposed framework is intended to explore novel multimodal stereo matching approaches, going from sparse to dense representations (both disparity and depth maps). Moreover, context information is added in form of priors and assumptions. Finally, this dissertation shows a promissory way toward the integration of multiple sensors for recovering three-dimensional information.
Address
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Felipe Lumbreras;Angel Sappa
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS Approved no
Call Number Admin @ si @ Bar2012 Serial 2209
Permanent link to this record
 

 
Author Fernando Barrera; Felipe Lumbreras; Angel Sappa
Title (up) Multimodal Stereo Vision System: 3D Data Extraction and Algorithm Evaluation Type Journal Article
Year 2012 Publication IEEE Journal of Selected Topics in Signal Processing Abbreviated Journal J-STSP
Volume 6 Issue 5 Pages 437-446
Keywords
Abstract This paper proposes an imaging system for computing sparse depth maps from multispectral images. A special stereo head consisting of an infrared and a color camera defines the proposed multimodal acquisition system. The cameras are rigidly attached so that their image planes are parallel. Details about the calibration and image rectification procedure are provided. Sparse disparity maps are obtained by the combined use of mutual information enriched with gradient information. The proposed approach is evaluated using a Receiver Operating Characteristics curve. Furthermore, a multispectral dataset, color and infrared images, together with their corresponding ground truth disparity maps, is generated and used as a test bed. Experimental results in real outdoor scenarios are provided showing its viability and that the proposed approach is not restricted to a specific domain.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1932-4553 ISBN Medium
Area Expedition Conference
Notes ADAS Approved no
Call Number Admin @ si @ BLS2012b Serial 2155
Permanent link to this record
 

 
Author Fernando Barrera; Felipe Lumbreras; Angel Sappa
Title (up) Multimodal Template Matching based on Gradient and Mutual Information using Scale-Space Type Conference Article
Year 2010 Publication 17th IEEE International Conference on Image Processing Abbreviated Journal
Volume Issue Pages 2749–2752
Keywords
Abstract This paper presents the combined use of gradient and mutual information for infrared and intensity templates matching. We propose to joint: (i) feature matching in a multiresolution context and (ii) information propagation through scale-space representations. Our method consists in combining mutual information with a shape descriptor based on gradient, and propagate them following a coarse-to-fine strategy. The main contributions of this work are: to offer a theoretical formulation towards a multimodal stereo matching; to show that gradient and mutual information can be reinforced while they are propagated between consecutive levels; and to show that they are valid cost functions in multimodal template matchings. Comparisons are presented showing the improvements and viability of the proposed approach.
Address Hong-Kong
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1522-4880 ISBN 978-1-4244-7992-4 Medium
Area Expedition Conference ICIP
Notes ADAS Approved no
Call Number ADAS @ adas @ BLS2010 Serial 1358
Permanent link to this record
 

 
Author Marçal Rusiñol; Dimosthenis Karatzas; Andrew Bagdanov; Josep Llados
Title (up) Multipage Document Retrieval by Textual and Visual Representations Type Conference Article
Year 2012 Publication 21st International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 521-524
Keywords
Abstract In this paper we present a multipage administrative document image retrieval system based on textual and visual representations of document pages. Individual pages are represented by textual or visual information using a bag-of-words framework. Different fusion strategies are evaluated which allow the system to perform multipage document retrieval on the basis of a single page retrieval system. Results are reported on a large dataset of document images sampled from a banking workflow.
Address Tsukuba Science City, Japan
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1051-4651 ISBN 978-1-4673-2216-4 Medium
Area Expedition Conference ICPR
Notes DAG Approved no
Call Number Admin @ si @ RKB2012 Serial 2053
Permanent link to this record
 

 
Author David Roche; Debora Gil; Jesus Giraldo
Title (up) Multiple active receptor conformation, agonist efficacy and maximum effect of the system: the conformation-based operational model of agonism, Type Journal Article
Year 2013 Publication Drug Discovery Today Abbreviated Journal DDT
Volume 18 Issue 7-8 Pages 365-371
Keywords
Abstract The operational model of agonism assumes that the maximum effect a particular receptor system can achieve (the Em parameter) is fixed. Em estimates are above but close to the asymptotic maximum effects of endogenous agonists. The concept of Em is contradicted by superagonists and those positive allosteric modulators that significantly increase the maximum effect of endogenous agonists. An extension of the operational model is proposed that assumes that the Em parameter does not necessarily have a single value for a receptor system but has multiple values associated to multiple active receptor conformations. The model provides a mechanistic link between active receptor conformation and agonist efficacy, which can be useful for the analysis of agonist response under different receptor scenarios.
Address
Corporate Author Thesis
Publisher Elsevier Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes IAM; 600.057; 600.054 Approved no
Call Number IAM @ iam @ RGG2013a Serial 2190
Permanent link to this record
 

 
Author Ariel Amato
Title (up) Multiple Camera Calibration for Trajectories Tracking Type Report
Year 2007 Publication CVC Technical Report #112 Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address CVC (UAB)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ISE Approved no
Call Number Admin @ si @ Ama2007a Serial 824
Permanent link to this record
 

 
Author Jaume Gibert; Ernest Valveny; Oriol Ramos Terrades; Horst Bunke
Title (up) Multiple Classifiers for Graph of Words Embedding Type Conference Article
Year 2011 Publication 10th International Conference on Multiple Classifier Systems Abbreviated Journal
Volume 6713 Issue Pages 36-45
Keywords
Abstract During the last years, there has been an increasing interest in applying the multiple classifier framework to the domain of structural pattern recognition. Constructing base classifiers when the input patterns are graph based representations is not an easy problem. In this work, we make use of the graph embedding methodology in order to construct different feature vector representations for graphs. The graph of words embedding assigns a feature vector to every graph by counting unary and binary relations between node representatives and combining these pieces of information into a single vector. Selecting different node representatives leads to different vectorial representations and therefore to different base classifiers that can be combined. We experimentally show how this methodology significantly improves the classification of graphs with respect to single base classifiers.
Address Napoles, Italy
Corporate Author Thesis
Publisher Place of Publication Editor Carlo Sansone; Josef Kittler; Fabio Roli
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN 978-3-642-21556-8 Medium
Area Expedition Conference MCS
Notes DAG Approved no
Call Number Admin @ si @GVR2011 Serial 1745
Permanent link to this record
 

 
Author J.M. Sanchez; X. Binefa; J.R. Kender
Title (up) Multiple Feature Temporal Models for Object Detection in Video. Type Miscellaneous
Year 2002 Publication Proceeding of the International Conference on Multimedia and Expo ICME 2002 Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Lausanne
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes Approved no
Call Number Admin @ si @ SBK2002b Serial 299
Permanent link to this record
 

 
Author Jaume Amores; David Geronimo; Antonio Lopez
Title (up) Multiple instance and active learning for weakly-supervised object-class segmentation Type Conference Article
Year 2010 Publication 3rd IEEE International Conference on Machine Vision Abbreviated Journal
Volume Issue Pages
Keywords Multiple Instance Learning; Active Learning; Object-class segmentation.
Abstract In object-class segmentation, one of the most tedious tasks is to manually segment many object examples in order to learn a model of the object category. Yet, there has been little research on reducing the degree of manual annotation for
object-class segmentation. In this work we explore alternative strategies which do not require full manual segmentation of the object in the training set. In particular, we study the use of bounding boxes as a coarser and much cheaper form of segmentation and we perform a comparative study of several Multiple-Instance Learning techniques that allow to obtain a model with this type of weak annotation. We show that some of these methods can be competitive, when used with coarse
segmentations, with methods that require full manual segmentation of the objects. Furthermore, we show how to use active learning combined with this weakly supervised strategy.
As we see, this strategy permits to reduce the amount of annotation and optimize the number of examples that require full manual segmentation in the training set.
Address Hong-Kong
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICMV
Notes ADAS Approved no
Call Number ADAS @ adas @ AGL2010b Serial 1429
Permanent link to this record