Publicacions CVC -- Query Results

Records
Author	Miguel Oliveira; Victor Santos; Angel Sappa
Title	Multimodal Inverse Perspective Mapping			Type	Journal Article
Year	2015	Publication	Information Fusion	Abbreviated Journal	IF
Volume	24	Issue		Pages	108–121
Keywords	Inverse perspective mapping; Multimodal sensor fusion; Intelligent vehicles
Abstract	Over the past years, inverse perspective mapping has been successfully applied to several problems in the field of Intelligent Transportation Systems. In brief, the method consists of mapping images to a new coordinate system where perspective effects are removed. The removal of perspective associated effects facilitates road and obstacle detection and also assists in free space estimation. There is, however, a significant limitation in the inverse perspective mapping: the presence of obstacles on the road disrupts the effectiveness of the mapping. The current paper proposes a robust solution based on the use of multimodal sensor fusion. Data from a laser range finder is fused with images from the cameras, so that the mapping is not computed in the regions where obstacles are present. As shown in the results, this considerably improves the effectiveness of the algorithm and reduces computation time when compared with the classical inverse perspective mapping. Furthermore, the proposed approach is also able to cope with several cameras with different lenses or image resolutions, as well as dynamic viewpoints.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; 600.055; 600.076			Approved	no
Call Number	Admin @ si @ OSS2015c			Serial	2532
Permanent link to this record



Author	Sergio Escalera; Eloi Puertas; Petia Radeva; Oriol Pujol
Title	Multimodal laughter recognition in video conversations			Type	Conference Article
Year	2009	Publication	2nd IEEE Workshop on CVPR for Human communicative Behavior analysis	Abbreviated Journal
Volume		Issue		Pages	110–115
Keywords
Abstract	Laughter detection is an important area of interest in the Affective Computing and Human-computer Interaction fields. In this paper, we propose a multi-modal methodology based on the fusion of audio and visual cues to deal with the laughter recognition problem in face-to-face conversations. The audio features are extracted from the spectrogram, and the video features are obtained by estimating the degree of mouth movement and using a smile and laughter classifier. Finally, the multi-modal cues are included in a sequential classifier. Results over videos from the public discussion blog of the New York Times show that both types of features perform better when considered together by the classifier. Moreover, the sequential methodology is shown to significantly outperform the results obtained by an Adaboost classifier.
Address	Miami (USA)
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	2160-7508	ISBN	978-1-4244-3994-2	Medium
Area		Expedition		Conference	CVPR
Notes	MILAB;HuPBA			Approved	no
Call Number	BCNPCL @ bcnpcl @ EPR2009c			Serial	1188



Author	Marçal Rusiñol; Volkmar Frinken; Dimosthenis Karatzas; Andrew Bagdanov; Josep Llados
Title	Multimodal page classification in administrative document image streams			Type	Journal Article
Year	2014	Publication	International Journal on Document Analysis and Recognition	Abbreviated Journal	IJDAR
Volume	17	Issue	4	Pages	331-341
Keywords	Digital mail room; Multimodal page classification; Visual and textual document description
Abstract	In this paper, we present a page classification application in a banking workflow. The proposed architecture represents administrative document images by merging visual and textual descriptions. The visual description is based on a hierarchical representation of the pixel intensity distribution. The textual description uses latent semantic analysis to represent document content as a mixture of topics. Several off-the-shelf classifiers and different strategies for combining visual and textual cues have been evaluated. A final step uses an n-gram model of the page stream allowing a finer-grained classification of pages. The proposed method has been tested in a real large-scale environment and we report results on a dataset of 70,000 pages.
Address
Corporate Author				Thesis
Publisher	Springer Berlin Heidelberg	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1433-2833	ISBN		Medium
Area		Expedition		Conference
Notes	DAG; LAMP; 600.056; 600.061; 601.240; 601.223; 600.077; 600.079			Approved	no
Call Number	Admin @ si @ RFK2014			Serial	2523



Author	David Rotger
Title	Multimodal Registration of Intravascular Ultrasound Images and Angiography			Type	Miscellaneous
Year	2002	Publication	Director: P. Radeva, Master Thesis.	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes				Approved	no
Call Number	Admin @ si @ Rot2002			Serial	324



Author	David Rotger; Petia Radeva; E Fernandez-Nofrerias; J. Mauri
Title	Multimodal Registration of Intravascular Ultrasound Images and Angiography.			Type	Miscellaneous
Year	2002	Publication	XX Congreso Anual de la Sociedad Española de Ingenieria Biomedica CASEIB 2002, 1: 137–140.	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address	Zaragoza, Spain
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MILAB			Approved	no
Call Number	BCNPCL @ bcnpcl @ RRF2002b			Serial	317



Author	Adam Fodor; Rachid R. Saboundji; Julio C. S. Jacques Junior; Sergio Escalera; David Gallardo Pujol; Andras Lorincz
Title	Multimodal Sentiment and Personality Perception Under Speech: A Comparison of Transformer-based Architectures			Type	Conference Article
Year	2022	Publication	Understanding Social Behavior in Dyadic and Small Group Interactions	Abbreviated Journal
Volume	173	Issue		Pages	218-241
Keywords
Abstract	Human-machine, human-robot interaction, and collaboration appear in diverse fields, from homecare to Cyber-Physical Systems. Technological development is fast, whereas real-time methods for social communication analysis that can measure small changes in sentiment and personality states, including visual, acoustic and language modalities are lagging, particularly when the goal is to build robust, appearance invariant, and fair methods. We study and compare methods capable of fusing modalities while satisfying real-time and invariant appearance conditions. We compare state-of-the-art transformer architectures in sentiment estimation and introduce them in the much less explored field of personality perception. We show that the architectures perform differently on automatic sentiment and personality perception, suggesting that each task may be better captured/modeled by a particular method. Our work calls attention to the attractive properties of the linear versions of the transformer architectures. In particular, we show that the best results are achieved by fusing the different architectures' preprocessing methods. However, they pose extreme conditions in computation power and energy consumption for real-time computations for quadratic transformers due to their memory requirements. In turn, linear transformers pave the way for quantifying small changes in sentiment estimation and personality perception for real-time social communications for machines and robots.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	PMLR
Notes	HuPBA; no menciona			Approved	no
Call Number	Admin @ si @ FSJ2022			Serial	3769



Author	Fernando Barrera
Title	Multimodal Stereo from Thermal Infrared and Visible Spectrum			Type	Book Whole
Year	2012	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Recent advances in thermal infrared imaging (LWIR) have allowed its use in applications beyond the military domain. Nowadays, this new family of sensors is included in different technical and scientific applications. They offer features that facilitate tasks such as the detection of pedestrians, hot spots, and differences in temperature, among others, which can significantly improve the performance of a system where persons are expected to play the principal role, for instance video surveillance, monitoring, and pedestrian detection applications. This dissertation poses the following question: could a pair of sensors measuring different bands of the electromagnetic spectrum, such as the visible and thermal infrared, be used to extract depth information? Although it is a complex question, we show that a system of these characteristics is possible, and we discuss its advantages, drawbacks, and potential opportunities. The matching and fusion of data coming from different sensors, such as the emissions registered at visible and infrared bands, represents a special challenge, because it has been shown that these signals are weakly correlated. Therefore, many traditional image processing and computer vision techniques are not helpful and require adjustments to perform correctly in every modality. In this research, an experimental study that compares different cost functions and matching approaches is performed in order to build a multimodal stereovision system. Furthermore, the common problems in infrared/visible stereo, especially in outdoor scenes, are identified. Our framework summarizes the architecture of a generic stereo algorithm at different levels: computational, functional, and structural, which can be extended toward high-level fusion (semantic) and high-order (prior). The proposed framework is intended to explore novel multimodal stereo matching approaches, going from sparse to dense representations (both disparity and depth maps). Moreover, context information is added in the form of priors and assumptions. Finally, this dissertation shows a promising way toward the integration of multiple sensors for recovering three-dimensional information.
Address
Corporate Author				Thesis	Ph.D. thesis
Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Felipe Lumbreras;Angel Sappa
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS			Approved	no
Call Number	Admin @ si @ Bar2012			Serial	2209



Author	Fernando Barrera; Felipe Lumbreras; Angel Sappa
Title	Multimodal Stereo Vision System: 3D Data Extraction and Algorithm Evaluation			Type	Journal Article
Year	2012	Publication	IEEE Journal of Selected Topics in Signal Processing	Abbreviated Journal	J-STSP
Volume	6	Issue	5	Pages	437-446
Keywords
Abstract	This paper proposes an imaging system for computing sparse depth maps from multispectral images. A special stereo head consisting of an infrared and a color camera defines the proposed multimodal acquisition system. The cameras are rigidly attached so that their image planes are parallel. Details about the calibration and image rectification procedure are provided. Sparse disparity maps are obtained by the combined use of mutual information enriched with gradient information. The proposed approach is evaluated using a Receiver Operating Characteristics curve. Furthermore, a multispectral dataset, color and infrared images, together with their corresponding ground truth disparity maps, is generated and used as a test bed. Experimental results in real outdoor scenarios are provided showing its viability and that the proposed approach is not restricted to a specific domain.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1932-4553	ISBN		Medium
Area		Expedition		Conference
Notes	ADAS			Approved	no
Call Number	Admin @ si @ BLS2012b			Serial	2155



Author	Fernando Barrera; Felipe Lumbreras; Angel Sappa
Title	Multimodal Template Matching based on Gradient and Mutual Information using Scale-Space			Type	Conference Article
Year	2010	Publication	17th IEEE International Conference on Image Processing	Abbreviated Journal
Volume		Issue		Pages	2749–2752
Keywords
Abstract	This paper presents the combined use of gradient and mutual information for infrared and intensity template matching. We propose to combine: (i) feature matching in a multiresolution context and (ii) information propagation through scale-space representations. Our method consists in combining mutual information with a gradient-based shape descriptor and propagating them following a coarse-to-fine strategy. The main contributions of this work are: to offer a theoretical formulation towards multimodal stereo matching; to show that gradient and mutual information can be reinforced while they are propagated between consecutive levels; and to show that they are valid cost functions in multimodal template matching. Comparisons are presented showing the improvements and viability of the proposed approach.
Address	Hong-Kong
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1522-4880	ISBN	978-1-4244-7992-4	Medium
Area		Expedition		Conference	ICIP
Notes	ADAS			Approved	no
Call Number	ADAS @ adas @ BLS2010			Serial	1358



Author	Marçal Rusiñol; Dimosthenis Karatzas; Andrew Bagdanov; Josep Llados
Title	Multipage Document Retrieval by Textual and Visual Representations			Type	Conference Article
Year	2012	Publication	21st International Conference on Pattern Recognition	Abbreviated Journal
Volume		Issue		Pages	521-524
Keywords
Abstract	In this paper we present a multipage administrative document image retrieval system based on textual and visual representations of document pages. Individual pages are represented by textual or visual information using a bag-of-words framework. Different fusion strategies are evaluated which allow the system to perform multipage document retrieval on the basis of a single page retrieval system. Results are reported on a large dataset of document images sampled from a banking workflow.
Address	Tsukuba Science City, Japan
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1051-4651	ISBN	978-1-4673-2216-4	Medium
Area		Expedition		Conference	ICPR
Notes	DAG			Approved	no
Call Number	Admin @ si @ RKB2012			Serial	2053



Author	David Roche; Debora Gil; Jesus Giraldo
Title	Multiple active receptor conformation, agonist efficacy and maximum effect of the system: the conformation-based operational model of agonism			Type	Journal Article
Year	2013	Publication	Drug Discovery Today	Abbreviated Journal	DDT
Volume	18	Issue	7-8	Pages	365-371
Keywords
Abstract	The operational model of agonism assumes that the maximum effect a particular receptor system can achieve (the Em parameter) is fixed. Em estimates are above but close to the asymptotic maximum effects of endogenous agonists. The concept of Em is contradicted by superagonists and those positive allosteric modulators that significantly increase the maximum effect of endogenous agonists. An extension of the operational model is proposed that assumes that the Em parameter does not necessarily have a single value for a receptor system but has multiple values associated with multiple active receptor conformations. The model provides a mechanistic link between active receptor conformation and agonist efficacy, which can be useful for the analysis of agonist response under different receptor scenarios.
Address
Corporate Author				Thesis
Publisher	Elsevier	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	IAM; 600.057; 600.054			Approved	no
Call Number	IAM @ iam @ RGG2013a			Serial	2190



Author	Ariel Amato
Title	Multiple Camera Calibration for Trajectories Tracking			Type	Report
Year	2007	Publication	CVC Technical Report #112	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address	CVC (UAB)
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ISE			Approved	no
Call Number	Admin @ si @ Ama2007a			Serial	824



Author	Jaume Gibert; Ernest Valveny; Oriol Ramos Terrades; Horst Bunke
Title	Multiple Classifiers for Graph of Words Embedding			Type	Conference Article
Year	2011	Publication	10th International Conference on Multiple Classifier Systems	Abbreviated Journal
Volume	6713	Issue		Pages	36-45
Keywords
Abstract	In recent years, there has been an increasing interest in applying the multiple classifier framework to the domain of structural pattern recognition. Constructing base classifiers when the input patterns are graph-based representations is not an easy problem. In this work, we make use of the graph embedding methodology in order to construct different feature vector representations for graphs. The graph of words embedding assigns a feature vector to every graph by counting unary and binary relations between node representatives and combining these pieces of information into a single vector. Selecting different node representatives leads to different vectorial representations and therefore to different base classifiers that can be combined. We experimentally show how this methodology significantly improves the classification of graphs with respect to single base classifiers.
Address	Naples, Italy
Corporate Author				Thesis
Publisher		Place of Publication		Editor	Carlo Sansone; Josef Kittler; Fabio Roli
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN	978-3-642-21556-8	Medium
Area		Expedition		Conference	MCS
Notes	DAG			Approved	no
Call Number	Admin @ si @GVR2011			Serial	1745



Author	J.M. Sanchez; X. Binefa; J.R. Kender
Title	Multiple Feature Temporal Models for Object Detection in Video.			Type	Miscellaneous
Year	2002	Publication	Proceedings of the International Conference on Multimedia and Expo ICME 2002	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address	Lausanne
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes				Approved	no
Call Number	Admin @ si @ SBK2002b			Serial	299



Author	Jaume Amores; David Geronimo; Antonio Lopez
Title	Multiple instance and active learning for weakly-supervised object-class segmentation			Type	Conference Article
Year	2010	Publication	3rd IEEE International Conference on Machine Vision	Abbreviated Journal
Volume		Issue		Pages
Keywords	Multiple Instance Learning; Active Learning; Object-class segmentation.
Abstract	In object-class segmentation, one of the most tedious tasks is to manually segment many object examples in order to learn a model of the object category. Yet, there has been little research on reducing the degree of manual annotation for object-class segmentation. In this work we explore alternative strategies which do not require full manual segmentation of the object in the training set. In particular, we study the use of bounding boxes as a coarser and much cheaper form of segmentation, and we perform a comparative study of several Multiple-Instance Learning techniques that allow a model to be obtained from this type of weak annotation. We show that some of these methods, when used with coarse segmentations, can be competitive with methods that require full manual segmentation of the objects. Furthermore, we show how to use active learning combined with this weakly supervised strategy. As we show, this strategy makes it possible to reduce the amount of annotation and optimize the number of examples that require full manual segmentation in the training set.
Address	Hong-Kong
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICMV
Notes	ADAS			Approved	no
Call Number	ADAS @ adas @ AGL2010b			Serial	1429