|   | 
Details
   web
Records
Author Sergio Escalera; Alicia Fornes; Oriol Pujol; Josep Llados; Petia Radeva
Title (up) Multi-class Binary Object Categorization using Blurred Shape Models Type Conference Article
Year 2007 Publication Progress in Pattern Recognition, Image Analysis and Applications, 12th Iberoamerican Congress on Pattern Abbreviated Journal
Volume 4756 Issue Pages 773–782
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LCNS
Series Volume Series Issue Edition
ISSN ISBN 978-3-540-76724-4 Medium
Area Expedition Conference CIARP
Notes MILAB; DAG;HuPBA Approved no
Call Number BCNPCL @ bcnpcl @ EFP2007 Serial 911
Permanent link to this record
 

 
Author Sergio Escalera; Alicia Fornes; Oriol Pujol; Petia Radeva
Title (up) Multi-class Binary Symbol Classification with Circular Blurred Shape Models Type Conference Article
Year 2009 Publication 15th International Conference on Image Analysis and Processing Abbreviated Journal
Volume 5716 Issue Pages 1005–1014
Keywords
Abstract Multi-class binary symbol classification requires the use of rich descriptors and robust classifiers. Shape representation is a difficult task because of several symbol distortions, such as occlusions, elastic deformations, gaps or noise. In this paper, we present the Circular Blurred Shape Model descriptor. This descriptor encodes the arrangement information of object parts in a correlogram structure. A prior blurring degree defines the level of distortion allowed to the symbol. Moreover, we learn the new feature space using a set of Adaboost classifiers, which are combined in the Error-Correcting Output Codes framework to deal with the multi-class categorization problem. The presented work has been validated over different multi-class data sets, and compared to the state-of-the-art descriptors, showing significant performance improvements.
Address Salerno, Italy
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-04145-7 Medium
Area Expedition Conference ICIAP
Notes MILAB;HuPBA;DAG Approved no
Call Number BCNPCL @ bcnpcl @ EFP2009c Serial 1186
Permanent link to this record
 

 
Author Sergio Escalera; David M.J. Tax; Oriol Pujol; Petia Radeva; Robert P.W. Duin
Title (up) Multi-Class Classification in Image Analysis Via Error-Correcting Output Codes Type Book Chapter
Year 2011 Publication Innovations in Intelligent Image Analysis Abbreviated Journal
Volume 339 Issue Pages 7-29
Keywords
Abstract A common way to model multi-class classification problems is by means of Error-Correcting Output Codes (ECOC). Given a multi-class problem, the ECOC technique designs a codeword for each class, where each position of the code identifies the membership of the class for a given binary problem.A classification decision is obtained by assigning the label of the class with the closest code. In this paper, we overview the state-of-the-art on ECOC designs and test them in real applications. Results on different multi-class data sets show the benefits of using the ensemble of classifiers when categorizing objects in images.
Address
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Berlin Editor H. Kawasnicka; L.Jain
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1860-949X ISBN 978-3-642-17933-4 Medium
Area Expedition Conference
Notes MILAB;HuPBA Approved no
Call Number Admin @ si @ ETP2011 Serial 1746
Permanent link to this record
 

 
Author Francesco Ciompi
Title (up) Multi-Class Learning for Vessel Characterization in Intravascular Ultrasound Type Book Whole
Year 2012 Publication PhD Thesis, Universitat de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract In this thesis we tackle the problem of automatic characterization of human coronary vessel in Intravascular Ultrasound (IVUS) image modality. The basis for the whole characterization process is machine learning applied to multi-class problems. In all the presented approaches, the Error-Correcting Output Codes (ECOC) framework is used as central element for the design of multi-class classifiers.
Two main topics are tackled in this thesis. First, the automatic detection of the vessel borders is presented. For this purpose, a novel context-aware classifier for multi-class classification of the vessel morphology is presented, namely ECOC-DRF. Based on ECOC-DRF, the lumen border and the media-adventitia border in IVUS are robustly detected by means of a novel holistic approach, achieving an error comparable with inter-observer variability and with state of the art methods.
The two vessel borders define the atheroma area of the vessel. In this area, tissue characterization is required. For this purpose, we present a framework for automatic plaque characterization by processing both texture in IVUS images and spectral information in raw Radio Frequency data. Furthermore, a novel method for fusing in-vivo and in-vitro IVUS data for plaque characterization is presented, namely pSFFS. The method demonstrates to effectively fuse data generating a classifier that improves the tissue characterization in both in-vitro and in-vivo datasets.
A novel method for automatic video summarization in IVUS sequences is also presented. The method aims to detect the key frames of the sequence, i.e., the frames representative of morphological changes. This novel method represents the basis for video summarization in IVUS as well as the markers for the partition of the vessel into morphological and clinically interesting events.
Finally, multi-class learning based on ECOC is applied to lung tissue characterization in Computed Tomography. The novel proposed approach, based on supervised and unsupervised learning, achieves accurate tissue classification on a large and heterogeneous dataset.
Address
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Petia Radeva;Oriol Pujol
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB Approved no
Call Number Admin @ si @ Cio2012 Serial 2146
Permanent link to this record
 

 
Author Eloi Puertas; Sergio Escalera; Oriol Pujol
Title (up) Multi-Class Multi-Scale Stacked Sequential Learning Type Conference Article
Year 2011 Publication 10th International Conference on Multiple Classifier Systems Abbreviated Journal
Volume 6713 Issue Pages 197-206
Keywords
Abstract
Address Napoles, Italy
Corporate Author Thesis
Publisher Springer Place of Publication Editor Carlo Sansone; Josef Kittler; Fabio Roli
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference MCS
Notes HuPBA;MILAB Approved no
Call Number Admin @ si @ PEP2011b Serial 1772
Permanent link to this record
 

 
Author Juan Jose Rubio; Takahiro Kashiwa; Teera Laiteerapong; Wenlong Deng; Kohei Nagai; Sergio Escalera; Kotaro Nakayama; Yutaka Matsuo; Helmut Prendinger
Title (up) Multi-class structural damage segmentation using fully convolutional networks Type Journal Article
Year 2019 Publication Computers in Industry Abbreviated Journal COMPUTIND
Volume 112 Issue Pages 103121
Keywords Bridge damage detection; Deep learning; Semantic segmentation
Abstract Structural Health Monitoring (SHM) has benefited from computer vision and more recently, Deep Learning approaches, to accurately estimate the state of deterioration of infrastructure. In our work, we test Fully Convolutional Networks (FCNs) with a dataset of deck areas of bridges for damage segmentation. We create a dataset for delamination and rebar exposure that has been collected from inspection records of bridges in Niigata Prefecture, Japan. The dataset consists of 734 images with three labels per image, which makes it the largest dataset of images of bridge deck damage. This data allows us to estimate the performance of our method based on regions of agreement, which emulates the uncertainty of in-field inspections. We demonstrate the practicality of FCNs to perform automated semantic segmentation of surface damages. Our model achieves a mean accuracy of 89.7% for delamination and 78.4% for rebar exposure, and a weighted F1 score of 81.9%.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no proj Approved no
Call Number Admin @ si @ RKL2019 Serial 3315
Permanent link to this record
 

 
Author Maedeh Aghaei; Mariella Dimiccoli; Petia Radeva
Title (up) Multi-face tracking by extended bag-of-tracklets in egocentric photo-streams Type Journal Article
Year 2016 Publication Computer Vision and Image Understanding Abbreviated Journal CVIU
Volume 149 Issue Pages 146-156
Keywords
Abstract Wearable cameras offer a hands-free way to record egocentric images of daily experiences, where social events are of special interest. The first step towards detection of social events is to track the appearance of multiple persons involved in them. In this paper, we propose a novel method to find correspondences of multiple faces in low temporal resolution egocentric videos acquired through a wearable camera. This kind of photo-stream imposes additional challenges to the multi-tracking problem with respect to conventional videos. Due to the free motion of the camera and to its low temporal resolution, abrupt changes in the field of view, in illumination condition and in the target location are highly frequent. To overcome such difficulties, we propose a multi-face tracking method that generates a set of tracklets through finding correspondences along the whole sequence for each detected face and takes advantage of the tracklets redundancy to deal with unreliable ones. Similar tracklets are grouped into the so called extended bag-of-tracklets (eBoT), which is aimed to correspond to a specific person. Finally, a prototype tracklet is extracted for each eBoT, where the occurred occlusions are estimated by relying on a new measure of confidence. We validated our approach over an extensive dataset of egocentric photo-streams and compared it to state of the art methods, demonstrating its effectiveness and robustness.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB; Approved no
Call Number Admin @ si @ ADR2016b Serial 2742
Permanent link to this record
 

 
Author Maedeh Aghaei; Mariella Dimiccoli; Petia Radeva
Title (up) Multi-Face Tracking by Extended Bag-of-Tracklets in Egocentric Videos Type Miscellaneous
Year 2015 Publication Arxiv Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Egocentric images offer a hands-free way to record daily experiences and special events, where social interactions are of special interest. A natural question that arises is how to extract and track the appearance of multiple persons in a social event captured by a wearable camera. In this paper, we propose a novel method to find correspondences of multiple-faces in low temporal resolution egocentric sequences acquired through a wearable camera. This kind of sequences imposes additional challenges to the multitracking problem with respect to conventional videos. Due to the free motion of the camera and to its low temporal resolution (2 fpm), abrupt changes in the field of view, in illumination conditions and in the target location are very frequent. To overcome such a difficulty, we propose to generate, for each detected face, a set of correspondences along the whole sequence that we call tracklet and to take advantage of their redundancy to deal with both false positive face detections and unreliable tracklets. Similar tracklets are grouped into the so called extended bag-of-tracklets (eBoT), which are aimed to correspond to specific persons. Finally, a prototype tracklet is extracted for each eBoT. We validated our method over a dataset of 18.000 images from 38 egocentric sequences with 52 trackable persons and compared to the state-of-the-art methods, demonstrating its effectiveness and robustness.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB Approved no
Call Number Admin @ si @ ADR2015b Serial 2713
Permanent link to this record
 

 
Author Shida Beigpour; Christian Riess; Joost Van de Weijer; Elli Angelopoulou
Title (up) Multi-Illuminant Estimation with Conditional Random Fields Type Journal Article
Year 2014 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP
Volume 23 Issue 1 Pages 83-95
Keywords color constancy; CRF; multi-illuminant
Abstract Most existing color constancy algorithms assume uniform illumination. However, in real-world scenes, this is not often the case. Thus, we propose a novel framework for estimating the colors of multiple illuminants and their spatial distribution in the scene. We formulate this problem as an energy minimization task within a conditional random field over a set of local illuminant estimates. In order to quantitatively evaluate the proposed method, we created a novel data set of two-dominant-illuminant images comprised of laboratory, indoor, and outdoor scenes. Unlike prior work, our database includes accurate pixel-wise ground truth illuminant information. The performance of our method is evaluated on multiple data sets. Experimental results show that our framework clearly outperforms single illuminant estimators as well as a recently proposed multi-illuminant estimation approach.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1057-7149 ISBN Medium
Area Expedition Conference
Notes CIC; LAMP; 600.074; 600.079 Approved no
Call Number Admin @ si @ BRW2014 Serial 2451
Permanent link to this record
 

 
Author Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla
Title (up) Multi-Image Super-Resolution for Thermal Images Type Conference Article
Year 2022 Publication 17th International Conference on Computer Vision Theory and Applications (VISAPP 2022) Abbreviated Journal
Volume 4 Issue Pages 635-642
Keywords Thermal Images; Multi-view; Multi-frame; Super-Resolution; Deep Learning; Attention Block
Abstract This paper proposes a novel CNN architecture for the multi-thermal image super-resolution problem. In the proposed scheme, the multi-images are synthetically generated by downsampling and slightly shifting the given image; noise is also added to each of these synthesized images. The proposed architecture uses two
attention blocks paths to extract high-frequency details taking advantage of the large information extracted from multiple images of the same scene. Experimental results are provided, showing the proposed scheme has overcome the state-of-the-art approaches.
Address Online; Feb 6-8, 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference VISAPP
Notes MSIAU; 601.349 Approved no
Call Number Admin @ si @ RSV2022a Serial 3690
Permanent link to this record
 

 
Author Spencer Low; Oliver Nina; Angel Sappa; Erik Blasch; Nathan Inkawhich
Title (up) Multi-Modal Aerial View Image Challenge: Translation From Synthetic Aperture Radar to Electro-Optical Domain Results-PBVS 2023 Type Conference Article
Year 2023 Publication Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops Abbreviated Journal
Volume Issue Pages 515-523
Keywords
Abstract This paper unveils the discoveries and outcomes of the inaugural iteration of the Multi-modal Aerial View Image Challenge (MAVIC) aimed at image translation. The primary objective of this competition is to stimulate research efforts towards the development of models capable of translating co-aligned images between multiple modalities. To accomplish the task of image translation, the competition utilizes images obtained from both synthetic aperture radar (SAR) and electro-optical (EO) sources. Specifically, the challenge centers on the translation from the SAR modality to the EO modality, an area of research that has garnered attention. The inaugural challenge demonstrates the feasibility of the task. The dataset utilized in this challenge is derived from the UNIfied COincident Optical and Radar for recognitioN (UNICORN) dataset. We introduce an new version of the UNICORN dataset that is focused on enabling the sensor translation task. Performance evaluation is conducted using a combination of measures to ensure high fidelity and high accuracy translations.
Address Vancouver; Canada; June 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPRW
Notes MSIAU Approved no
Call Number Admin @ si @ LNS2023a Serial 3913
Permanent link to this record
 

 
Author Spencer Low; Oliver Nina; Angel Sappa; Erik Blasch; Nathan Inkawhich
Title (up) Multi-Modal Aerial View Object Classification Challenge Results – PBVS 2022 Type Conference Article
Year 2022 Publication IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) Abbreviated Journal
Volume Issue Pages 350-358
Keywords
Abstract This paper details the results and main findings of the second iteration of the Multi-modal Aerial View Object Classification (MAVOC) challenge. The primary goal of both MAVOC challenges is to inspire research into methods for building recognition models that utilize both synthetic aperture radar (SAR) and electro-optical (EO) imagery. Teams are encouraged to develop multi-modal approaches that incorporate complementary information from both domains. While the 2021 challenge showed a proof of concept that both modalities could be used together, the 2022 challenge focuses on the detailed multi-modal methods. The 2022 challenge uses the same UNIfied Coincident Optical and Radar for recognitioN (UNICORN) dataset and competition format that was used in 2021. Specifically, the challenge focuses on two tasks, (1) SAR classification and (2) SAR + EO classification. The bulk of this document is dedicated to discussing the top performing methods and describing their performance on our blind test set. Notably, all of the top ten teams outperform a Resnet-18 baseline. For SAR classification, the top team showed a 129% improvement over baseline and an 8% average improvement from the 2021 winner. The top team for SAR + EO classification shows a 165% improvement with a 32% average improvement over 2021.
Address New Orleans; USA; June 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPRW
Notes MSIAU Approved no
Call Number Admin @ si @ LNS2022 Serial 3768
Permanent link to this record
 

 
Author Spencer Low; Oliver Nina; Angel Sappa; Erik Blasch; Nathan Inkawhich
Title (up) Multi-Modal Aerial View Object Classification Challenge Results-PBVS 2023 Type Conference Article
Year 2023 Publication Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops Abbreviated Journal
Volume Issue Pages 412-421
Keywords
Abstract This paper presents the findings and results of the third edition of the Multi-modal Aerial View Object Classification (MAVOC) challenge in a detailed and comprehensive manner. The challenge consists of two tracks. The primary aim of both tracks is to encourage research into building recognition models that utilize both synthetic aperture radar (SAR) and electro-optical (EO) imagery. Participating teams are encouraged to develop multi-modal approaches that incorporate complementary information from both domains. While the 2021 challenge demonstrated the feasibility of combining both modalities, the 2022 challenge expanded on the capability of multi-modal models. The 2023 challenge introduces a refined version of the UNICORN dataset and demonstrates significant improvements made. The 2023 challenge adopts an updated UNIfied CO-incident Optical and Radar for recognitioN (UNICORN V2) dataset and competition format. Two tasks are featured: SAR classification and SAR + EO classification. In addition to measuring accuracy of models, we also introduce out-of-distribution measures to encourage model robustness.The majority of this paper is dedicated to discussing the top performing methods and evaluating their performance on our blind test set. It is worth noting that all of the top ten teams outperformed the Resnet-50 baseline. The top team for SAR classification achieved a 173% performance improvement over the baseline, while the top team for SAR + EO classification achieved a 175% improvement.
Address Vancouver; Canada; June 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPRW
Notes MSIAU Approved no
Call Number Admin @ si @ LNS2023b Serial 3915
Permanent link to this record
 

 
Author Razieh Rastgoo; Kourosh Kiani; Sergio Escalera
Title (up) Multi-Modal Deep Hand Sign Language Recognition in Still Images Using Restricted Boltzmann Machine Type Journal Article
Year 2018 Publication Entropy Abbreviated Journal ENTROPY
Volume 20 Issue 11 Pages 809
Keywords hand sign language; deep learning; restricted Boltzmann machine (RBM); multi-modal; profoundly deaf; noisy image
Abstract In this paper, a deep learning approach, Restricted Boltzmann Machine (RBM), is used to perform automatic hand sign language recognition from visual data. We evaluate how RBM, as a deep generative model, is capable of generating the distribution of the input data for an enhanced recognition of unseen data. Two modalities, RGB and Depth, are considered in the model input in three forms: original image, cropped image, and noisy cropped image. Five crops of the input image are used and the hand of these cropped images are detected using Convolutional Neural Network (CNN). After that, three types of the detected hand images are generated for each modality and input to RBMs. The outputs of the RBMs for two modalities are fused in another RBM in order to recognize the output sign label of the input image. The proposed multi-modal model is trained on all and part of the American alphabet and digits of four publicly available datasets. We also evaluate the robustness of the proposal against noise. Experimental results show that the proposed multi-modal model, using crops and the RBM fusing methodology, achieves state-of-the-art results on Massey University Gesture Dataset 2012, American Sign Language (ASL). and Fingerspelling Dataset from the University of Surrey’s Center for Vision, Speech and Signal Processing, NYU, and ASL Fingerspelling A datasets.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA; no proj Approved no
Call Number Admin @ si @ RKE2018 Serial 3198
Permanent link to this record
 

 
Author Laura Lopez-Fuentes; Joost Van de Weijer; Marc Bolaños; Harald Skinnemoen
Title (up) Multi-modal Deep Learning Approach for Flood Detection Type Conference Article
Year 2017 Publication MediaEval Benchmarking Initiative for Multimedia Evaluation Abbreviated Journal
Volume Issue Pages
Keywords
Abstract In this paper we propose a multi-modal deep learning approach to detect floods in social media posts. Social media posts normally contain some metadata and/or visual information, therefore in order to detect the floods we use this information. The model is based on a Convolutional Neural Network which extracts the visual features and a bidirectional Long Short-Term Memory network to extract the semantic features from the textual metadata. We validate the
method on images extracted from Flickr which contain both visual information and metadata and compare the results when using both, visual information only or metadata only. This work has been done in the context of the MediaEval Multimedia Satellite Task.
Address Dublin; Ireland; September 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference MediaEval
Notes LAMP; 600.084; 600.109; 600.120 Approved no
Call Number Admin @ si @ LWB2017a Serial 2974
Permanent link to this record