Records |
Author |
Marçal Rusiñol; J. Chazalon; Katerine Diaz |
Title |
Augmented Songbook: an Augmented Reality Educational Application for Raising Music Awareness |
Type |
Journal Article |
Year |
2018 |
Publication |
Multimedia Tools and Applications |
Abbreviated Journal |
MTAP |
Volume |
77 |
Issue |
11 |
Pages |
13773-13798 |
Keywords |
Augmented reality; Document image matching; Educational applications |
Abstract |
This paper presents the development of an Augmented Reality mobile application which aims at sensibilizing young children to abstract concepts of music. Such concepts are, for instance, the musical notation or the idea of rhythm. Recent studies in Augmented Reality for education suggest that such technologies have multiple benefits for students, including younger ones. As mobile document image acquisition and processing gains maturity on mobile platforms, we explore how it is possible to build a markerless and real-time application to augment the physical documents with didactic animations and interactive virtual content. Given a standard image processing pipeline, we compare the performance of different local descriptors at two key stages of the process. Results suggest alternatives to the SIFT local descriptors, regarding result quality and computational efficiency, both for document model identification and perspective transform estimation. All experiments are performed on an original and public dataset we introduce here. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
DAG; ADAS; 600.084; 600.121; 600.118; 600.129 |
Approved |
no |
Call Number |
Admin @ si @ RCD2018 |
Serial |
2996 |
Permanent link to this record |
|
|
|
Author |
Svebor Karaman; Andrew Bagdanov; Lea Landucci; Gianpaolo D'Amico; Andrea Ferracani; Daniele Pezzatini; Alberto del Bimbo |
Title |
Personalized multimedia content delivery on an interactive table by passive observation of museum visitors |
Type |
Journal Article |
Year |
2016 |
Publication |
Multimedia Tools and Applications |
Abbreviated Journal |
MTAP |
Volume |
75 |
Issue |
7 |
Pages |
3787-3811 |
Keywords |
Computer vision; Video surveillance; Cultural heritage; Multimedia museum; Personalization; Natural interaction; Passive profiling |
Abstract |
The amount of multimedia data collected in museum databases is growing fast, while the capacity of museums to display information to visitors is acutely limited by physical space. Museums must seek the perfect balance of information given on individual pieces in order to provide sufficient information to aid visitor understanding while maintaining sparse usage of the walls and guaranteeing high appreciation of the exhibit. Moreover, museums often target the interests of average visitors instead of the entire spectrum of different interests each individual visitor might have. Finally, visiting a museum should not be an experience contained in the physical space of the museum but a door opened onto a broader context of related artworks, authors, artistic trends, etc. In this paper we describe the MNEMOSYNE system that attempts to address these issues through a new multimedia museum experience. Based on passive observation, the system builds a profile of the artworks of interest for each visitor. These profiles of interest are then used to drive an interactive table that personalizes multimedia content delivery. The natural user interface on the interactive table uses the visitor’s profile, an ontology of museum content and a recommendation system to personalize exploration of multimedia content. At the end of their visit, the visitor can take home a personalized summary of their visit on a custom mobile application. In this article we describe in detail each component of our approach as well as the first field trials of our prototype system built and deployed at our permanent exhibition space at LeMurate (http://www.lemurate.comune.fi.it/lemurate/) in Florence together with the first results of the evaluation process during the official installation in the National Museum of Bargello (http://www.uffizi.firenze.it/musei/?m=bargello). |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
Springer US |
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1380-7501 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
LAMP; 601.240; 600.079 |
Approved |
no |
Call Number |
Admin @ si @ KBL2016 |
Serial |
2520 |
Permanent link to this record |
|
|
|
Author |
Naveen Onkarappa; Angel Sappa |
Title |
Synthetic sequences and ground-truth flow field generation for algorithm validation |
Type |
Journal Article |
Year |
2015 |
Publication |
Multimedia Tools and Applications |
Abbreviated Journal |
MTAP |
Volume |
74 |
Issue |
9 |
Pages |
3121-3135 |
Keywords |
Ground-truth optical flow; Synthetic sequence; Algorithm validation |
Abstract |
Research in computer vision is advancing by the availability of good datasets that help to improve algorithms, validate results and obtain comparative analysis. The datasets can be real or synthetic. For some of the computer vision problems such as optical flow it is not possible to obtain ground-truth optical flow with high accuracy in natural outdoor real scenarios directly by any sensor, although it is possible to obtain ground-truth data of real scenarios in a laboratory setup with limited motion. In this difficult situation computer graphics offers a viable option for creating realistic virtual scenarios. In the current work we present a framework to design virtual scenes and generate sequences as well as ground-truth flow fields. Particularly, we generate a dataset containing sequences of driving scenarios. The sequences in the dataset vary in different speeds of the on-board vision system, different road textures, complex motion of vehicle and independent moving vehicles in the scene. This dataset enables analyzing and adaptation of existing optical flow methods, and leads to invention of new approaches particularly for driver assistance systems. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
Springer US |
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1380-7501 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
ADAS; 600.055; 601.215; 600.076 |
Approved |
no |
Call Number |
Admin @ si @ OnS2014b |
Serial |
2472 |
Permanent link to this record |
|
|
|
Author |
Cesar Isaza; Joaquin Salas; Bogdan Raducanu |
Title |
Rendering ground truth data sets to detect shadows cast by static objects in outdoors |
Type |
Journal Article |
Year |
2014 |
Publication |
Multimedia Tools and Applications |
Abbreviated Journal |
MTAP |
Volume |
70 |
Issue |
1 |
Pages |
557-571 |
Keywords |
Synthetic ground truth data set; Sun position; Shadow detection; Static objects shadow detection |
Abstract |
In our work, we are particularly interested in studying the shadows cast by static objects in outdoor environments, during daytime. To assess the accuracy of a shadow detection algorithm, we need ground truth information. The collection of such information is a very tedious task because it is a process that requires manual annotation. To overcome this severe limitation, we propose in this paper a methodology to automatically render ground truth using a virtual environment. To increase the degree of realism and usefulness of the simulated environment, we incorporate in the scenario the precise longitude, latitude and elevation of the actual location of the object, as well as the sun’s position for a given time and day. To evaluate our method, we consider a qualitative and a quantitative comparison. In the quantitative one, we analyze the shadow cast by a real object in a particular geographical location and its corresponding rendered model. To evaluate qualitatively the methodology, we use some ground truth images obtained both manually and automatically. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
Springer US |
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1380-7501 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
LAMP; |
Approved |
no |
Call Number |
Admin @ si @ ISR2014 |
Serial |
2229 |
Permanent link to this record |
|
|
|
Author |
Palaiahnakote Shivakumara; Anjan Dutta; Chew Lim Tan; Umapada Pal |
Title |
Multi-oriented scene text detection in video based on wavelet and angle projection boundary growing |
Type |
Journal Article |
Year |
2014 |
Publication |
Multimedia Tools and Applications |
Abbreviated Journal |
MTAP |
Volume |
72 |
Issue |
1 |
Pages |
515-539 |
Keywords |
|
Abstract |
In this paper, we address two complex issues: 1) Text frame classification and 2) Multi-oriented text detection in video text frame. We first divide a video frame into 16 blocks and propose a combination of wavelet and median-moments with k-means clustering at the block level to identify probable text blocks. For each probable text block, the method applies the same combination of feature with k-means clustering over a sliding window running through the blocks to identify potential text candidates. We introduce a new idea of symmetry on text candidates in each block based on the observation that pixel distribution in text exhibits a symmetric pattern. The method integrates all blocks containing text candidates in the frame and then all text candidates are mapped on to a Sobel edge map of the original frame to obtain text representatives. To tackle the multi-orientation problem, we present a new method called Angle Projection Boundary Growing (APBG) which is an iterative algorithm and works based on a nearest neighbor concept. APBG is then applied on the text representatives to fix the bounding box for multi-oriented text lines in the video frame. Directional information is used to eliminate false positives. Experimental results on a variety of datasets such as non-horizontal, horizontal, publicly available data (Hua’s data) and ICDAR-03 competition data (camera images) show that the proposed method outperforms existing methods proposed for video and the state of the art methods for scene text as well. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
Springer US |
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1380-7501 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
DAG; 600.077 |
Approved |
no |
Call Number |
Admin @ si @ SDT2014 |
Serial |
2357 |
Permanent link to this record |
|
|
|
Author |
Bogdan Raducanu; D. Gatica-Perez |
Title |
Inferring competitive role patterns in reality TV show through nonverbal analysis |
Type |
Journal Article |
Year |
2012 |
Publication |
Multimedia Tools and Applications |
Abbreviated Journal |
MTAP |
Volume |
56 |
Issue |
1 |
Pages |
207-226 |
Keywords |
|
Abstract |
This paper introduces a new facet of social media, namely that depicting social interaction. More concretely, we address this problem from the perspective of nonverbal behavior-based analysis of competitive meetings. For our study, we made use of “The Apprentice” reality TV show, which features a competition for a real, highly paid corporate job. Our analysis is centered around two tasks regarding a person's role in a meeting: predicting the person with the highest status, and predicting the fired candidates. We address this problem by adopting both supervised and unsupervised strategies. The current study was carried out using nonverbal audio cues. Our approach is based only on the nonverbal interaction dynamics during the meeting without relying on the spoken words. The analysis is based on two types of data: individual and relational measures. Results obtained from the analysis of a full season of the show are promising (up to 85.7% of accuracy in the first case and up to 92.8% in the second case). Our approach has been conveniently compared with the Influence Model, demonstrating its superiority. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
Elsevier |
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1380-7501 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
OR;MV |
Approved |
no |
Call Number |
BCNPCL @ bcnpcl @ RaG2012 |
Serial |
1360 |
Permanent link to this record |
|
|
|
Author |
Kunal Biswas; Palaiahnakote Shivakumara; Umapada Pal; Tong Lu; Michel Blumenstein; Josep Llados |
Title |
Classification of aesthetic natural scene images using statistical and semantic features |
Type |
Journal Article |
Year |
2023 |
Publication |
Multimedia Tools and Applications |
Abbreviated Journal |
MTAP |
Volume |
82 |
Issue |
9 |
Pages |
13507-13532 |
Keywords |
|
Abstract |
Aesthetic image analysis is essential for improving the performance of multimedia image retrieval systems, especially from a repository of social media and multimedia content stored on mobile devices. This paper presents a novel method for classifying aesthetic natural scene images by studying the naturalness of image content using statistical features, and reading text in the images using semantic features. Unlike existing methods that focus only on image quality with human information, the proposed approach focuses on image features as well as text-based semantic features without human intervention to reduce the gap between subjectivity and objectivity in the classification. The aesthetic classes considered in this work are (i) Very Pleasant, (ii) Pleasant, (iii) Normal and (iv) Unpleasant. The naturalness is represented by features of focus, defocus, perceived brightness, perceived contrast, blurriness and noisiness, while semantics are represented by text recognition, description of the images and labels of images, profile pictures, and banner images. Furthermore, a deep learning model is proposed in a novel way to fuse statistical and semantic features for the classification of aesthetic natural scene images. Experiments on our own dataset and the standard datasets demonstrate that the proposed approach achieves 92.74%, 88.67% and 83.22% average classification rates on our own dataset, AVA dataset and CUHKPQ dataset, respectively. Furthermore, a comparative study of the proposed model with the existing methods shows that the proposed method is effective for the classification of aesthetic social media images. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
DAG |
Approved |
no |
Call Number |
Admin @ si @ BSP2023 |
Serial |
3873 |
Permanent link to this record |
|
|
|
Author |
Anastasios Doulamis; Nikolaos Doulamis; Marco Bertini; Jordi Gonzalez; Thomas B. Moeslund |
Title |
Introduction to the Special Issue on the Analysis and Retrieval of Events/Actions and Workflows in Video Streams |
Type |
Journal Article |
Year |
2016 |
Publication |
Multimedia Tools and Applications |
Abbreviated Journal |
MTAP |
Volume |
75 |
Issue |
22 |
Pages |
14985-14990 |
Keywords |
|
Abstract |
|
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
ISE; HUPBA |
Approved |
no |
Call Number |
Admin @ si @ DDB2016 |
Serial |
2934 |
Permanent link to this record |
|
|
|
Author |
W.Win; B.Bao; Q.Xu; Luis Herranz; Shuqiang Jiang |
Title |
Editorial Note: Efficient Multimedia Processing Methods and Applications |
Type |
Miscellaneous |
Year |
2019 |
Publication |
Multimedia Tools and Applications |
Abbreviated Journal |
MTAP |
Volume |
78 |
Issue |
1 |
Pages |
|
Keywords |
|
Abstract |
|
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
LAMP; 600.141; 600.120 |
Approved |
no |
Call Number |
Admin @ si @ WBX2019 |
Serial |
3257 |
Permanent link to this record |
|
|
|
Author |
Rahma Kalboussi; Aymen Azaza; Joost Van de Weijer; Mehrez Abdellaoui; Ali Douik |
Title |
Object proposals for salient object segmentation in videos |
Type |
Journal Article |
Year |
2020 |
Publication |
Multimedia Tools and Applications |
Abbreviated Journal |
MTAP |
Volume |
79 |
Issue |
13 |
Pages |
8677-8693 |
Keywords |
|
Abstract |
Salient object segmentation in videos is generally broken up in a video segmentation part and a saliency assignment part. Recently, object proposals, which are used to segment the image, have had significant impact on many computer vision applications, including image segmentation, object detection, and recently saliency detection in still images. However, their usage has not yet been evaluated for salient object segmentation in videos. Therefore, in this paper, we investigate the application of object proposals to salient object segmentation in videos. In addition, we propose a new motion feature derived from the optical flow structure tensor for video saliency detection. Experiments on two standard benchmark datasets for video saliency show that the proposed motion feature improves saliency estimation results, and that object proposals are an efficient method for salient object segmentation. Results on the challenging SegTrack v2 and Fukuchi benchmark data sets show that we significantly outperform the state-of-the-art. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
LAMP; 600.120 |
Approved |
no |
Call Number |
KAW2020 |
Serial |
3504 |
Permanent link to this record |
|
|
|
Author |
Henry Velesaca; Gisel Bastidas-Guacho; Mohammad Rouhani; Angel Sappa |
Title |
Multimodal image registration techniques: a comprehensive survey |
Type |
Journal Article |
Year |
2024 |
Publication |
Multimedia Tools and Applications |
Abbreviated Journal |
MTAP |
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
This manuscript presents a review of state-of-the-art techniques proposed in the literature for multimodal image registration, addressing instances where images from different modalities need to be precisely aligned in the same reference system. This scenario arises when the images to be registered come from different modalities, among the visible and thermal spectral bands, 3D-RGB, or flash-no flash, or NIR-visible. The review spans different techniques from classical approaches to more modern ones based on deep learning, aiming to highlight the particularities required at each step in the registration pipeline when dealing with multimodal images. It is noteworthy that medical images are excluded from this review due to their specific characteristics, including the use of both active and passive sensors or the non-rigid nature of the body contained in the image. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
MSIAU |
Approved |
no |
Call Number |
Admin @ si @ VBR2024 |
Serial |
3997 |
Permanent link to this record |
|
|
|
Author |
Razieh Rastgoo; Kourosh Kiani; Sergio Escalera |
Title |
A transformer model for boundary detection in continuous sign language |
Type |
Journal Article |
Year |
2024 |
Publication |
Multimedia Tools and Applications |
Abbreviated Journal |
MTAP |
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
Sign Language Recognition (SLR) has garnered significant attention from researchers in recent years, particularly the intricate domain of Continuous Sign Language Recognition (CSLR), which presents heightened complexity compared to Isolated Sign Language Recognition (ISLR). One of the prominent challenges in CSLR pertains to accurately detecting the boundaries of isolated signs within a continuous video stream. Additionally, the reliance on handcrafted features in existing models poses a challenge to achieving optimal accuracy. To surmount these challenges, we propose a novel approach utilizing a Transformer-based model. Unlike traditional models, our approach focuses on enhancing accuracy while eliminating the need for handcrafted features. The Transformer model is employed for both ISLR and CSLR. The training process involves using isolated sign videos, where hand keypoint features extracted from the input video are enriched using the Transformer model. Subsequently, these enriched features are forwarded to the final classification layer. The trained model, coupled with a post-processing method, is then applied to detect isolated sign boundaries within continuous sign videos. The evaluation of our model is conducted on two distinct datasets, including both continuous signs and their corresponding isolated signs, demonstrates promising results. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
HUPBA |
Approved |
no |
Call Number |
Admin @ si @ RKE2024 |
Serial |
4016 |
Permanent link to this record |
|
|
|
Author |
Laura Lopez-Fuentes; Joost Van de Weijer; Manuel Gonzalez-Hidalgo; Harald Skinnemoen; Andrew Bagdanov |
Title |
Review on computer vision techniques in emergency situations |
Type |
Journal Article |
Year |
2018 |
Publication |
Multimedia Tools and Applications |
Abbreviated Journal |
MTAP |
Volume |
77 |
Issue |
13 |
Pages |
17069–17107 |
Keywords |
Emergency management; Computer vision; Decision makers; Situational awareness; Critical situation |
Abstract |
In emergency situations, actions that save lives and limit the impact of hazards are crucial. In order to act, situational awareness is needed to decide what to do. Geolocalized photos and video of the situations as they evolve can be crucial in better understanding them and making decisions faster. Cameras are almost everywhere these days, either in terms of smartphones, installed CCTV cameras, UAVs or others. However, this poses challenges in big data and information overflow. Moreover, most of the time there are no disasters at any given location, so humans aiming to detect sudden situations may not be as alert as needed at any point in time. Consequently, computer vision tools can be an excellent decision support. The number of emergencies where computer vision tools has been considered or used is very wide, and there is a great overlap across related emergency research. Researchers tend to focus on state-of-the-art systems that cover the same emergency as they are studying, obviating important research in other fields. In order to unveil this overlap, the survey is divided along four main axes: the types of emergencies that have been studied in computer vision, the objective that the algorithms can address, the type of hardware needed and the algorithms used. Therefore, this review provides a broad overview of the progress of computer vision covering all sorts of emergencies. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
LAMP; 600.068; 600.120 |
Approved |
no |
Call Number |
Admin @ si @ LWG2018 |
Serial |
3041 |
Permanent link to this record |
|
|
|
Author |
Andre Litvin; Kamal Nasrollahi; Sergio Escalera; Cagri Ozcinar; Thomas B. Moeslund; Gholamreza Anbarjafari |
Title |
A Novel Deep Network Architecture for Reconstructing RGB Facial Images from Thermal for Face Recognition |
Type |
Journal Article |
Year |
2019 |
Publication |
Multimedia Tools and Applications |
Abbreviated Journal |
MTAP |
Volume |
78 |
Issue |
18 |
Pages |
25259–25271 |
Keywords |
Fully convolutional networks; FusionNet; Thermal imaging; Face recognition |
Abstract |
This work proposes a fully convolutional network architecture for RGB face image generation from a given input thermal face image to be applied in face recognition scenarios. The proposed method is based on the FusionNet architecture and increases robustness against overfitting using dropout after bridge connections, randomised leaky ReLUs (RReLUs), and orthogonal regularization. Furthermore, we propose to use a decoding block with resize convolution instead of transposed convolution to improve final RGB face image generation. To validate our proposed network architecture, we train a face classifier and compare its face recognition rate on the reconstructed RGB images from the proposed architecture, to those when reconstructing images with the original FusionNet, as well as when using the original RGB images. As a result, we are introducing a new architecture which leads to a more accurate network. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
HuPBA; no menciona |
Approved |
no |
Call Number |
Admin @ si @ LNE2019 |
Serial |
3318 |
Permanent link to this record |
|
|
|
Author |
Egils Avots; Meysam Madadi; Sergio Escalera; Jordi Gonzalez; Xavier Baro; Paul Pallin; Gholamreza Anbarjafari |
Title |
From 2D to 3D geodesic-based garment matching |
Type |
Journal Article |
Year |
2019 |
Publication |
Multimedia Tools and Applications |
Abbreviated Journal |
MTAP |
Volume |
78 |
Issue |
18 |
Pages |
25829–25853 |
Keywords |
Shape matching; Geodesic distance; Texture mapping; RGBD image processing; Gaussian mixture model |
Abstract |
A new approach for 2D to 3D garment retexturing is proposed based on Gaussian mixture models and thin plate splines (TPS). An automatically segmented garment of an individual is matched to a new source garment and rendered, resulting in augmented images in which the target garment has been retextured using the texture of the source garment. We divide the problem into garment boundary matching based on Gaussian mixture models and then interpolate inner points using surface topology extracted through geodesic paths, which leads to a more realistic result than standard approaches. We evaluated and compared our system quantitatively by root mean square error (RMS) and qualitatively using the mean opinion score (MOS), showing the benefits of the proposed methodology on our gathered dataset. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
HuPBA; ISE; 600.098; 600.119; 602.133 |
Approved |
no |
Call Number |
Admin @ si @ AME2019 |
Serial |
3317 |
Permanent link to this record |