Records | |||||
---|---|---|---|---|---|
Author | Josep Llados; Daniel Lopresti; Seiichi Uchida (eds) | ||||
Title | 16th International Conference, 2021, Proceedings, Part I | Type | Book Whole | ||
Year | 2021 | Publication | Document Analysis and Recognition – ICDAR 2021 | Abbreviated Journal | |
Volume | 12821 | Issue | Pages | ||
Keywords | |||||
Abstract | This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.
The papers are organized into the following topical sections: historical document analysis, document analysis systems, handwriting recognition, scene text detection and recognition, document image processing, natural language processing (NLP) for document understanding, and graphics, diagram and math recognition. |
||||
Address | Lausanne, Switzerland, September 5-10, 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Springer Cham | Place of Publication | Editor | Josep Llados; Daniel Lopresti; Seiichi Uchida | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-3-030-86548-1 | Medium | ||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ | Serial | 3725 | ||
Permanent link to this record | |||||
Author | Josep Llados; Daniel Lopresti; Seiichi Uchida (eds) | ||||
Title | 16th International Conference, 2021, Proceedings, Part II | Type | Book Whole | ||
Year | 2021 | Publication | Document Analysis and Recognition – ICDAR 2021 | Abbreviated Journal | |
Volume | 12822 | Issue | Pages | ||
Keywords | |||||
Abstract | This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.
The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding. |
||||
Address | Lausanne, Switzerland, September 5-10, 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Springer Cham | Place of Publication | Editor | Josep Llados; Daniel Lopresti; Seiichi Uchida | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-3-030-86330-2 | Medium | ||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ | Serial | 3726 | ||
Permanent link to this record | |||||
Author | Josep Llados | ||||
Title | The 5G of Document Intelligence | Type | Conference Article | ||
Year | 2021 | Publication | 3rd Workshop on Future of Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | Lausanne; Switzerland; September 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | FDAR | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ | Serial | 3677 | ||
Permanent link to this record | |||||
Author | Jose Luis Gomez; Gabriel Villalonga; Antonio Lopez | ||||
Title | Co-Training for Deep Object Detection: Comparing Single-Modal and Multi-Modal Approaches | Type | Journal Article | ||
Year | 2021 | Publication | Sensors | Abbreviated Journal | SENS |
Volume | 21 | Issue | 9 | Pages | 3185 |
Keywords | co-training; multi-modality; vision-based object detection; ADAS; self-driving | ||||
Abstract | Top-performing computer vision models are powered by convolutional neural networks (CNNs). Training an accurate CNN highly depends on both the raw sensor data and their associated ground truth (GT). Collecting such GT is usually done through human labeling, which is time-consuming and does not scale as we wish. This data-labeling bottleneck may be intensified due to domain shifts among image sensors, which could force per-sensor data labeling. In this paper, we focus on the use of co-training, a semi-supervised learning (SSL) method, for obtaining self-labeled object bounding boxes (BBs), i.e., the GT to train deep object detectors. In particular, we assess the goodness of multi-modal co-training by relying on two different views of an image, namely, appearance (RGB) and estimated depth (D). Moreover, we compare appearance-based single-modal co-training with multi-modal. Our results suggest that in a standard SSL setting (no domain shift, a few human-labeled data) and under virtual-to-real domain shift (many virtual-world labeled data, no human-labeled data) multi-modal co-training outperforms single-modal. In the latter case, by performing GAN-based domain translation both co-training modalities are on par, at least when using an off-the-shelf depth estimation model not specifically trained on the translated images. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS; 600.118 | Approved | no | ||
Call Number | Admin @ si @ GVL2021 | Serial | 3562 | ||
Permanent link to this record | |||||
Author | Jose Elias Yauri; Aura Hernandez-Sabate; Pau Folch; Debora Gil | ||||
Title | Mental Workload Detection Based on EEG Analysis | Type | Conference Article | ||
Year | 2021 | Publication | Artificial Intelligence Research and Development. Proceedings of the 23rd International Conference of the Catalan Association for Artificial Intelligence. | Abbreviated Journal | |
Volume | 339 | Issue | Pages | 268-277 | |
Keywords | Cognitive states; Mental workload; EEG analysis; Neural Networks. | ||||
Abstract | The study of mental workload is essential for human work efficiency, health, and accident prevention, since workload compromises both performance and awareness. Although workload has been widely studied using several physiological measures, minimising the sensor network as much as possible remains both a challenge and a requirement.
Electroencephalogram (EEG) signals have shown a high correlation to specific cognitive and mental states like workload. However, there is not enough evidence in the literature to validate how well models generalize to new subjects performing tasks with a workload similar to those included in the model’s training. In this paper we propose a binary neural network to classify EEG features across different mental workloads. Two workloads, low and medium, are induced using two variants of the N-Back Test. The proposed model was validated on a dataset collected from 16 subjects and showed a high level of generalization capability: the model reported an average recall of 81.81% in a leave-one-subject-out evaluation. |
||||
Address | Virtual; October 20-22 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CCIA | ||
Notes | IAM; 600.139; 600.118; 600.145 | Approved | no | ||
Call Number | Admin @ si @ | Serial | 3723 | ||
Permanent link to this record | |||||
Author | Jorge Charco; Angel Sappa; Boris X. Vintimilla; Henry Velesaca | ||||
Title | Camera pose estimation in multi-view environments: From virtual scenarios to the real world | Type | Journal Article | ||
Year | 2021 | Publication | Image and Vision Computing | Abbreviated Journal | IVC |
Volume | 110 | Issue | Pages | 104182 | |
Keywords | |||||
Abstract | This paper presents a domain adaptation strategy to efficiently train network architectures for estimating the relative camera pose in multi-view scenarios. The network architectures are fed a pair of simultaneously acquired images; to improve the accuracy of the solutions, and given the lack of large datasets with pairs of overlapping images, a domain adaptation strategy is proposed. The domain adaptation strategy consists of transferring the knowledge learned from synthetic images to real-world scenarios. For this, the networks are first trained using pairs of synthetic images, which are captured at the same time by a pair of cameras in a virtual environment; then, the learned weights of the networks are transferred to the real-world case, where the networks are retrained with a few real images. Different virtual 3D scenarios are generated to evaluate the relationship between the accuracy of the result and the similarity between virtual and real scenarios, in terms of both the geometry of the objects contained in the scene and the relative pose between camera and objects in the scene. Experimental results and comparisons are provided showing that the accuracy of all the evaluated networks for estimating the camera pose improves when the proposed domain adaptation strategy is used, highlighting the importance of the similarity between virtual and real scenarios. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MSIAU; 600.130; 600.122 | Approved | no | ||
Call Number | Admin @ si @ CSV2021 | Serial | 3577 | ||
Permanent link to this record | |||||
Author | Joan Codina-Filba; Sergio Escalera; Joan Escudero; Coen Antens; Pau Buch-Cardona; Mireia Farrus | ||||
Title | Mobile eHealth Platform for Home Monitoring of Bipolar Disorder | Type | Conference Article | ||
Year | 2021 | Publication | 27th ACM International Conference on Multimedia Modeling | Abbreviated Journal | |
Volume | 12573 | Issue | Pages | 330-341 | |
Keywords | |||||
Abstract | People suffering from Bipolar Disorder (BD) experience changes in mood status, having depressive or manic episodes with normal periods in between. BD is a chronic disease with a high level of non-adherence to medication that requires continuous monitoring of patients to detect when they relapse into an episode, so that physicians can take care of them. Here we present MoodRecord, an easy-to-use, non-intrusive, multilingual, robust and scalable platform suitable for home monitoring of patients with BD, which allows physicians and relatives to track the patient state and get alarms when abnormalities occur.
MoodRecord takes advantage of the capabilities of smartphones as communication and recording devices to continuously monitor patients. It automatically records user activity, and asks users to answer some questions or to record themselves on video, according to a predefined plan designed by physicians. The video is analysed: the mood status is recognised from images, and bipolar assessment scores are extracted from speech parameters. The data obtained from the different sources are merged periodically to observe whether a relapse may be starting and, if so, raise the corresponding alarm. The application received a positive evaluation in a pilot with users from three different countries. During the pilot, the predictions of the voice and image modules showed a coherent correlation with the diagnosis performed by clinicians. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | MMM | ||
Notes | HUPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ CEE2021 | Serial | 3659 | ||
Permanent link to this record | |||||
Author | Jialuo Chen; Mohamed Ali Souibgui; Alicia Fornes; Beata Megyesi | ||||
Title | Unsupervised Alphabet Matching in Historical Encrypted Manuscript Images | Type | Conference Article | ||
Year | 2021 | Publication | 4th International Conference on Historical Cryptology | Abbreviated Journal | |
Volume | Issue | Pages | 34-37 | ||
Keywords | |||||
Abstract | Historical ciphers contain a wide range of symbols from various symbol sets. Identifying the cipher alphabet is a prerequisite before decryption can take place and is a time-consuming process. In this work we explore the use of image processing for identifying the underlying alphabet in cipher images, and to compare alphabets between ciphers. The experiments show that ciphers with similar alphabets can be successfully discovered through clustering. | ||||
Address | Virtual; September 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | HistoCrypt | ||
Notes | DAG; 602.230; 600.140; 600.121 | Approved | no | ||
Call Number | Admin @ si @ CSF2021 | Serial | 3617 | ||
Permanent link to this record | |||||
Author | Javier Marin; Sergio Escalera | ||||
Title | SSSGAN: Satellite Style and Structure Generative Adversarial Networks | Type | Journal Article | ||
Year | 2021 | Publication | Remote Sensing | Abbreviated Journal | |
Volume | 13 | Issue | 19 | Pages | 3984 |
Keywords | |||||
Abstract | This work presents Satellite Style and Structure Generative Adversarial Network (SSSGAN), a generative model of high resolution satellite imagery to support image segmentation. Based on spatially adaptive denormalization modules (SPADE) that modulate the activations with respect to segmentation map structure, in addition to global descriptor vectors that capture the semantic information in a vector with respect to Open Street Maps (OSM) classes, this model is able to produce consistent aerial imagery. By decoupling the generation of aerial images into a structure map and a carefully defined style vector, we were able to improve the realism and geodiversity of the synthesis with respect to the state-of-the-art baseline. Therefore, the proposed model allows us to control the generation not only with respect to the desired structure, but also with respect to a geographic area. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ MaE2021 | Serial | 3651 | ||
Permanent link to this record | |||||
Author | Javier M. Olaso; Alain Vazquez; Leila Ben Letaifa; Mikel de Velasco; Aymen Mtibaa; Mohamed Amine Hmani; Dijana Petrovska-Delacretaz; Gerard Chollet; Cesar Montenegro; Asier Lopez-Zorrilla; Raquel Justo; Roberto Santana; Jofre Tenorio-Laranga; Eduardo Gonzalez-Fraile; Begoña Fernandez-Ruanova; Gennaro Cordasco; Anna Esposito; Kristin Beck Gjellesvik; Anna Torp Johansen; Maria Stylianou Kornes; Colin Pickard; Cornelius Glackin; Gary Cahalane; Pau Buch; Cristina Palmero; Sergio Escalera; Olga Gordeeva; Olivier Deroo; Anaïs Fernandez; Daria Kyslitska; Jose Antonio Lozano; Maria Ines Torres; Stephan Schlogl | ||||
Title | The EMPATHIC Virtual Coach: a demo | Type | Conference Article | ||
Year | 2021 | Publication | 23rd ACM International Conference on Multimodal Interaction | Abbreviated Journal | |
Volume | Issue | Pages | 848-851 | ||
Keywords | |||||
Abstract | The main objective of the EMPATHIC project has been the design and development of a virtual coach to engage the healthy-senior user and to enhance well-being through awareness of personal status. The EMPATHIC approach addresses this objective through multimodal interactions supported by the GROW coaching model. The paper summarizes the main components of the EMPATHIC Virtual Coach (EMPATHIC-VC) and introduces a demonstration of the coaching sessions in selected scenarios. | ||||
Address | Virtual; October 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICMI | ||
Notes | HUPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ OVB2021 | Serial | 3644 | ||
Permanent link to this record | |||||
Author | Javad Zolfaghari Bengar; Joost Van de Weijer; Bartlomiej Twardowski; Bogdan Raducanu | ||||
Title | Reducing Label Effort: Self-Supervised Meets Active Learning | Type | Conference Article | ||
Year | 2021 | Publication | International Conference on Computer Vision Workshops | Abbreviated Journal | |
Volume | Issue | Pages | 1631-1639 | ||
Keywords | |||||
Abstract | Active learning is a paradigm aimed at reducing the annotation effort by training the model on actively selected informative and/or representative samples. Another paradigm to reduce the annotation effort is self-training, which learns from a large amount of unlabeled data in an unsupervised way and fine-tunes on a few labeled samples. Recent developments in self-training have achieved very impressive results rivaling supervised learning on some datasets. The current work focuses on whether the two paradigms can benefit from each other. We studied object recognition datasets including CIFAR10, CIFAR100 and Tiny ImageNet with several labeling budgets for the evaluations. Our experiments reveal that self-training is remarkably more efficient than active learning at reducing the labeling effort, that for a low labeling budget, active learning offers no benefit to self-training, and finally that the combination of active learning and self-training is fruitful when the labeling budget is high. The performance gap between active learning trained either with self-training or from scratch diminishes as we approach the point where almost half of the dataset is labeled. | ||||
Address | October 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICCVW | ||
Notes | LAMP; | Approved | no | ||
Call Number | Admin @ si @ ZVT2021 | Serial | 3672 | ||
Permanent link to this record | |||||
Author | Javad Zolfaghari Bengar; Bogdan Raducanu; Joost Van de Weijer | ||||
Title | When Deep Learners Change Their Mind: Learning Dynamics for Active Learning | Type | Conference Article | ||
Year | 2021 | Publication | 19th International Conference on Computer Analysis of Images and Patterns | Abbreviated Journal | |
Volume | 13052 | Issue | 1 | Pages | 403-413 |
Keywords | |||||
Abstract | Active learning aims to select samples to be annotated that yield the largest performance improvement for the learning algorithm. Many methods approach this problem by measuring the informativeness of samples and do this based on the certainty of the network predictions for samples. However, it is well-known that neural networks are overly confident about their prediction and are therefore an untrustworthy source to assess sample informativeness. In this paper, we propose a new informativeness-based active learning method. Our measure is derived from the learning dynamics of a neural network. More precisely we track the label assignment of the unlabeled data pool during the training of the algorithm. We capture the learning dynamics with a metric called label-dispersion, which is low when the network consistently assigns the same label to the sample during the training of the network and high when the assigned label changes frequently. We show that label-dispersion is a promising predictor of the uncertainty of the network, and show on two benchmark datasets that an active learning algorithm based on label-dispersion obtains excellent results. | ||||
Address | September 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CAIP | ||
Notes | LAMP; | Approved | no | ||
Call Number | Admin @ si @ ZRV2021 | Serial | 3673 | ||
Permanent link to this record | |||||
Author | Javad Zolfaghari Bengar | ||||
Title | Reducing Label Effort with Deep Active Learning | Type | Book Whole | ||
Year | 2021 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Deep convolutional neural networks (CNNs) have achieved superior performance in many visual recognition applications, such as image classification, detection and segmentation. Training deep CNNs requires huge amounts of labeled data, which is expensive and labor intensive to collect. Active learning is a paradigm aimed at reducing the annotation effort by training the model on actively selected informative and/or representative samples. In this thesis we study several aspects of active learning including video object detection for autonomous driving systems, image classification on balanced and imbalanced datasets and the incorporation of self-supervised learning in active learning. We briefly describe our approach in each of these areas to reduce the labeling effort. In chapter two we introduce a novel active learning approach for object detection in videos by exploiting temporal coherence. Our criterion is based on the estimated number of errors in terms of false positives and false negatives. Additionally, we introduce a synthetic video dataset, called SYNTHIA-AL, specially designed to evaluate active learning for video object detection in road scenes. Finally, we show that our approach outperforms active learning baselines tested on two outdoor datasets. In the next chapter we address the well-known problem of overconfidence in neural networks. As an alternative to network confidence, we propose a new informativeness-based active learning method that captures the learning dynamics of a neural network with a metric called label-dispersion. This metric is low when the network consistently assigns the same label to the sample during the course of training and high when the assigned label changes frequently. We show that label-dispersion is a promising predictor of the uncertainty of the network, and show on two benchmark datasets that an active learning algorithm based on label-dispersion obtains excellent results. In chapter four, we tackle the problem of sampling bias in active learning methods on imbalanced datasets. Active learning is generally studied on balanced datasets where an equal number of images per class is available. However, real-world datasets suffer from severely imbalanced classes, the so-called long-tail distribution.
We argue that this further complicates the active learning process, since the imbalanced data pool can result in suboptimal classifiers. To address this problem in the context of active learning, we propose a general optimization framework that explicitly takes class-balancing into account. Results on three datasets show that the method is general (it can be combined with most existing active learning algorithms) and can be effectively applied to boost the performance of both informative and representative-based active learning methods. In addition, we show that our method generally yields a performance gain on balanced datasets as well. Another paradigm to reduce the annotation effort is self-training, which learns from a large amount of unlabeled data in an unsupervised way and fine-tunes on a few labeled samples. Recent advancements in self-training have achieved very impressive results rivaling supervised learning on some datasets. In the last chapter we focus on whether active learning and self-supervised learning can benefit from each other. We study object recognition datasets with several labeling budgets for the evaluations. Our experiments reveal that self-training is remarkably more efficient than active learning at reducing the labeling effort, that for a low labeling budget, active learning offers no benefit to self-training, and finally that the combination of active learning and self-training is fruitful when the labeling budget is high. |
||||
Address | December 2021 | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | IMPRIMA | Place of Publication | Editor | Joost Van de Weijer;Bogdan Raducanu | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-122714-9-2 | Medium | ||
Area | Expedition | Conference | |||
Notes | LAMP; | Approved | no | ||
Call Number | Admin @ si @ Zol2021 | Serial | 3609 | ||
Permanent link to this record | |||||
Author | Idoia Ruiz; Lorenzo Porzi; Samuel Rota Bulo; Peter Kontschieder; Joan Serrat | ||||
Title | Weakly Supervised Multi-Object Tracking and Segmentation | Type | Conference Article | ||
Year | 2021 | Publication | IEEE Winter Conference on Applications of Computer Vision Workshops | Abbreviated Journal | |
Volume | Issue | Pages | 125-133 | ||
Keywords | |||||
Abstract | We introduce the problem of weakly supervised Multi-Object Tracking and Segmentation, i.e. joint weakly supervised instance segmentation and multi-object tracking, in which we do not provide any kind of mask annotation.
To address it, we design a novel synergistic training strategy by taking advantage of multi-task learning, i.e. classification and tracking tasks guide the training of the unsupervised instance segmentation. For that purpose, we extract weak foreground localization information, provided by Grad-CAM heatmaps, to generate a partial ground truth to learn from. Additionally, RGB image level information is employed to refine the mask prediction at the edges of the objects. We evaluate our method on KITTI MOTS, the most representative benchmark for this task, reducing the performance gap on the MOTSP metric between the fully supervised and weakly supervised approach to just 12% and 12.7% for cars and pedestrians, respectively. |
||||
Address | Virtual; January 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACVW | ||
Notes | ADAS; 600.118; 600.124 | Approved | no | ||
Call Number | Admin @ si @ RPR2021 | Serial | 3548 | ||
Permanent link to this record | |||||
Author | Hugo Bertiche; Meysam Madadi; Sergio Escalera | ||||
Title | Deep Parametric Surfaces for 3D Outfit Reconstruction from Single View Image | Type | Conference Article | ||
Year | 2021 | Publication | 16th IEEE International Conference on Automatic Face and Gesture Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 1-8 | ||
Keywords | |||||
Abstract | We present a methodology to retrieve analytical surfaces parametrized as a neural network. Previous works on 3D reconstruction yield point clouds, voxelized objects or meshes. Instead, our approach yields 2-manifolds in Euclidean space through deep learning. To this end, we implement a novel formulation for fully connected layers as parametrized manifolds that allows continuous predictions with differential geometry. Based on this property we propose a novel smoothness loss. Results on the CLOTH3D++ dataset show the possibility to infer different topologies and the benefits of the smoothness term based on differential geometry. | ||||
Address | Virtual; December 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | FG | ||
Notes | HUPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ BME2021 | Serial | 3640 | ||
Permanent link to this record |