Records | |||||
---|---|---|---|---|---|
Author | Nil Ballus; Bhalaji Nagarajan; Petia Radeva | ||||
Title | Opt-SSL: An Enhanced Self-Supervised Framework for Food Recognition | Type | Conference Article | ||
Year | 2022 | Publication | 10th Iberian Conference on Pattern Recognition and Image Analysis | Abbreviated Journal | |
Volume | 13256 | Issue | Pages | ||
Keywords | Self-supervised; Contrastive learning; Food recognition | ||||
Abstract | Self-supervised learning has shown strong performance in several computer vision tasks. The popular contrastive methods make use of a Siamese architecture with different loss functions. In this work, we go deeper into two very recent state-of-the-art frameworks, namely SimSiam and Barlow Twins. Inspired by them, we propose a new self-supervised learning method, which we call Opt-SSL, that combines both image and feature contrasting. We validate the proposed method on the food recognition task, showing that our framework enables the self-learning networks to learn better visual representations. | ||||
Address | Aveiro; Portugal; May 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | IbPRIA | ||
Notes | MILAB; not mentioned | Approved | no | |
Call Number | Admin @ si @ BNR2022 | Serial | 3782 | ||
Permanent link to this record | |||||
Author | Yaxing Wang; Joost Van de Weijer; Lu Yu; Shangling Jui | ||||
Title | Distilling GANs with Style-Mixed Triplets for X2I Translation with Limited Data | Type | Conference Article | ||
Year | 2022 | Publication | 10th International Conference on Learning Representations | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Conditional image synthesis is an integral part of many X2I translation systems, including image-to-image, text-to-image and audio-to-image translation systems. Training these large systems generally requires huge amounts of training data. Therefore, we investigate knowledge distillation to transfer knowledge from a high-quality unconditional generative model (e.g., StyleGAN) to the conditional synthetic image generation modules of a variety of systems. To initialize the conditional and reference branches (from an unconditional GAN), we exploit the style-mixing characteristics of high-quality GANs to generate an infinite supply of style-mixed triplets for the knowledge distillation. Extensive experimental results on a number of image generation tasks (i.e., image-to-image, semantic segmentation-to-image, text-to-image and audio-to-image) demonstrate qualitatively and quantitatively that our method successfully transfers knowledge to the synthetic image generation modules, resulting in more realistic images than previous methods, as confirmed by a significant drop in FID. | ||||
Address | Virtual | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICLR | ||
Notes | LAMP; 600.147 | Approved | no | ||
Call Number | Admin @ si @ WWY2022 | Serial | 3791 | ||
Permanent link to this record | |||||
Author | Sergi Garcia Bordils; George Tom; Sangeeth Reddy; Minesh Mathew; Marçal Rusiñol; C.V. Jawahar; Dimosthenis Karatzas | ||||
Title | Read While You Drive - Multilingual Text Tracking on the Road | Type | Conference Article | |
Year | 2022 | Publication | 15th IAPR International Workshop on Document Analysis Systems | Abbreviated Journal | |
Volume | 13237 | Issue | Pages | 756–770 | |
Keywords | |||||
Abstract | Visual data obtained during driving scenarios usually contain large amounts of text that conveys semantic information necessary to analyse the urban environment and is integral to the traffic control plan. Yet, research on autonomous driving or driver assistance systems typically ignores this information. To advance research in this direction, we present RoadText-3K, a large driving video dataset with fully annotated text. RoadText-3K is three times bigger than its predecessor and contains data from varied geographical locations, unconstrained driving conditions, and multiple languages and scripts. We offer a comprehensive analysis of tracking-by-detection and detection-by-tracking methods, exploring the limits of state-of-the-art text detection. Finally, we propose a new end-to-end trainable tracking model that yields state-of-the-art results on this challenging dataset. Our experiments demonstrate the complexity and variability of RoadText-3K and establish a new, realistic benchmark for scene text tracking in the wild. | ||||
Address | La Rochelle; France; May 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-3-031-06554-5 | Medium | ||
Area | Expedition | Conference | DAS | ||
Notes | DAG; 600.155; 611.022; 611.004 | Approved | no | ||
Call Number | Admin @ si @ GTR2022 | Serial | 3783 | ||
Permanent link to this record | |||||
Author | Patricia Suarez; Dario Carpio; Angel Sappa; Henry Velesaca | ||||
Title | Transformer based Image Dehazing | Type | Conference Article | ||
Year | 2022 | Publication | 16th IEEE International Conference on Signal Image Technology & Internet Based System | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | atmospheric light; brightness component; computational cost; dehazing quality; haze-free image | ||||
Abstract | This paper presents a novel approach to remove non-homogeneous haze from real images. The proposed method consists mainly of image feature extraction, haze removal, and image reconstruction. To accomplish this challenging task, we propose an architecture based on transformers, which have been recently introduced and have shown great potential in different computer vision tasks. Our model builds on SwinIR, a transformer-based image restoration architecture, but modifies the deep feature extraction module and the depth level of the model, and applies a combined loss function that improves styling and adapts the model to the non-homogeneous haze removal present in images. The obtained results prove superior to those obtained by state-of-the-art models. | ||||
Address | Dijon; France; October 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | SITIS | ||
Notes | MSIAU; no project | Approved | no | |
Call Number | Admin @ si @ SCS2022 | Serial | 3803 | ||
Permanent link to this record | |||||
Author | Angel Sappa; Patricia Suarez; Henry Velesaca; Dario Carpio | ||||
Title | Domain Adaptation in Image Dehazing: Exploring the Usage of Images from Virtual Scenarios | Type | Conference Article | ||
Year | 2022 | Publication | 16th International Conference on Computer Graphics, Visualization, Computer Vision and Image Processing | Abbreviated Journal | |
Volume | Issue | Pages | 85-92 | ||
Keywords | Domain adaptation; Synthetic hazed dataset; Dehazing | ||||
Abstract | This work presents a novel domain adaptation strategy for deep learning-based approaches to the image dehazing problem. First, a large set of synthetic images is generated using a realistic 3D graphic simulator; these synthetic images contain different densities of haze and are used to train a model that is later adapted to any real scenario. The adaptation process requires just a few images to fine-tune the model parameters. The proposed strategy overcomes the limitation of training a given model with few images; in other words, it adapts a haze removal model trained on synthetic images to real scenarios. It should be noted that it is quite difficult, if not impossible, to obtain large sets of pairs of real-world images (with and without haze) to train dehazing algorithms in a supervised way. Experimental results are provided showing the validity of the proposed domain adaptation strategy. | ||||
Address | Lisboa; Portugal; July 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CGVCVIP | ||
Notes | MSIAU; no proj | Approved | no | ||
Call Number | Admin @ si @ SSV2022 | Serial | 3804 | ||
Permanent link to this record | |||||
Author | Andrea Gemelli; Sanket Biswas; Enrico Civitelli; Josep Llados; Simone Marinai | ||||
Title | Doc2Graph: A Task Agnostic Document Understanding Framework Based on Graph Neural Networks | Type | Conference Article | ||
Year | 2022 | Publication | 17th European Conference on Computer Vision Workshops | Abbreviated Journal | |
Volume | 13804 | Issue | Pages | 329–344 | |
Keywords | |||||
Abstract | Geometric Deep Learning has recently attracted significant interest in a wide range of machine learning fields, including document analysis. The application of Graph Neural Networks (GNNs) has become crucial in various document-related tasks, since they can unravel important structural patterns that are fundamental in key information extraction processes. Previous works in the literature propose task-driven models and do not take into account the full power of graphs. We propose Doc2Graph, a task-agnostic document understanding framework based on a GNN model, to solve different tasks given different types of documents. We evaluated our approach on two challenging datasets for key information extraction in form understanding, invoice layout analysis, and table detection. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-3-031-25068-2 | Medium | ||
Area | Expedition | Conference | ECCV-TiE | ||
Notes | DAG; 600.162; 600.140; 110.312 | Approved | no | ||
Call Number | Admin @ si @ GBC2022 | Serial | 3795 | ||
Permanent link to this record | |||||
Author | Jorge Charco; Angel Sappa; Boris X. Vintimilla | ||||
Title | Human Pose Estimation through a Novel Multi-view Scheme | Type | Conference Article | ||
Year | 2022 | Publication | 17th International Conference on Computer Vision Theory and Applications (VISAPP 2022) | Abbreviated Journal | |
Volume | 5 | Issue | Pages | 855-862 | |
Keywords | Multi-view Scheme; Human Pose Estimation; Relative Camera Pose; Monocular Approach | ||||
Abstract | This paper presents a multi-view scheme to tackle the challenging problem of self-occlusion in human pose estimation. The proposed approach first obtains the human body joints from a set of images captured from different views at the same time. It then enhances the obtained joints using a multi-view scheme: the joints from a given view are used to enhance poorly estimated joints from another view, especially to tackle self-occlusion cases. A network architecture initially proposed for the monocular case is adapted for use in the proposed multi-view scheme. Experimental results and comparisons with state-of-the-art approaches on the Human3.6m dataset are presented, showing improvements in the accuracy of body joint estimation. | ||||
Address | Online; Feb 6-8, 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 2184-4321 | ISBN | 978-989-758-555-5 | Medium | |
Area | Expedition | Conference | VISAPP | ||
Notes | MSIAU; 600.160 | Approved | no | ||
Call Number | Admin @ si @ CSV2022 | Serial | 3689 | ||
Permanent link to this record | |||||
Author | Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla | ||||
Title | Multi-Image Super-Resolution for Thermal Images | Type | Conference Article | ||
Year | 2022 | Publication | 17th International Conference on Computer Vision Theory and Applications (VISAPP 2022) | Abbreviated Journal | |
Volume | 4 | Issue | Pages | 635-642 | |
Keywords | Thermal Images; Multi-view; Multi-frame; Super-Resolution; Deep Learning; Attention Block | ||||
Abstract | This paper proposes a novel CNN architecture for the multi-thermal image super-resolution problem. In the proposed scheme, the multiple images are synthetically generated by downsampling and slightly shifting the given image; noise is also added to each of these synthesized images. The proposed architecture uses two attention-block paths to extract high-frequency details, taking advantage of the large amount of information extracted from multiple images of the same scene. Experimental results are provided, showing that the proposed scheme outperforms state-of-the-art approaches. | ||||
Address | Online; Feb 6-8, 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | VISAPP | ||
Notes | MSIAU; 601.349 | Approved | no | ||
Call Number | Admin @ si @ RSV2022a | Serial | 3690 | ||
Permanent link to this record | |||||
Author | Bhalaji Nagarajan; Ricardo Marques; Marcos Mejia; Petia Radeva | ||||
Title | Class-conditional Importance Weighting for Deep Learning with Noisy Labels | Type | Conference Article | ||
Year | 2022 | Publication | 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications | Abbreviated Journal | |
Volume | 5 | Issue | Pages | 679-686 | |
Keywords | Noisy Labeling; Loss Correction; Class-conditional Importance Weighting; Learning with Noisy Labels | ||||
Abstract | Large-scale accurate labels are essential for training Deep Neural Networks and ensuring high performance. However, creating a clean dataset is very expensive, since it usually relies on human annotation. For this reason, the labelling process is made cheap at the cost of noisy labels. Learning with Noisy Labels is an active and very challenging area of research. Recent advances in self-supervised learning and robust loss functions have helped advance noisy-label research. In this paper, we propose a loss correction method that relies on dynamic weights computed based on the model training. We extend the existing Contrast to Divide algorithm coupled with DivideMix using a new class-conditional weighting scheme. We validate the method using standard noise experiments and achieve encouraging results. | ||||
Address | Virtual; February 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | VISAPP | ||
Notes | MILAB; not mentioned | Approved | no | |
Call Number | Admin @ si @ NMM2022 | Serial | 3798 | ||
Permanent link to this record | |||||
Author | Patricia Suarez; Angel Sappa; Dario Carpio; Henry Velesaca; Francisca Burgos; Patricia Urdiales | ||||
Title | Deep Learning Based Shrimp Classification | Type | Conference Article | ||
Year | 2022 | Publication | 17th International Symposium on Visual Computing | Abbreviated Journal | |
Volume | 13598 | Issue | Pages | 36–45 | |
Keywords | Pigmentation; Color space; Light weight network | ||||
Abstract | This work proposes a novel deep learning approach to classify shrimp (Penaeus vannamei) into two classes according to the pigmentation levels accepted in shrimp commerce. The main goal of this study is to support the shrimp industry in terms of pricing and processing. An efficient CNN architecture is proposed to perform image classification through a program that could be deployed either on mobile devices or on fixed supports in the shrimp supply chain. The proposed approach is a lightweight model that uses HSV color space shrimp images. A simple pipeline shows the most important stages performed to determine the pigmentation pattern that identifies the class to which each shrimp belongs. For the experiments, a database of shrimp images acquired with mobile devices of various brands and models has been used. The results obtained with images in the RGB and HSV color spaces allow testing the effectiveness of the proposed model. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ISVC | ||
Notes | MSIAU; no proj | Approved | no | ||
Call Number | Admin @ si @ SAC2022 | Serial | 3772 | ||
Permanent link to this record | |||||
Author | Mohamed Ali Souibgui; Sanket Biswas; Sana Khamekhem Jemni; Yousri Kessentini; Alicia Fornes; Josep Llados; Umapada Pal | ||||
Title | DocEnTr: An End-to-End Document Image Enhancement Transformer | Type | Conference Article | ||
Year | 2022 | Publication | 26th International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 1699-1705 | ||
Keywords | Degradation; Head; Optical character recognition; Self-supervised learning; Benchmark testing; Transformers; Magnetic heads | ||||
Abstract | Document images can be affected by many degradation scenarios, which cause recognition and processing difficulties. In this age of digitization, it is important to denoise them for proper usage. To address this challenge, we present a new encoder-decoder architecture based on vision transformers to enhance both machine-printed and handwritten document images in an end-to-end fashion. The encoder operates directly on the pixel patches with their positional information without the use of any convolutional layers, while the decoder reconstructs a clean image from the encoded patches. Conducted experiments show the superiority of the proposed model compared to state-of-the-art methods on several DIBCO benchmarks. Code and models will be publicly available at: https://github.com/dali92002/DocEnTR | ||||
Address | Montreal; Quebec; Canada; August 21-25, 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICPR | ||
Notes | DAG; 600.121; 600.162; 602.230; 600.140 | Approved | no | ||
Call Number | Admin @ si @ SBJ2022 | Serial | 3730 | ||
Permanent link to this record | |||||
Author | Carlos Boned Riera; Oriol Ramos Terrades | ||||
Title | Discriminative Neural Variational Model for Unbalanced Classification Tasks in Knowledge Graph | Type | Conference Article | ||
Year | 2022 | Publication | 26th International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 2186-2191 | ||
Keywords | Measurement; Couplings; Semantics; Ear; Benchmark testing; Data models; Pattern recognition | ||||
Abstract | Nowadays the paradigm of link discovery problems has shown significant improvements on Knowledge Graphs. However, method performance is harmed by the unbalanced nature of this classification problem, since many methods are easily biased towards not finding proper links. In this paper we present a discriminative neural variational auto-encoder model, called DNVAE from now on, in which we have introduced latent variables to serve as embedding vectors. As a result, the learnt generative model better approximates the underlying distribution and, at the same time, better differentiates the types of relations in the knowledge graph. We have evaluated this approach on a benchmark knowledge graph and on Census records. Results on this last dataset are quite impressive, since we reach the highest possible score in the evaluation metrics. However, further experiments are still needed to evaluate the performance of the method more deeply on more challenging tasks. | ||||
Address | Montreal; Quebec; Canada; August 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICPR | ||
Notes | DAG; 600.121; 600.162 | Approved | no | ||
Call Number | Admin @ si @ BoR2022 | Serial | 3741 | ||
Permanent link to this record | |||||
Author | Vacit Oguz Yazici; Joost Van de Weijer; Longlong Yu | ||||
Title | Visual Transformers with Primal Object Queries for Multi-Label Image Classification | Type | Conference Article | ||
Year | 2022 | Publication | 26th International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Multi-label image classification is about predicting a set of class labels that can be considered as orderless sequential data. Transformers process sequential data as a whole, and are therefore inherently good at set prediction. The first vision-based transformer model, which was proposed for the object detection task, introduced the concept of object queries. Object queries are learnable positional encodings used by attention modules in decoder layers to decode the object classes or bounding boxes using the regions of interest in an image. However, inputting the same set of object queries to different decoder layers hinders the training: it results in lower performance and delays convergence. In this paper, we propose the usage of primal object queries that are provided only at the start of the transformer decoder stack. In addition, we improve the mixup technique proposed for multi-label classification. The proposed transformer model with primal object queries improves the state-of-the-art class-wise F1 metric by 2.1% and 1.8%, and speeds up convergence by 79.0% and 38.6%, on the MS-COCO and NUS-WIDE datasets respectively. | ||||
Address | Montreal; Quebec; Canada; August 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICPR | ||
Notes | LAMP; 600.147; 601.309 | Approved | no | ||
Call Number | Admin @ si @ YWY2022 | Serial | 3786 | ||
Permanent link to this record | |||||
Author | Ayan Banerjee; Palaiahnakote Shivakumara; Parikshit Acharya; Umapada Pal; Josep Llados | ||||
Title | TWD: A New Deep E2E Model for Text Watermark Detection in Video Images | Type | Conference Article | ||
Year | 2022 | Publication | 26th International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | Deep learning; U-Net; FCENet; Scene text detection; Video text detection; Watermark text detection | ||||
Abstract | Text watermark detection in video images is challenging because text watermark characteristics differ from those of caption and scene texts in video images. Developing a single successful model for detecting text watermark, caption, and scene texts is an open challenge. This study aims at developing a new deep end-to-end model for Text Watermark Detection (TWD), caption and scene text in video images. To standardize non-uniform contrast, quality, and resolution, we explore the U-Net3+ model for enhancing poor-quality text without affecting high-quality text. Similarly, to address the challenges of arbitrary orientation, text shapes and complex backgrounds, we explore the Stacked Hourglass Encoded Fourier Contour Embedding Network (SFCENet), feeding the output of the U-Net3+ model as its input. Furthermore, the proposed work integrates the enhancement and detection models into an end-to-end model for detecting multi-type text in video images. To validate the proposed model, we create our own dataset (named TW-866), which provides video images containing text watermark, caption (subtitles), as well as scene text. The proposed model is also evaluated on standard natural scene text detection datasets, namely ICDAR 2019 MLT, CTW1500, Total-Text, and DAST1500. The results show that the proposed method outperforms existing methods. To the best of our knowledge, this is the first work on text watermark detection in video images. | ||||
Address | Montreal; Quebec; Canada; August 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICPR | ||
Notes | DAG; | Approved | no | ||
Call Number | Admin @ si @ BSA2022 | Serial | 3788 | ||
Permanent link to this record | |||||
Author | Aitor Alvarez-Gila; Joost Van de Weijer; Yaxing Wang; Estibaliz Garrote | ||||
Title | MVMO: A Multi-Object Dataset for Wide Baseline Multi-View Semantic Segmentation | Type | Conference Article | ||
Year | 2022 | Publication | 29th IEEE International Conference on Image Processing | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | multi-view; cross-view; semantic segmentation; synthetic dataset | ||||
Abstract | We present MVMO (Multi-View, Multi-Object dataset): a synthetic dataset of 116,000 scenes containing randomly placed objects of 10 distinct classes, captured from 25 camera locations in the upper hemisphere. MVMO comprises photorealistic, path-traced image renders, together with semantic segmentation ground truth for every view. Unlike existing multi-view datasets, MVMO features wide baselines between cameras and a high density of objects, which lead to large disparities, heavy occlusions and view-dependent object appearance. Single-view semantic segmentation is hindered by self- and inter-object occlusions that could be resolved with additional viewpoints. Therefore, we expect that MVMO will propel research in multi-view semantic segmentation and cross-view semantic transfer. We also provide baselines that show that new research is needed in such fields to exploit the complementary information of multi-view setups. | ||||
Address | Bordeaux; France; October 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICIP | ||
Notes | LAMP | Approved | no | ||
Call Number | Admin @ si @ AWW2022 | Serial | 3781 | ||
Permanent link to this record |