toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links (down)
Author Santi Puch; Irina Sanchez; Aura Hernandez-Sabate; Gemma Piella; Vesna Prckovska edit   pdf
url  openurl
  Title Global Planar Convolutions for Improved Context Aggregation in Brain Tumor Segmentation Type Conference Article
  Year 2018 Publication International MICCAI Brainlesion Workshop Abbreviated Journal  
  Volume 11384 Issue Pages 393-405  
  Keywords Brain tumors; 3D fully-convolutional CNN; Magnetic resonance imaging; Global planar convolution  
  Abstract In this work, we introduce the Global Planar Convolution module as a building-block for fully-convolutional networks that aggregates global information and, therefore, enhances the context perception capabilities of segmentation networks in the context of brain tumor segmentation. We implement two baseline architectures (3D UNet and a residual version of 3D UNet, ResUNet) and present a novel architecture based on these two architectures, ContextNet, that includes the proposed Global Planar Convolution module. We show that the addition of such module eliminates the need of building networks with several representation levels, which tend to be over-parametrized and to showcase slow rates of convergence. Furthermore, we provide a visual demonstration of the behavior of GPC modules via visualization of intermediate representations. We finally participate in the 2018 edition of the BraTS challenge with our best performing models, that are based on ContextNet, and report the evaluation scores on the validation and the test sets of the challenge.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference MICCAIW  
  Notes ADAS; 600.118 Approved no  
  Call Number Admin @ si @ PSH2018 Serial 3251  
Permanent link to this record
 

 
Author Raul Gomez; Lluis Gomez; Jaume Gibert; Dimosthenis Karatzas edit   pdf
url  openurl
  Title Learning from# Barcelona Instagram data what Locals and Tourists post about its Neighbourhoods Type Conference Article
  Year 2018 Publication 15th European Conference on Computer Vision Workshops Abbreviated Journal  
  Volume 11134 Issue Pages 530-544  
  Keywords  
  Abstract Massive tourism is becoming a big problem for some cities, such as Barcelona, due to its concentration in some neighborhoods. In this work we gather Instagram data related to Barcelona consisting on images-captions pairs and, using the text as a supervisory signal, we learn relations between images, words and neighborhoods. Our goal is to learn which visual elements appear in photos when people is posting about each neighborhood. We perform a language separate treatment of the data and show that it can be extrapolated to a tourists and locals separate analysis, and that tourism is reflected in Social Media at a neighborhood level. The presented pipeline allows analyzing the differences between the images that tourists and locals associate to the different neighborhoods. The proposed method, which can be extended to other cities or subjects, proves that Instagram data can be used to train multi-modal (image and text) machine learning models that are useful to analyze publications about a city at a neighborhood level. We publish the collected dataset, InstaBarcelona and the code used in the analysis.  
  Address Munich; Alemanya; September 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECCVW  
  Notes DAG; 600.129; 601.338; 600.121 Approved no  
  Call Number Admin @ si @ GGG2018b Serial 3176  
Permanent link to this record
 

 
Author Raul Gomez; Lluis Gomez; Jaume Gibert; Dimosthenis Karatzas edit   pdf
url  openurl
  Title Learning to Learn from Web Data through Deep Semantic Embeddings Type Conference Article
  Year 2018 Publication 15th European Conference on Computer Vision Workshops Abbreviated Journal  
  Volume 11134 Issue Pages 514-529  
  Keywords  
  Abstract In this paper we propose to learn a multimodal image and text embedding from Web and Social Media data, aiming to leverage the semantic knowledge learnt in the text domain and transfer it to a visual model for semantic image retrieval. We demonstrate that the pipeline can learn from images with associated text without supervision and perform a thourough analysis of five different text embeddings in three different benchmarks. We show that the embeddings learnt with Web and Social Media data have competitive performances over supervised methods in the text based image retrieval task, and we clearly outperform state of the art in the MIRFlickr dataset when training in the target data. Further we demonstrate how semantic multimodal image retrieval can be performed using the learnt embeddings, going beyond classical instance-level retrieval problems. Finally, we present a new dataset, InstaCities1M, composed by Instagram images and their associated texts that can be used for fair comparison of image-text embeddings.  
  Address Munich; Alemanya; September 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECCVW  
  Notes DAG; 600.129; 601.338; 600.121 Approved no  
  Call Number Admin @ si @ GGG2018a Serial 3175  
Permanent link to this record
 

 
Author Esmitt Ramirez; Carles Sanchez; Agnes Borras; Marta Diez-Ferrer; Antoni Rosell; Debora Gil edit   pdf
url  openurl
  Title Image-Based Bronchial Anatomy Codification for Biopsy Guiding in Video Bronchoscopy Type Conference Article
  Year 2018 Publication OR 2.0 Context-Aware Operating Theaters, Computer Assisted Robotic Endoscopy, Clinical Image-Based Procedures, and Skin Image Analysis Abbreviated Journal  
  Volume 11041 Issue Pages  
  Keywords Biopsy guiding; Bronchoscopy; Lung biopsy; Intervention guiding; Airway codification  
  Abstract Bronchoscopy examinations allow biopsy of pulmonary nodules with minimum risk for the patient. Even for experienced bronchoscopists, it is difficult to guide the bronchoscope to most distal lesions and obtain an accurate diagnosis. This paper presents an image-based codification of the bronchial anatomy for bronchoscopy biopsy guiding. The 3D anatomy of each patient is codified as a binary tree with nodes representing bronchial levels and edges labeled using their position on images projecting the 3D anatomy from a set of branching points. The paths from the root to leaves provide a codification of navigation routes with spatially consistent labels according to the anatomy observes in video bronchoscopy explorations. We evaluate our labeling approach as a guiding system in terms of the number of bronchial levels correctly codified, also in the number of labels-based instructions correctly supplied, using generalized mixed models and computer-generated data. Results obtained for three independent observers prove the consistency and reproducibility of our guiding system. We trust that our codification based on viewer’s projection might be used as a foundation for the navigation process in Virtual Bronchoscopy systems.  
  Address Granada; September 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference MICCAIW  
  Notes IAM; 600.096; 600.075; 601.323; 600.145 Approved no  
  Call Number Admin @ si @ RSB2018b Serial 3137  
Permanent link to this record
 

 
Author Stefan Schurischuster; Beatriz Remeseiro; Petia Radeva; Martin Kampel edit  url
openurl 
  Title A Preliminary Study of Image Analysis for Parasite Detection on Honey Bees Type Conference Article
  Year 2018 Publication 15th International Conference on Image Analysis and Recognition Abbreviated Journal  
  Volume 10882 Issue Pages 465-473  
  Keywords  
  Abstract Varroa destructor is a parasite harming bee colonies. As the worldwide bee population is in danger, beekeepers as well as researchers are looking for methods to monitor the health of bee hives. In this context, we present a preliminary study to detect parasites on bee videos by means of image analysis and machine learning techniques. For this purpose, each video frame is analyzed individually to extract bee image patches, which are then processed to compute image descriptors and finally classified into mite and no mite bees. The experimental results demonstrated the adequacy of the proposed method, which will be a perfect stepping stone for a further bee monitoring system.  
  Address Povoa de Varzim; Portugal; June 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICIAR  
  Notes MILAB; no proj Approved no  
  Call Number Admin @ si @ SRR2018a Serial 3110  
Permanent link to this record
 

 
Author Joan Codina-Filba; Sergio Escalera; Joan Escudero; Coen Antens; Pau Buch-Cardona; Mireia Farrus edit  url
openurl 
  Title Mobile eHealth Platform for Home Monitoring of Bipolar Disorder Type Conference Article
  Year 2021 Publication 27th ACM International Conference on Multimedia Modeling Abbreviated Journal  
  Volume 12573 Issue Pages 330-341  
  Keywords  
  Abstract People suffering Bipolar Disorder (BD) experiment changes in mood status having depressive or manic episodes with normal periods in the middle. BD is a chronic disease with a high level of non-adherence to medication that needs a continuous monitoring of patients to detect when they relapse in an episode, so that physicians can take care of them. Here we present MoodRecord, an easy-to-use, non-intrusive, multilingual, robust and scalable platform suitable for home monitoring patients with BD, that allows physicians and relatives to track the patient state and get alarms when abnormalities occur.

MoodRecord takes advantage of the capabilities of smartphones as a communication and recording device to do a continuous monitoring of patients. It automatically records user activity, and asks the user to answer some questions or to record himself in video, according to a predefined plan designed by physicians. The video is analysed, recognising the mood status from images and bipolar assessment scores are extracted from speech parameters. The data obtained from the different sources are merged periodically to observe if a relapse may start and if so, raise the corresponding alarm. The application got a positive evaluation in a pilot with users from three different countries. During the pilot, the predictions of the voice and image modules showed a coherent correlation with the diagnosis performed by clinicians.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference MMM  
  Notes HUPBA; no proj Approved no  
  Call Number Admin @ si @ CEE2021 Serial 3659  
Permanent link to this record
 

 
Author Estefania Talavera; Nicolai Petkov; Petia Radeva edit   pdf
url  doi
openurl 
  Title Unsupervised Routine Discovery in Egocentric Photo-Streams Type Conference Article
  Year 2019 Publication 18th International Conference on Computer Analysis of Images and Patterns Abbreviated Journal  
  Volume 11678 Issue Pages 576-588  
  Keywords Routine discovery; Lifestyle; Egocentric vision; Behaviour analysis  
  Abstract The routine of a person is defined by the occurrence of activities throughout different days, and can directly affect the person’s health. In this work, we address the recognition of routine related days. To do so, we rely on egocentric images, which are recorded by a wearable camera and allow to monitor the life of the user from a first-person view perspective. We propose an unsupervised model that identifies routine related days, following an outlier detection approach. We test the proposed framework over a total of 72 days in the form of photo-streams covering around 2 weeks of the life of 5 different camera wearers. Our model achieves an average of 76% Accuracy and 68% Weighted F-Score for all the users. Thus, we show that our framework is able to recognise routine related days and opens the door to the understanding of the behaviour of people.  
  Address Salermo; Italy; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CAIP  
  Notes MILAB; no proj Approved no  
  Call Number Admin @ si @ TPR2019a Serial 3367  
Permanent link to this record
 

 
Author Simone Balocco; Mauricio Gonzalez; Ricardo Ñancule; Petia Radeva; Gabriel Thomas edit  url
openurl 
  Title Calcified Plaque Detection in IVUS Sequences: Preliminary Results Using Convolutional Nets Type Conference Article
  Year 2018 Publication International Workshop on Artificial Intelligence and Pattern Recognition Abbreviated Journal  
  Volume 11047 Issue Pages 34-42  
  Keywords Intravascular ultrasound images; Convolutional nets; Deep learning; Medical image analysis  
  Abstract The manual inspection of intravascular ultrasound (IVUS) images to detect clinically relevant patterns is a difficult and laborious task performed routinely by physicians. In this paper, we present a framework based on convolutional nets for the quick selection of IVUS frames containing arterial calcification, a pattern whose detection plays a vital role in the diagnosis of atherosclerosis. Preliminary experiments on a dataset acquired from eighty patients show that convolutional architectures improve detections of a shallow classifier in terms of 𝐹1-measure, precision and recall.  
  Address Cuba; September 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference IWAIPR  
  Notes MILAB; no menciona Approved no  
  Call Number Admin @ si @ BGÑ2018 Serial 3237  
Permanent link to this record
 

 
Author Md. Mostafa Kamal Sarker; Hatem A. Rashwan; Farhan Akram; Syeda Furruka Banu; Adel Saleh; Vivek Kumar Singh; Forhad U. H. Chowdhury; Saddam Abdulwahab; Santiago Romani; Petia Radeva; Domenec Puig edit   pdf
url  openurl
  Title SLSDeep: Skin Lesion Segmentation Based on Dilated Residual and Pyramid Pooling Networks. Type Conference Article
  Year 2018 Publication 21st International Conference on Medical Image Computing & Computer Assisted Intervention Abbreviated Journal  
  Volume 2 Issue Pages 21-29  
  Keywords  
  Abstract Skin lesion segmentation (SLS) in dermoscopic images is a crucial task for automated diagnosis of melanoma. In this paper, we present a robust deep learning SLS model, so-called SLSDeep, which is represented as an encoder-decoder network. The encoder network is constructed by dilated residual layers, in turn, a pyramid pooling network followed by three convolution layers is used for the decoder. Unlike the traditional methods employing a cross-entropy loss, we investigated a loss function by combining both Negative Log Likelihood (NLL) and End Point Error (EPE) to accurately segment the melanoma regions with sharp boundaries. The robustness of the proposed model was evaluated on two public databases: ISBI 2016 and 2017 for skin lesion analysis towards melanoma detection challenge. The proposed model outperforms the state-of-the-art methods in terms of segmentation accuracy. Moreover, it is capable to segment more than 100 images of size 384x384 per second on a recent GPU.  
  Address Granada; Espanya; September 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference MICCAI  
  Notes MILAB; no proj Approved no  
  Call Number Admin @ si @ SRA2018 Serial 3112  
Permanent link to this record
 

 
Author Asma Bensalah; Jialuo Chen; Alicia Fornes; Cristina Carmona_Duarte; Josep Llados; Miguel A. Ferrer edit   pdf
url  openurl
  Title Towards Stroke Patients' Upper-limb Automatic Motor Assessment Using Smartwatches. Type Conference Article
  Year 2020 Publication International Workshop on Artificial Intelligence for Healthcare Applications Abbreviated Journal  
  Volume 12661 Issue Pages 476-489  
  Keywords  
  Abstract Assessing the physical condition in rehabilitation scenarios is a challenging problem, since it involves Human Activity Recognition (HAR) and kinematic analysis methods. In addition, the difficulties increase in unconstrained rehabilitation scenarios, which are much closer to the real use cases. In particular, our aim is to design an upper-limb assessment pipeline for stroke patients using smartwatches. We focus on the HAR task, as it is the first part of the assessing pipeline. Our main target is to automatically detect and recognize four key movements inspired by the Fugl-Meyer assessment scale, which are performed in both constrained and unconstrained scenarios. In addition to the application protocol and dataset, we propose two detection and classification baseline methods. We believe that the proposed framework, dataset and baseline results will serve to foster this research field.  
  Address Virtual; January 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPRW  
  Notes DAG; 600.121; 600.140; Approved no  
  Call Number Admin @ si @ BCF2020 Serial 3508  
Permanent link to this record
 

 
Author Vacit Oguz Yazici; Joost Van de Weijer; Longlong Yu edit   pdf
url  doi
openurl 
  Title Visual Transformers with Primal Object Queries for Multi-Label Image Classification Type Conference Article
  Year 2022 Publication 26th International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Multi-label image classification is about predicting a set of class labels that can be considered as orderless sequential data. Transformers process the sequential data as a whole, therefore they are inherently good at set prediction. The first vision-based transformer model, which was proposed for the object detection task introduced the concept of object queries. Object queries are learnable positional encodings that are used by attention modules in decoder layers to decode the object classes or bounding boxes using the region of interests in an image. However, inputting the same set of object queries to different decoder layers hinders the training: it results in lower performance and delays convergence. In this paper, we propose the usage of primal object queries that are only provided at the start of the transformer decoder stack. In addition, we improve the mixup technique proposed for multi-label classification. The proposed transformer model with primal object queries improves the state-of-the-art class wise F1 metric by 2.1% and 1.8%; and speeds up the convergence by 79.0% and 38.6% on MS-COCO and NUS-WIDE datasets respectively.  
  Address Montreal; Quebec; Canada; August 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes LAMP; 600.147; 601.309 Approved no  
  Call Number Admin @ si @ YWY2022 Serial 3786  
Permanent link to this record
 

 
Author Ayan Banerjee; Palaiahnakote Shivakumara; Parikshit Acharya; Umapada Pal; Josep Llados edit  url
doi  openurl
  Title TWD: A New Deep E2E Model for Text Watermark Detection in Video Images Type Conference Article
  Year 2022 Publication 26th International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages  
  Keywords Deep learning; U-Net; FCENet; Scene text detection; Video text detection; Watermark text detection  
  Abstract Text watermark detection in video images is challenging because text watermark characteristics are different from caption and scene texts in the video images. Developing a successful model for detecting text watermark, caption, and scene texts is an open challenge. This study aims at developing a new Deep End-to-End model for Text Watermark Detection (TWD), caption and scene text in video images. To standardize non-uniform contrast, quality, and resolution, we explore the U-Net3+ model for enhancing poor quality text without affecting high-quality text. Similarly, to address the challenges of arbitrary orientation, text shapes and complex background, we explore Stacked Hourglass Encoded Fourier Contour Embedding Network (SFCENet) by feeding the output of the U-Net3+ model as input. Furthermore, the proposed work integrates enhancement and detection models as an end-to-end model for detecting multi-type text in video images. To validate the proposed model, we create our own dataset (named TW-866), which provides video images containing text watermark, caption (subtitles), as well as scene text. The proposed model is also evaluated on standard natural scene text detection datasets, namely, ICDAR 2019 MLT, CTW1500, Total-Text, and DAST1500. The results show that the proposed method outperforms the existing methods. This is the first work on text watermark detection in video images to the best of our knowledge  
  Address Montreal; Quebec; Canada; August 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes DAG; Approved no  
  Call Number Admin @ si @ BSA2022 Serial 3788  
Permanent link to this record
 

 
Author Aitor Alvarez-Gila; Joost Van de Weijer; Yaxing Wang; Estibaliz Garrote edit   pdf
url  doi
openurl 
  Title MVMO: A Multi-Object Dataset for Wide Baseline Multi-View Semantic Segmentation Type Conference Article
  Year 2022 Publication 29th IEEE International Conference on Image Processing Abbreviated Journal  
  Volume Issue Pages  
  Keywords multi-view; cross-view; semantic segmentation; synthetic dataset  
  Abstract We present MVMO (Multi-View, Multi-Object dataset): a synthetic dataset of 116,000 scenes containing randomly placed objects of 10 distinct classes and captured from 25 camera locations in the upper hemisphere. MVMO comprises photorealistic, path-traced image renders, together with semantic segmentation ground truth for every view. Unlike existing multi-view datasets, MVMO features wide baselines between cameras and high density of objects, which lead to large disparities, heavy occlusions and view-dependent object appearance. Single view semantic segmentation is hindered by self and inter-object occlusions that could benefit from additional viewpoints. Therefore, we expect that MVMO will propel research in multi-view semantic segmentation and cross-view semantic transfer. We also provide baselines that show that new research is needed in such fields to exploit the complementary information of multi-view setups 1 .  
  Address Bordeaux; France; October2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICIP  
  Notes LAMP Approved no  
  Call Number Admin @ si @ AWW2022 Serial 3781  
Permanent link to this record
 

 
Author Guillem Martinez; Maya Aghaei; Martin Dijkstra; Bhalaji Nagarajan; Femke Jaarsma; Jaap van de Loosdrecht; Petia Radeva; Klaas Dijkstra edit   pdf
url  doi
openurl 
  Title Hyper-Spectral Imaging for Overlapping Plastic Flakes Segmentation Type Conference Article
  Year 2022 Publication 47th International Conference on Acoustics, Speech, and Signal Processing Abbreviated Journal  
  Volume Issue Pages  
  Keywords Hyper-spectral imaging; plastic sorting; multi-label segmentation; bitfield encoding  
  Abstract In this paper, we propose a deformable convolution-based generative adversarial network (DCNGAN) for perceptual quality enhancement of compressed videos. DCNGAN is also adaptive to the quantization parameters (QPs). Compared with optical flows, deformable convolutions are more effective and efficient to align frames. Deformable convolutions can operate on multiple frames, thus leveraging more temporal information, which is beneficial for enhancing the perceptual quality of compressed videos. Instead of aligning frames in a pairwise manner, the deformable convolution can process multiple frames simultaneously, which leads to lower computational complexity. Experimental results demonstrate that the proposed DCNGAN outperforms other state-of-the-art compressed video quality enhancement algorithms.  
  Address Singapore; May 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICASSP  
  Notes MILAB; no proj Approved no  
  Call Number Admin @ si @ MAD2022 Serial 3767  
Permanent link to this record
 

 
Author Chengyi Zou; Shuai Wan; Marta Mrak; Marc Gorriz Blanch; Luis Herranz; Tiannan Ji edit  url
doi  openurl
  Title Towards Lightweight Neural Network-based Chroma Intra Prediction for Video Coding Type Conference Article
  Year 2022 Publication 29th IEEE International Conference on Image Processing Abbreviated Journal  
  Volume Issue Pages  
  Keywords Video coding; Quantization (signal); Computational modeling; Neural networks; Predictive models; Video compression; Syntactics  
  Abstract In video compression the luma channel can be useful for predicting chroma channels (Cb, Cr), as has been demonstrated with the Cross-Component Linear Model (CCLM) used in Versatile Video Coding (VVC) standard. More recently, it has been shown that neural networks can even better capture the relationship among different channels. In this paper, a new attention-based neural network is proposed for cross-component intra prediction. With the goal to simplify neural network design, the new framework consists of four branches: boundary branch and luma branch for extracting features from reference samples, attention branch for fusing the first two branches, and prediction branch for computing the predicted chroma samples. The proposed scheme is integrated into VVC test model together with one additional binary block-level syntax flag which indicates whether a given block makes use of the proposed method. Experimental results demonstrate 0.31%/2.36%/2.00% BD-rate reductions on Y/Cb/Cr components, respectively, on top of the VVC Test Model (VTM) 7.0 which uses CCLM.  
  Address Bordeaux; France; October 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICIP  
  Notes MACO Approved no  
  Call Number Admin @ si @ ZWM2022 Serial 3790  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: