Publicacions CVC -- Query Results

Records
Author	Souhail Bakkali; Zuheng Ming; Mickael Coustaty; Marçal Rusiñol; Oriol Ramos Terrades
Title	VLCDoC: Vision-Language Contrastive Pre-Training Model for Cross-Modal Document Classification			Type	Journal Article
Year	2023	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	139	Issue		Pages	109419
Keywords
Abstract	Multimodal learning from document data has achieved great success lately, as it allows pre-training semantically meaningful features as a prior for a learnable downstream approach. In this paper, we approach the document classification problem by learning cross-modal representations through language and vision cues, considering intra- and inter-modality relationships. Instead of merging features from different modalities into a common representation space, the proposed method exploits high-level interactions and learns relevant semantic information from effective attention flows within and across modalities. The proposed learning objective is devised between intra- and inter-modality alignment tasks, where the similarity distribution per task is computed by contracting positive sample pairs while simultaneously contrasting negative ones in the common feature representation space. Extensive experiments on public document classification datasets demonstrate the effectiveness and the generalization capacity of our model on both low-scale and large-scale datasets.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	ISSN 0031-3203	ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.140; 600.121			Approved	no
Call Number	Admin @ si @ BMC2023			Serial	3826



Author	Albin Soutif; Antonio Carta; Joost Van de Weijer
Title	Improving Online Continual Learning Performance and Stability with Temporal Ensembles			Type	Conference Article
Year	2023	Publication	2nd Conference on Lifelong Learning Agents	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Neural networks are very effective when trained on large datasets for a large number of iterations. However, when they are trained on non-stationary streams of data and in an online fashion, their performance is reduced (1) by the online setup, which limits the availability of data, and (2) by catastrophic forgetting caused by the non-stationary nature of the data. Furthermore, several recent works (Caccia et al., 2022; Lange et al., 2023, arXiv:2205.13452) showed that replay methods used in continual learning suffer from the stability gap, encountered when evaluating the model continually (rather than only at task boundaries). In this article, we study the effect of model ensembling as a way to improve performance and stability in online continual learning. We notice that naively ensembling models coming from a variety of training tasks increases the performance in online continual learning considerably. Starting from this observation, and drawing inspiration from semi-supervised learning ensembling methods, we use a lightweight temporal ensemble that computes the exponential moving average (EMA) of the weights at test time, and show that it can drastically increase the performance and stability when used in combination with several methods from the literature.
Address	Montreal; Canada; August 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	COLLAS
Notes	LAMP			Approved	no
Call Number	Admin @ si @ SCW2023			Serial	3922



Author	Chengyi Zou; Shuai Wan; Tiannan Ji; Marc Gorriz Blanch; Marta Mrak; Luis Herranz
Title	Chroma Intra Prediction with Lightweight Attention-Based Neural Networks			Type	Journal Article
Year	2023	Publication	IEEE Transactions on Circuits and Systems for Video Technology	Abbreviated Journal	TCSVT
Volume	34	Issue	1	Pages	549 - 560
Keywords
Abstract	Neural networks can be successfully used for cross-component prediction in video coding. In particular, attention-based architectures are suitable for chroma intra prediction using luma information because of their capability to model relations between different channels. However, the complexity of such methods is still very high and should be further reduced, especially for decoding. In this paper, a cost-effective attention-based neural network is designed for chroma intra prediction. Moreover, with the goal of further improving coding performance, a novel approach is introduced to utilize more boundary information effectively. In addition to improving prediction, a simplification methodology is also proposed to reduce inference complexity by simplifying convolutions. The proposed schemes are integrated into the H.266/Versatile Video Coding (VVC) pipeline, and only one additional binary block-level syntax flag is introduced to indicate whether a given block makes use of the proposed method. Experimental results demonstrate that the proposed scheme achieves up to −0.46%/−2.29%/−2.17% BD-rate reduction on Y/Cb/Cr components, respectively, compared with the H.266/VVC anchor. Reductions in the encoding and decoding complexity of up to 22% and 61%, respectively, are achieved by the proposed scheme with respect to the previous attention-based chroma intra prediction method while maintaining coding performance.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MACO; LAMP			Approved	no
Call Number	Admin @ si @ ZWJ2023			Serial	3875



Author	Roger Max Calle Quispe; Maya Aghaei Gavari; Eduardo Aguilar Torres
Title	Towards real-time accurate safety helmets detection through a deep learning-based method			Type	Journal
Year	2023	Publication	Ingeniare. Revista chilena de ingenieria	Abbreviated Journal
Volume	31	Issue	12	Pages
Keywords
Abstract	Occupational safety is a fundamental activity in industries and revolves around the management of the necessary controls that must be present to mitigate occupational risks. These controls include verifying the use of Personal Protection Equipment (PPE). Within PPE, safety helmets are vital to reducing severe or fatal consequences caused by head injuries. This problem has been addressed recently by various studies based on deep learning to detect the usage of safety helmets by people present in industrial settings. These works have achieved promising results for safety helmet detection using object detection methods from the YOLO family. In this work, we propose to analyze the performance of Scaled-YOLOv4, a novel model of the YOLO family that has not yet been studied for this problem. The performance of Scaled-YOLOv4 is evaluated on two public databases, carefully selected among the previously proposed datasets for the occupational safety framework. We demonstrate the superiority of Scaled-YOLOv4 in terms of mAP and F1-score over previous works for both databases. Further, we summarize the currently available datasets for safety helmet detection purposes and discuss their suitability.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MILAB			Approved	no
Call Number	Admin @ si @ CAA2023			Serial	3846



Author	Yi Xiao; Felipe Codevilla; Diego Porres; Antonio Lopez
Title	Scaling Vision-Based End-to-End Autonomous Driving with Multi-View Attention Learning			Type	Conference Article
Year	2023	Publication	International Conference on Intelligent Robots and Systems	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	In end-to-end driving, human driving demonstrations are used to train perception-based driving models by imitation learning. This process is supervised on vehicle signals (e.g., steering angle, acceleration) but does not require extra costly supervision (human labeling of sensor data). As a representative of such vision-based end-to-end driving models, CILRS is commonly used as a baseline to compare with new driving models. So far, some recent models achieve better performance than CILRS by using expensive sensor suites and/or by using large amounts of human-labeled data for training. Given the difference in performance, one may think that it is not worth pursuing vision-based pure end-to-end driving. However, we argue that this approach still has great value and potential considering cost and maintenance. In this paper, we present CIL++, which improves on CILRS by both processing higher-resolution images using a human-inspired HFOV as an inductive bias and incorporating a proper attention mechanism. CIL++ achieves competitive performance compared to models which are more costly to develop. We propose to replace CILRS with CIL++ as a strong vision-based pure end-to-end driving baseline supervised by only vehicle signals and trained by conditional imitation learning.
Address	Detroit; USA; October 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	IROS
Notes	ADAS			Approved	no
Call Number	Admin @ si @ XCP2023			Serial	3930



Author	Albin Soutif; Antonio Carta; Andrea Cossu; Julio Hurtado; Hamed Hemati; Vincenzo Lomonaco; Joost Van de Weijer
Title	A Comprehensive Empirical Evaluation on Online Continual Learning			Type	Conference Article
Year	2023	Publication	Visual Continual Learning (ICCV-W)	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Online continual learning aims to get closer to a live learning experience by learning directly on a stream of data with temporally shifting distribution and by storing a minimum amount of data from that stream. In this empirical evaluation, we evaluate various methods from the literature that tackle online continual learning. More specifically, we focus on the class-incremental setting in the context of image classification, where the learner must learn new classes incrementally from a stream of data. We compare these methods on the Split-CIFAR100 and Split-TinyImagenet benchmarks, and measure their average accuracy, forgetting, stability, and quality of the representations, to evaluate various aspects of the algorithm at the end of training but also during the whole training period. We find that most methods suffer from stability and underfitting issues. However, the learned representations are comparable to i.i.d. training under the same computational budget. No clear winner emerges from the results, and basic experience replay, when properly tuned and implemented, is a very strong baseline. We release our modular and extensible codebase, based on the Avalanche framework, to reproduce our results and encourage future research.
Address	Paris; France; October 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICCVW
Notes	LAMP			Approved	no
Call Number	Admin @ si @ SCC2023			Serial	3938



Author	Guillermo Torres; Debora Gil; Antoni Rosell; S. Mena; Carles Sanchez
Title	Virtual Radiomics Biopsy for the Histological Diagnosis of Pulmonary Nodules			Type	Conference Article
Year	2023	Publication	37th International Congress and Exhibition on Computer Assisted Radiology and Surgery	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Poster
Address	Munich; Germany; June 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	CARS
Notes	IAM			Approved	no
Call Number	Admin @ si @ TGR2023a			Serial	3950



Author	Sonia Baeza; Debora Gil; Carles Sanchez; Guillermo Torres; Ignasi Garcia Olive; Ignasi Guasch; Samuel Garcia Reina; Felipe Andreo; Jose Luis Mate; Jose Luis Vercher; Antonio Rosell
Title	Biopsia virtual radiomica para el diagnóstico histológico de nódulos pulmonares – Resultados intermedios del proyecto Radiolung			Type	Conference Article
Year	2023	Publication	SEPAR	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Poster
Address	Granada; Spain; June 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	SEPAR
Notes	IAM			Approved	no
Call Number	Admin @ si @ BGS2023			Serial	3951



Author	Debora Gil; Guillermo Torres; Carles Sanchez
Title	Transforming radiomic features into radiological words			Type	Conference Article
Year	2023	Publication	IEEE International Symposium on Biomedical Imaging	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Poster
Address	Cartagena de Indias; Colombia; April 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ISBI
Notes	IAM			Approved	no
Call Number	Admin @ si @ GTS2023			Serial	3952



Author	Guillermo Torres; Debora Gil; Antonio Rosell; Sonia Baeza; Carles Sanchez
Title	A radiomic biopsy for virtual histology of pulmonary nodules			Type	Conference Article
Year	2023	Publication	IEEE International Symposium on Biomedical Imaging	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Poster
Address	Cartagena de Indias; Colombia; April 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ISBI
Notes	IAM			Approved	no
Call Number	Admin @ si @ TGR2023b			Serial	3954



Author	Roberto Morales; Juan Quispe; Eduardo Aguilar
Title	Exploring multi-food detection using deep learning-based algorithms			Type	Conference Article
Year	2023	Publication	13th International Conference on Pattern Recognition Systems	Abbreviated Journal
Volume		Issue		Pages	1-7
Keywords
Abstract	People are becoming increasingly concerned about their diet, whether for disease prevention, medical treatment or other purposes. In meals served in restaurants, schools or public canteens, it is not easy to identify the ingredients and/or the nutritional information they contain. Currently, technological solutions based on deep learning models have facilitated the recording and tracking of food consumed based on the recognition of the main dish present in an image. Considering that sometimes there may be multiple foods served on the same plate, food analysis should be treated as a multi-class object detection problem. EfficientDet and YOLOv5 are object detection algorithms that have demonstrated high mAP and real-time performance on general domain data. However, these models have not been evaluated and compared on public food datasets. Unlike general domain objects, foods have more challenging features inherent in their nature that increase the complexity of detection. In this work, we evaluated the performance of EfficientDet and YOLOv5 on three public food datasets: UNIMIB2016, UECFood256 and ChileanFood64. The results show that YOLOv5 differs significantly from EfficientDet in terms of both mAP and response time on all datasets. Furthermore, YOLOv5 outperforms the state-of-the-art on UECFood256, achieving an improvement of more than 4% in terms of mAP@.50 over the best reported result.
Address	Guayaquil; Ecuador; July 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICPRS
Notes	MILAB			Approved	no
Call Number	Admin @ si @ MQA2023			Serial	3843



Author	Dipam Goswami; Yuyang Liu; Bartlomiej Twardowski; Joost Van de Weijer
Title	FeCAM: Exploiting the Heterogeneity of Class Distributions in Exemplar-Free Continual Learning			Type	Conference Article
Year	2023	Publication	37th Annual Conference on Neural Information Processing Systems	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Poster
Address	New Orleans; USA; December 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	NEURIPS
Notes	LAMP			Approved	no
Call Number	Admin @ si @ GLT2023			Serial	3934



Author	Kai Wang; Fei Yang; Shiqi Yang; Muhammad Atif Butt; Joost Van de Weijer
Title	Dynamic Prompt Learning: Addressing Cross-Attention Leakage for Text-Based Image Editing			Type	Conference Article
Year	2023	Publication	37th Annual Conference on Neural Information Processing Systems	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Poster
Address	New Orleans; USA; December 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	NEURIPS
Notes	LAMP			Approved	no
Call Number	Admin @ si @ WYY2023			Serial	3935



Author	Wenwen Yu; Mingyu Liu; Mingrui Chen; Ning Lu; Yinlong Wen; Yuliang Liu; Dimosthenis Karatzas; Xiang Bai
Title	ICDAR 2023 Competition on Reading the Seal Title			Type	Conference Article
Year	2023	Publication	17th International Conference on Document Analysis and Recognition	Abbreviated Journal
Volume	14188	Issue		Pages	522–535
Keywords
Abstract	Reading seal title text is a challenging task due to the variable shapes of seals, curved text, background noise, and overlapped text. However, this important element is commonly found in official and financial scenarios, and has not received the attention it deserves in the field of OCR technology. To promote research in this area, we organized the ICDAR 2023 competition on reading the seal title (ReST), which included two tasks: seal title text detection (Task 1) and end-to-end seal title recognition (Task 2). We constructed a dataset of 10,000 real seal samples, covering the most common classes of seals, and labeled all seal title texts with text polygons and text contents. The competition opened on 30th December, 2022 and closed on 20th March, 2023. It attracted 53 participants and received 135 submissions from academia and industry, including 28 participants and 72 submissions for Task 1, and 25 participants and 63 submissions for Task 2, which demonstrated significant interest in this challenging task. In this report, we present an overview of the competition, including the organization, challenges, and results. We describe the dataset and tasks, and summarize the submissions and evaluation results. The results show that significant progress has been made in the field of seal title text reading, and we hope that this competition will inspire further research and development in this important area of OCR technology.
Address	San Jose; CA; USA; August 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICDAR
Notes	DAG			Approved	no
Call Number	Admin @ si @ YLC2023			Serial	3897



Author	Senmao Li; Joost Van de Weijer; Yaxing Wang; Fahad Shahbaz Khan; Meiqin Liu; Jian Yang
Title	3D-Aware Multi-Class Image-to-Image Translation with NeRFs			Type	Conference Article
Year	2023	Publication	36th IEEE Conference on Computer Vision and Pattern Recognition	Abbreviated Journal
Volume		Issue		Pages	12652-12662
Keywords
Abstract	Recent advances in 3D-aware generative models (3D-aware GANs) combined with Neural Radiance Fields (NeRF) have achieved impressive results. However, no prior works investigate 3D-aware GANs for 3D consistent multiclass image-to-image (3D-aware I2I) translation. Naively using 2D-I2I translation methods suffers from unrealistic shape/identity change. To perform 3D-aware multiclass I2I translation, we decouple this learning process into a multiclass 3D-aware GAN step and a 3D-aware I2I translation step. In the first step, we propose two novel techniques: a new conditional architecture and an effective training strategy. In the second step, based on the well-trained multiclass 3D-aware GAN architecture that preserves view-consistency, we construct a 3D-aware I2I translation system. To further reduce the view-consistency problems, we propose several new techniques, including a U-net-like adaptor network design, a hierarchical representation constraint and a relative regularization loss. In extensive experiments on two datasets, quantitative and qualitative results demonstrate that we successfully perform 3D-aware I2I translation with multi-view consistency. Code is available in 3DI2I.
Address	Vancouver; Canada; June 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	CVPR
Notes	LAMP			Approved	no
Call Number	Admin @ si @ LWW2023b			Serial	3920