Publicacions CVC -- Query Results

Records
Author	Francesc Net; Marc Folia; Pep Casals; Lluis Gomez
Title	Transductive Learning for Near-Duplicate Image Detection in Scanned Photo Collections			Type	Conference Article
Year	2023	Publication	17th International Conference on Document Analysis and Recognition	Abbreviated Journal
Volume	14191	Issue		Pages	3-17
Keywords	Image deduplication; Near-duplicate images detection; Transductive Learning; Photographic Archives; Deep Learning
Abstract	This paper presents a comparative study of near-duplicate image detection techniques in a real-world use case scenario, where a document management company is commissioned to manually annotate a collection of scanned photographs. Detecting duplicate and near-duplicate photographs can reduce the time spent on manual annotation by archivists. This real use case differs from laboratory settings as the deployment dataset is available in advance, allowing the use of transductive learning. We propose a transductive learning approach that leverages state-of-the-art deep learning architectures such as convolutional neural networks (CNNs) and Vision Transformers (ViTs). Our approach involves pre-training a deep neural network on a large dataset and then fine-tuning the network on the unlabeled target collection with self-supervised learning. The results show that the proposed approach outperforms the baseline methods in the task of near-duplicate image detection on the UKBench dataset and an in-house private dataset.
Address	San Jose; CA; USA; August 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICDAR
Notes	DAG			Approved	no
Call Number	Admin @ si @ NFC2023			Serial	3859



Author	Albert Tatjer; Bhalaji Nagarajan; Ricardo Marques; Petia Radeva
Title	CCLM: Class-Conditional Label Noise Modelling			Type	Conference Article
Year	2023	Publication	11th Iberian Conference on Pattern Recognition and Image Analysis	Abbreviated Journal
Volume	14062	Issue		Pages	3-14
Keywords
Abstract	The performance of deep neural networks highly depends on the quality and volume of the training data. However, cost-effective labelling processes such as crowdsourcing and web crawling often lead to data with noisy (i.e., wrong) labels. Making models robust to this label noise is thus of prime importance. A common approach is using loss distributions to model the label noise. However, the robustness of these methods highly depends on the accuracy of the division of the training set into clean and noisy samples. In this work, we dive into this research direction, highlighting the existing problem of treating this distribution globally, and propose a class-conditional approach to split the clean and noisy samples. We apply our approach to the popular DivideMix algorithm and show how the local treatment fares better with respect to the global treatment of loss distribution. We validate our hypothesis on two popular benchmark datasets and show substantial improvements over the baseline experiments. We further analyze the effectiveness of the proposal using two different metrics – Noise Division Accuracy and Classiness.
Address	Alicante; Spain; June 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	IbPRIA
Notes	MILAB			Approved	no
Call Number	Admin @ si @ TNM2023			Serial	3925



Author	Mohamed Ali Souibgui; Pau Torras; Jialuo Chen; Alicia Fornes
Title	An Evaluation of Handwritten Text Recognition Methods for Historical Ciphered Manuscripts			Type	Conference Article
Year	2023	Publication	7th International Workshop on Historical Document Imaging and Processing	Abbreviated Journal
Volume		Issue		Pages	7-12
Keywords
Abstract	This paper investigates the effectiveness of different deep learning HTR families, including LSTM, Seq2Seq, and transformer-based approaches with self-supervised pretraining, in recognizing ciphered manuscripts from different historical periods and cultures. The goal is to identify the most suitable method or training techniques for recognizing ciphered manuscripts and to provide insights into the challenges and opportunities in this field of research. We evaluate the performance of these models on several datasets of ciphered manuscripts and discuss their results. This study contributes to the development of more accurate and efficient methods for recognizing historical manuscripts for the preservation and dissemination of our cultural heritage.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	HIP
Notes	DAG			Approved	no
Call Number	Admin @ si @ STC2023			Serial	3849



Author	Jun Wan; Guodong Guo; Sergio Escalera; Hugo Jair Escalante; Stan Z Li
Title	Face Presentation Attack Detection (PAD) Challenges			Type	Book Chapter
Year	2023	Publication	Advances in Face Presentation Attack Detection	Abbreviated Journal
Volume		Issue		Pages	17–35
Keywords
Abstract	In recent years, the security of face recognition systems has been increasingly threatened. Face Anti-spoofing (FAS) is essential to secure face recognition systems primarily from various attacks. In order to attract researchers and push forward the state of the art in Face Presentation Attack Detection (PAD), we organized three editions of Face Anti-spoofing Workshop and Competition at CVPR 2019, CVPR 2020, and ICCV 2021, which have attracted more than 800 teams from academia and industry, and greatly promoted the algorithms to overcome many challenging problems. In this chapter, we introduce the detailed competition process, including the challenge phases, timeline and evaluation metrics. Along with the workshop, we will introduce the corresponding dataset for each competition including data acquisition details, data processing, statistics, and evaluation protocol. Finally, we provide the available link to download the datasets used in the challenges.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	SLCV
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HUPBA			Approved	no
Call Number	Admin @ si @ WGE2023b			Serial	3956



Author	Luca Ginanni Corradini; Simone Balocco; Luciano Maresca; Silvio Vitale; Matteo Stefanini
Title	Anatomical Modifications After Stent Implantation: A Comparative Analysis Between CGuard, Wallstent, and Roadsaver Carotid Stents			Type	Journal Article
Year	2023	Publication	Journal of Endovascular Therapy	Abbreviated Journal
Volume	30	Issue	1	Pages	18-24
Keywords
Abstract	Purpose: Carotid revascularization can be associated with modifications of the vascular geometry, which may lead to complications. The changes in vessel angulation before and after a carotid WallStent (WS) implantation are compared against 2 new dual-layer devices, CGuard (CG) and RoadSaver (RS). Materials and Methods: The study prospectively recruited 217 consecutive patients (112 CG, 73 WS, and 32 RS, respectively). Angiography projections were explored and the one having a higher arterial angle was selected as a basal view. After stent implantation, a stent control angiography was performed selecting the projection having the maximal angle. The same procedure was followed for all 3 stent types to guarantee comparable conditions. The angulation changes on the stented segments were quantified from both angiographies. The statistical analysis quantitatively compared the pre- and post-angles for the 3 stent types. The results are qualitatively illustrated using boxplots. Finally, the relation between pre- and post-angle measurements is analyzed using linear regression. Results: For CG, no statistical difference in the axial vessel geometry between the basal and postprocedural angles was found. For WS and RS, a statistical difference was found between pre- and post-angles. The regression analysis shows that CG induces lower changes from the original curvature with respect to WS and RS. Conclusion: Based on our results, CG induces smaller changes to the basal morphology than the WS and RS stents. Hence, CG better respects the native vessel anatomy than the other stents. Level of Evidence: Level 4, Case Series.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	xxx			Approved	no
Call Number	Admin @ si @ GBM2023			Serial	4006



Author	Gisel Bastidas-Guacho; Patricio Moreno; Boris X. Vintimilla; Angel Sappa
Title	Application on the Loop of Multimodal Image Fusion: Trends on Deep-Learning Based Approaches			Type	Conference Article
Year	2023	Publication	13th International Conference on Pattern Recognition Systems	Abbreviated Journal
Volume	14234	Issue		Pages	25–36
Keywords
Abstract	Multimodal image fusion allows the combination of information from different modalities, which is useful for tasks such as object detection, edge detection, and tracking, to name a few. Using the fused representation for applications results in better task performance. There are several image fusion approaches, which have been summarized in surveys. However, the existing surveys focus on image fusion approaches where the application on the loop of multimodal image fusion is not considered. On the contrary, this study summarizes deep learning-based multimodal image fusion for computer vision (e.g., object detection) and image processing applications (e.g., semantic segmentation), that is, approaches where the application module leverages the multimodal fusion process to enhance the final result. Firstly, we introduce image fusion and the existing general frameworks for image fusion tasks such as multifocus, multiexposure and multimodal. Then, we describe the multimodal image fusion approaches. Next, we review the state-of-the-art deep learning multimodal image fusion approaches for vision applications. Finally, we conclude our survey with the trends of task-driven multimodal image fusion.
Address	Guayaquil; Ecuador; July 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICPRS
Notes	MSIAU			Approved	no
Call Number	Admin @ si @ BMV2023			Serial	3932



Author	Christian Keilstrup Ingwersen; Artur Xarles; Albert Clapes; Meysam Madadi; Janus Nortoft Jensen; Morten Rieger Hannemose; Anders Bjorholm Dahl; Sergio Escalera
Title	Video-based Skill Assessment for Golf: Estimating Golf Handicap			Type	Conference Article
Year	2023	Publication	Proceedings of the 6th International Workshop on Multimedia Content Analysis in Sports	Abbreviated Journal
Volume		Issue		Pages	31-39
Keywords
Abstract	Automated skill assessment in sports using video-based analysis holds great potential for revolutionizing coaching methodologies. This paper focuses on the problem of skill determination in golfers by leveraging deep learning models applied to a large database of video recordings of golf swings. We investigate different regression, ranking and classification based methods and compare to a simple baseline approach. The performance is evaluated using mean squared error (MSE) as well as computing the percentages of correctly ranked pairs based on the Kendall correlation. Our results demonstrate an improvement over the baseline, with a 35% lower mean squared error and 68% correctly ranked pairs. However, achieving fine-grained skill assessment remains challenging. This work contributes to the development of AI-driven coaching systems and advances the understanding of video-based skill determination in the context of golf.
Address	Ottawa; Canada; October 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	MMSports
Notes	HUPBA			Approved	no
Call Number	Admin @ si @ KXC2023			Serial	3929



Author	Yael Tudela; Ana Garcia Rodriguez; Gloria Fernandez Esparrach; Jorge Bernal
Title	Towards Fine-Grained Polyp Segmentation and Classification			Type	Conference Article
Year	2023	Publication	Workshop on Clinical Image-Based Procedures	Abbreviated Journal
Volume	14242	Issue		Pages	32-42
Keywords	Medical image segmentation; Colorectal Cancer; Vision Transformer; Classification
Abstract	Colorectal cancer is one of the main causes of cancer death worldwide. Colonoscopy is the gold standard screening tool as it allows lesion detection and removal during the same procedure. During the last decades, several efforts have been made to develop CAD systems to assist clinicians in lesion detection and classification. Regarding the latter, and in order to be used in the exploration room as part of resect and discard or leave-in-situ strategies, these systems must correctly identify all the different lesion types. This is a challenging task, as the data used to train these systems presents great inter-class similarity, high class imbalance, and low representation of clinically relevant histology classes such as serrated sessile adenomas. In this paper, a new polyp segmentation and classification method, Swin-Expand, is introduced. Based on Swin-Transformer, it uses a simple and lightweight decoder. The performance of this method has been assessed on a novel dataset, comprising 1126 high-definition images representing the three main histological classes. Results show a clear improvement in both segmentation and classification performance, also achieving competitive results when tested on public datasets. These results confirm that both the method and the data are important to obtain more accurate polyp representations.
Address	Vancouver; October 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	MICCAIW
Notes	ISE			Approved	no
Call Number	Admin @ si @ TGF2023			Serial	3837



Author	Jun Wan; Guodong Guo; Sergio Escalera; Hugo Jair Escalante; Stan Z Li
Title	Best Solutions Proposed in the Context of the Face Anti-spoofing Challenge Series			Type	Book Chapter
Year	2023	Publication	Advances in Face Presentation Attack Detection	Abbreviated Journal
Volume		Issue		Pages	37–78
Keywords
Abstract	The PAD competitions we organized attracted more than 835 teams from home and abroad, most of them from the industry, which shows that the topic of face anti-spoofing is closely related to daily life, and there is an urgent need for advanced algorithms to solve its application needs. Specifically, the Chalearn LAP multi-modal face anti-spoofing attack detection challenge attracted more than 300 teams for the development phase with a total of 13 teams qualifying for the final round; the Chalearn Face Anti-spoofing Attack Detection Challenge attracted 340 teams in the development stage, and finally, 11 and 8 teams submitted their code in the single-modal and multi-modal face anti-spoofing recognition challenges, respectively; the 3D High-Fidelity Mask Face Presentation Attack Detection Challenge attracted 195 teams for the development phase with a total of 18 teams qualifying for the final round. All the results were verified and re-run by the organizing team, and the results were used for the final ranking. In this chapter, we briefly describe the methods developed by the teams participating in each competition, and present the algorithms of the top-three ranked teams in detail.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HUPBA			Approved	no
Call Number	Admin @ si @ WGE2023d			Serial	3958



Author	Alejandro Ariza-Casabona; Bartlomiej Twardowski; Tri Kurniawan Wijaya
Title	Exploiting Graph Structured Cross-Domain Representation for Multi-domain Recommendation			Type	Conference Article
Year	2023	Publication	European Conference on Information Retrieval – ECIR 2023: Advances in Information Retrieval	Abbreviated Journal
Volume	13980	Issue		Pages	49–65
Keywords
Abstract	Multi-domain recommender systems benefit from cross-domain representation learning and positive knowledge transfer. Both can be achieved by introducing a specific modeling of input data (i.e. disjoint history) or trying dedicated training regimes. At the same time, treating domains as separate input sources becomes a limitation as it does not capture the interplay that naturally exists between domains. In this work, we efficiently learn multi-domain representation of sequential users’ interactions using graph neural networks. We use temporal intra- and inter-domain interactions as contextual information for our method called MAGRec (short for Multi-domAin Graph-based Recommender). To better capture all relations in a multi-domain setting, we learn two graph-based sequential representations simultaneously: domain-guided for recent user interest, and general for long-term interest. This approach helps to mitigate the negative knowledge transfer problem from multiple domains and improve overall representation. We perform experiments on publicly available datasets in different scenarios where MAGRec consistently outperforms state-of-the-art methods. Furthermore, we provide an ablation study and discuss further extensions of our method.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ECIR
Notes	LAMP			Approved	no
Call Number	Admin @ si @ ATK2023			Serial	3933



Author	Pau Torras; Mohamed Ali Souibgui; Sanket Biswas; Alicia Fornes
Title	Segmentation-Free Alignment of Arbitrary Symbol Transcripts to Images			Type	Conference Article
Year	2023	Publication	Document Analysis and Recognition – ICDAR 2023 Workshops	Abbreviated Journal
Volume	14193	Issue		Pages	83-93
Keywords	Historical Manuscripts; Symbol Alignment
Abstract	Developing arbitrary symbol recognition systems is a challenging endeavour. Even using content-agnostic architectures such as few-shot models, performance can be substantially improved by providing a number of well-annotated examples into training. In some contexts, transcripts of the symbols are available without any position information associated to them, which enables using line-level recognition architectures. A way of providing this position information to detection-based architectures is finding systems that can align the input symbols with the transcription. In this paper we discuss some symbol alignment techniques that are suitable for low-data scenarios and provide an insight on their perceived strengths and weaknesses. In particular, we study the usage of Connectionist Temporal Classification models, Attention-Based Sequence to Sequence models and we compare them with the results obtained on a few-shot recognition system.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICDAR
Notes	DAG			Approved	no
Call Number	Admin @ si @ TSS2023			Serial	3850



Author	Patricia Suarez; Henry Velesaca; Dario Carpio; Angel Sappa
Title	Corn kernel classification from few training samples			Type	Journal Article
Year	2023	Publication	Artificial Intelligence in Agriculture	Abbreviated Journal
Volume	9	Issue		Pages	89-99
Keywords
Abstract	This article presents an efficient approach to classify a set of corn kernels in contact, which may contain good or defective kernels along with impurities. The proposed approach consists of two stages. The first is a next-generation segmentation network, trained on a set of synthesized images, that is applied to divide the given image into a set of individual instances. An ad-hoc lightweight CNN architecture is then proposed to classify each instance into one of three categories (i.e., good, defective, and impurities). The segmentation network is trained using a strategy that avoids the time-consuming and human-error-prone task of manual data annotation. Regarding the classification stage, the proposed ad-hoc network is designed with only a few sets of layers to result in a lightweight architecture capable of being used in integrated solutions. Experimental results and comparisons with previous approaches, showing both the improvement in accuracy and the reduction in time, are provided. Finally, the proposed segmentation and classification approach can be easily adapted for use with other cereal types.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MSIAU			Approved	no
Call Number	Admin @ si @ SVC2023			Serial	3892



Author	Artur Xarles; Sergio Escalera; Thomas B. Moeslund; Albert Clapes
Title	ASTRA: An Action Spotting TRAnsformer for Soccer Videos			Type	Conference Article
Year	2023	Publication	Proceedings of the 6th International Workshop on Multimedia Content Analysis in Sports	Abbreviated Journal
Volume		Issue		Pages	93–102
Keywords
Abstract	In this paper, we introduce ASTRA, a Transformer-based model designed for the task of Action Spotting in soccer matches. ASTRA addresses several challenges inherent in the task and dataset, including the requirement for precise action localization, the presence of a long-tail data distribution, non-visibility in certain actions, and inherent label noise. To do so, ASTRA incorporates (a) a Transformer encoder-decoder architecture to achieve the desired output temporal resolution and to produce precise predictions, (b) a balanced mixup strategy to handle the long-tail distribution of the data, (c) an uncertainty-aware displacement head to capture the label variability, and (d) input audio signal to enhance detection of non-visible actions. Results demonstrate the effectiveness of ASTRA, achieving a tight Average-mAP of 66.82 on the test set. Moreover, in the SoccerNet 2023 Action Spotting challenge, we secure the 3rd position with an Average-mAP of 70.21 on the challenge set.
Address	Ottawa; Canada; October 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	MMSports
Notes	HUPBA			Approved	no
Call Number	Admin @ si @ XEM2023			Serial	3970



Author	Sergi Garcia Bordils; Dimosthenis Karatzas; Marçal Rusiñol
Title	Accelerating Transformer-Based Scene Text Detection and Recognition via Token Pruning			Type	Conference Article
Year	2023	Publication	17th International Conference on Document Analysis and Recognition	Abbreviated Journal
Volume	14192	Issue		Pages	106-121
Keywords	Scene Text Detection; Scene Text Recognition; Transformer Acceleration
Abstract	Scene text detection and recognition is a crucial task in computer vision with numerous real-world applications. Transformer-based approaches are behind all current state-of-the-art models and have achieved excellent performance. However, the computational requirements of the transformer architecture make training these methods slow and resource heavy. In this paper, we introduce a new token pruning strategy that significantly decreases training and inference times without sacrificing performance, striking a balance between accuracy and speed. We have applied this pruning technique to our own end-to-end transformer-based scene text understanding architecture. Our method uses a separate detection branch to guide the pruning of uninformative image features, which significantly reduces the number of tokens at the input of the transformer. Experimental results show how our network is able to obtain competitive results on multiple public benchmarks while running at significantly higher speeds.
Address	San Jose; CA; USA; August 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICDAR
Notes	DAG			Approved	no
Call Number	Admin @ si @ GKR2023a			Serial	3907



Author	Patricia Suarez; Angel Sappa
Title	Toward a Thermal Image-Like Representation			Type	Conference Article
Year	2023	Publication	Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications	Abbreviated Journal
Volume		Issue		Pages	133-140
Keywords
Abstract	This paper proposes a novel model to obtain thermal image-like representations to be used as an input in any thermal image compressive sensing approach (e.g., thermal image filtering, enhancing, super-resolution). Thermal images offer interesting information about the objects in the scene, in addition to their temperature. Unfortunately, in most cases thermal cameras acquire low resolution/quality images. Hence, in order to improve these images, there are several state-of-the-art approaches that exploit complementary information from a low-cost channel (visible image) to increase the image quality of an expensive channel (infrared image). In these SOTA approaches, visible images are fused at different levels without paying attention to the fact that the images capture information at different bands of the spectrum. In this paper, a novel approach is proposed to generate thermal image-like representations from low-cost visible images, by means of a contrastive cycled GAN network. The obtained representations (synthetic thermal images) can later be used to improve the low quality thermal image of the same scene. Experimental results on different datasets are presented.
Address	Lisboa; Portugal; February 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	VISIGRAPP
Notes	MSIAU			Approved	no
Call Number	Admin @ si @ SuS2023b			Serial	3927