Records |
Author |
Patricia Suarez; Angel Sappa; Dario Carpio; Henry Velesaca; Francisca Burgos; Patricia Urdiales |
Title |
Deep Learning Based Shrimp Classification |
Type |
Conference Article |
Year |
2022 |
Publication |
17th International Symposium on Visual Computing |
Abbreviated Journal |
|
Volume |
13598 |
Issue |
|
Pages |
36–45 |
Keywords |
Pigmentation; Color space; Light weight network |
Abstract |
This work proposes a novel approach based on deep learning to address the classification of shrimp (Penaeus vannamei) into two classes, according to the level of pigmentation accepted in shrimp commerce. The main goal of this study is to support the shrimp industry in terms of price and process. An efficient CNN architecture is proposed to perform image classification through a program that could be deployed either on mobile devices or on fixed supports in the shrimp supply chain. The proposed approach is a lightweight model that uses HSV color space shrimp images. A simple pipeline shows the most important stages performed to determine a pattern that identifies the class to which each shrimp belongs based on its pigmentation. For the experiments, a database of shrimp images acquired with mobile devices of various brands and models has been used. The results obtained with images in the RGB and HSV color spaces allow for testing the effectiveness of the proposed model. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ISVC |
Notes |
MSIAU; no proj |
Approved |
no |
Call Number |
Admin @ si @ SAC2022 |
Serial |
3772 |
Permanent link to this record |
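The record above hinges on classifying shrimp images in the HSV color space. A minimal sketch of the RGB-to-HSV conversion step, using only the Python standard library (the `to_hsv_pixels` helper is illustrative, not the paper's preprocessing code):

```python
import colorsys

def to_hsv_pixels(rgb_pixels):
    """Convert a list of (r, g, b) tuples with channels in [0, 1] to HSV tuples."""
    return [colorsys.rgb_to_hsv(r, g, b) for (r, g, b) in rgb_pixels]

# Pure red maps to hue 0 with full saturation and value.
print(to_hsv_pixels([(1.0, 0.0, 0.0)]))  # [(0.0, 1.0, 1.0)]
```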
|
|
|
Author |
Alicia Fornes; Asma Bensalah; Cristina Carmona-Duarte; Jialuo Chen; Miguel A. Ferrer; Andreas Fischer; Josep Llados; Cristina Martin; Eloy Opisso; Rejean Plamondon; Anna Scius-Bertrand; Josep Maria Tormos |
Title |
The RPM3D Project: 3D Kinematics for Remote Patient Monitoring |
Type |
Conference Article |
Year |
2022 |
Publication |
Intertwining Graphonomics with Human Movements. 20th International Conference of the International Graphonomics Society, IGS 2022 |
Abbreviated Journal |
|
Volume |
13424 |
Issue |
|
Pages |
217-226 |
Keywords |
Healthcare applications; Kinematic; Theory of Rapid Human Movements; Human activity recognition; Stroke rehabilitation; 3D kinematics |
Abstract |
This project explores the feasibility of remote patient monitoring based on the analysis of 3D movements captured with smartwatches. We base our analysis on the Kinematic Theory of Rapid Human Movement. We have validated our research in a real case scenario for stroke rehabilitation at the Guttmann Institute (https://www.guttmann.com/en/), a neurorehabilitation hospital, showing promising results. Our work could have a great impact on remote healthcare applications, improving medical efficiency and reducing healthcare costs. Future steps include more clinical validation, developing multi-modal analysis architectures (analysing data from sensors, images, audio, etc.), and exploring the application of our technology to monitor other neurodegenerative diseases. |
Address |
June 7-9, 2022, Las Palmas de Gran Canaria, Spain |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
IGS |
Notes |
DAG; 600.121; 600.162; 602.230; 600.140 |
Approved |
no |
Call Number |
Admin @ si @ FBC2022 |
Serial |
3739 |
Permanent link to this record |
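The Kinematic Theory of Rapid Human Movements cited in this record models the speed profile of each rapid stroke as a lognormal. A hedged sketch of that profile (the parameter values are illustrative, not taken from the paper):

```python
import math

def lognormal_speed(t, D=1.0, t0=0.0, mu=-1.0, sigma=0.3):
    """Sigma-lognormal speed profile of one rapid stroke:
    v(t) = D / (sigma*sqrt(2*pi)*(t - t0)) * exp(-(ln(t - t0) - mu)^2 / (2*sigma^2))
    with D the stroke amplitude, t0 the onset time, and mu, sigma the
    log-time delay and response parameters (values here are illustrative)."""
    if t <= t0:
        return 0.0
    x = t - t0
    return (D / (sigma * math.sqrt(2.0 * math.pi) * x)
            * math.exp(-((math.log(x) - mu) ** 2) / (2.0 * sigma ** 2)))

# The speed profile integrates (approximately, via a Riemann sum) to D.
dt = 0.001
area = sum(lognormal_speed(i * dt) * dt for i in range(1, 10001))
```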
|
|
|
Author |
Smriti Joshi; Richard Osuala; Carlos Martin-Isla; Victor M. Campello; Carla Sendra-Balcells; Karim Lekadir; Sergio Escalera |
Title |
nn-UNet Training on CycleGAN-Translated Images for Cross-modal Domain Adaptation in Biomedical Imaging |
Type |
Conference Article |
Year |
2022 |
Publication |
International MICCAI Brainlesion Workshop |
Abbreviated Journal |
|
Volume |
12963 |
Issue |
|
Pages |
540–551 |
Keywords |
Domain adaptation; Vestibular schwannoma (VS); Deep learning; nn-UNet; CycleGAN |
Abstract |
In recent years, deep learning models have considerably advanced the performance of segmentation tasks on Brain Magnetic Resonance Imaging (MRI). However, these models show a considerable performance drop when they are evaluated on unseen data from a different distribution. Since annotation is often a hard and costly task requiring expert supervision, it is necessary to develop ways in which existing models can be adapted to unseen domains without any additional labelled information. In this work, we explore one such technique which extends the CycleGAN [2] architecture to generate label-preserving data in the target domain. The synthetic target domain data is used to train the nn-UNet [3] framework for the task of multi-label segmentation. The experiments are conducted and evaluated on the dataset [1] provided in the ‘Cross-Modality Domain Adaptation for Medical Image Segmentation’ challenge [23] for segmentation of vestibular schwannoma (VS) tumour and cochlea on contrast enhanced (ceT1) and high resolution (hrT2) MRI scans. In the proposed approach, our model obtains dice scores (DSC) of 0.73 and 0.49 for tumour and cochlea, respectively, on the validation set of the dataset. This indicates the applicability of the proposed technique to real-world problems where data may be obtained by different acquisition protocols, as in [1], where hrT2 images are a more reliable, safer, and lower-cost alternative to ceT1. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
MICCAIW |
Notes |
HUPBA; not mentioned |
Approved |
no |
Call Number |
Admin @ si @ JOM2022 |
Serial |
3800 |
Permanent link to this record |
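The record above reports dice scores (DSC) on the validation set. As a reminder of the metric, DSC = 2|A∩B| / (|A| + |B|); a minimal sketch for flat binary masks (the `dice_score` helper is illustrative, not code from the nn-UNet framework):

```python
def dice_score(pred, target):
    """Dice similarity coefficient between two binary masks (flat 0/1 lists)."""
    inter = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 2.0 * inter / total if total else 1.0  # both empty -> perfect match

print(round(dice_score([1, 1, 0, 0], [1, 0, 0, 0]), 3))  # 2*1/(2+1) = 0.667
```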
|
|
|
Author |
Sergi Garcia Bordils; George Tom; Sangeeth Reddy; Minesh Mathew; Marçal Rusiñol; C.V. Jawahar; Dimosthenis Karatzas |
Title |
Read While You Drive - Multilingual Text Tracking on the Road |
Type |
Conference Article |
Year |
2022 |
Publication |
15th IAPR International workshop on document analysis systems |
Abbreviated Journal |
|
Volume |
13237 |
Issue |
|
Pages |
756–770 |
Keywords |
|
Abstract |
Visual data obtained during driving scenarios usually contain large amounts of text that conveys semantic information necessary to analyse the urban environment and is integral to the traffic control plan. Yet, research on autonomous driving or driver assistance systems typically ignores this information. To advance research in this direction, we present RoadText-3K, a large driving video dataset with fully annotated text. RoadText-3K is three times bigger than its predecessor and contains data from varied geographical locations, unconstrained driving conditions and multiple languages and scripts. We offer a comprehensive analysis of tracking-by-detection and detection-by-tracking methods, exploring the limits of state-of-the-art text detection. Finally, we propose a new end-to-end trainable tracking model that yields state-of-the-art results on this challenging dataset. Our experiments demonstrate the complexity and variability of RoadText-3K and establish a new, realistic benchmark for scene text tracking in the wild. |
Address |
La Rochelle; France; May 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
978-3-031-06554-5 |
Medium |
|
Area |
|
Expedition |
|
Conference |
DAS |
Notes |
DAG; 600.155; 611.022; 611.004 |
Approved |
no |
Call Number |
Admin @ si @ GTR2022 |
Serial |
3783 |
Permanent link to this record |
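The tracking-by-detection analysis mentioned in this record typically links text detections across frames by box overlap. A hedged sketch of IoU-based greedy matching (the helpers and the 0.5 threshold are illustrative assumptions, not the paper's tracker):

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def greedy_match(tracks, detections, thr=0.5):
    """Greedily link each track's last box to its best-overlapping detection."""
    links, used = {}, set()
    for ti, tbox in tracks.items():
        best, best_iou = None, thr
        for di, dbox in enumerate(detections):
            o = iou(tbox, dbox)
            if di not in used and o >= best_iou:
                best, best_iou = di, o
        if best is not None:
            links[ti] = best
            used.add(best)
    return links

print(round(iou((0, 0, 2, 2), (1, 1, 3, 3)), 3))  # overlap 1 / union 7 = 0.143
```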
|
|
|
Author |
Marc Oliu; Sarah Adel Bargal; Stan Sclaroff; Xavier Baro; Sergio Escalera |
Title |
Multi-varied Cumulative Alignment for Domain Adaptation |
Type |
Conference Article |
Year |
2022 |
Publication |
6th International Conference on Image Analysis and Processing |
Abbreviated Journal |
|
Volume |
13232 |
Issue |
|
Pages |
324–334 |
Keywords |
Domain Adaptation; Computer vision; Neural networks |
Abstract |
Domain Adaptation methods can be classified into two basic families of approaches: non-parametric and parametric. Non-parametric approaches depend on statistical indicators such as feature covariances to minimize the domain shift. Non-parametric approaches tend to be fast to compute and require no additional parameters, but they are unable to leverage probability density functions with complex internal structures. Parametric approaches, on the other hand, use models of the probability distributions as surrogates in minimizing the domain shift, but they require additional trainable parameters to model these distributions. In this work, we propose a new statistical approach to minimizing the domain shift based on stochastically projecting and evaluating the cumulative density function in both domains. As with non-parametric approaches, there are no additional trainable parameters. As with parametric approaches, the internal structure of both domains’ probability distributions is considered, thus leveraging a higher amount of information when reducing the domain shift. Evaluation on standard datasets used for Domain Adaptation shows better performance of the proposed model compared to non-parametric approaches while being competitive with parametric ones. (Code available at: https://github.com/moliusimon/mca). |
Address |
Indonesia; October 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICIAP |
Notes |
HuPBA; not mentioned |
Approved |
no |
Call Number |
Admin @ si @ OAS2022 |
Serial |
3777 |
Permanent link to this record |
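The approach in this record minimizes domain shift by stochastically projecting both domains and comparing cumulative distribution functions. A simplified sketch of that idea (the `projected_cdf_distance` helper is an illustrative reconstruction, not the released code linked in the abstract):

```python
import random

def ecdf(sample, x):
    """Empirical CDF of `sample` evaluated at x."""
    return sum(1 for s in sample if s <= x) / len(sample)

def projected_cdf_distance(src, tgt, n_proj=8, seed=0):
    """Average CDF discrepancy between two point sets over random 1-D projections."""
    rng = random.Random(seed)
    dim = len(src[0])
    total = 0.0
    for _ in range(n_proj):
        w = [rng.gauss(0, 1) for _ in range(dim)]          # random projection axis
        ps = [sum(wi * xi for wi, xi in zip(w, p)) for p in src]
        pt = [sum(wi * xi for wi, xi in zip(w, p)) for p in tgt]
        pts = sorted(ps + pt)
        total += sum(abs(ecdf(ps, x) - ecdf(pt, x)) for x in pts) / len(pts)
    return total / n_proj

pts = [(0.0, 1.0), (1.0, 0.0), (0.5, 0.5)]
print(projected_cdf_distance(pts, pts))  # 0.0 for identical domains
```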
|
|
|
Author |
Nil Ballus; Bhalaji Nagarajan; Petia Radeva |
Title |
Opt-SSL: An Enhanced Self-Supervised Framework for Food Recognition |
Type |
Conference Article |
Year |
2022 |
Publication |
10th Iberian Conference on Pattern Recognition and Image Analysis |
Abbreviated Journal |
|
Volume |
13256 |
Issue |
|
Pages |
|
Keywords |
Self-supervised; Contrastive learning; Food recognition |
Abstract |
Self-supervised learning has shown strong performance in several computer vision tasks. Popular contrastive methods make use of a Siamese architecture with different loss functions. In this work, we go deeper into two very recent state-of-the-art frameworks, namely SimSiam and Barlow Twins. Inspired by them, we propose a new self-supervised learning method, which we call Opt-SSL, that combines both image and feature contrasting. We validate the proposed method on the food recognition task, showing that our proposed framework enables the self-learning networks to learn better visual representations. |
Address |
Aveiro; Portugal; May 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
IbPRIA |
Notes |
MILAB; not mentioned |
Approved |
no |
Call Number |
Admin @ si @ BNR2022 |
Serial |
3782 |
Permanent link to this record |
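Opt-SSL builds on SimSiam-style contrastive objectives. A minimal sketch of the negative cosine similarity at the core of such frameworks (illustrative only; the paper's combined image-and-feature contrasting loss is not reproduced here):

```python
import math

def neg_cosine(p, z):
    """SimSiam-style negative cosine similarity between a predicted view p and
    a target view z (z is treated as a constant, i.e. stop-gradient, in training)."""
    dot = sum(a * b for a, b in zip(p, z))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in z))
    return -dot / norm

print(neg_cosine([1.0, 0.0], [1.0, 0.0]))  # -1.0 when the two views agree perfectly
```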
|
|
|
Author |
Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla |
Title |
Thermal Image Super-Resolution: A Novel Unsupervised Approach |
Type |
Conference Article |
Year |
2022 |
Publication |
International Joint Conference on Computer Vision, Imaging and Computer Graphics |
Abbreviated Journal |
|
Volume |
1474 |
Issue |
|
Pages |
495–506 |
Keywords |
|
Abstract |
This paper proposes the use of a CycleGAN architecture for thermal image super-resolution under a transfer-domain strategy, where middle-resolution images from one camera are transferred to the higher-resolution domain of another camera. The proposed approach is trained with a large dataset acquired using three thermal cameras at different resolutions, following an unsupervised learning process. An additional loss function is proposed to improve on results from state-of-the-art approaches. Evaluations are performed following the first thermal image super-resolution challenge (PBVS-CVPR2020). A comparison with previous works is presented, showing that the proposed approach reaches the best results. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
VISIGRAPP |
Notes |
MSIAU; 600.130 |
Approved |
no |
Call Number |
Admin @ si @ RSV2022d |
Serial |
3776 |
Permanent link to this record |
|
|
|
Author |
Patricia Suarez; Dario Carpio; Angel Sappa |
Title |
Non-homogeneous Haze Removal Through a Multiple Attention Module Architecture |
Type |
Conference Article |
Year |
2021 |
Publication |
16th International Symposium on Visual Computing |
Abbreviated Journal |
|
Volume |
13018 |
Issue |
|
Pages |
178–190 |
Keywords |
|
Abstract |
This paper presents a novel attention-based architecture to remove non-homogeneous haze. The proposed model focuses on obtaining the most representative characteristics of the image at each learning cycle, by means of adaptive attention modules coupled with a residual learning convolutional network, the latter based on the Res2Net model. The proposed architecture is trained with just a small set of images. Its performance is evaluated on a public benchmark (images from the non-homogeneous haze NTIRE 2021 challenge) and compared with state-of-the-art approaches, reaching the best result. |
Address |
Virtual; October 2021 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ISVC |
Notes |
MSIAU |
Approved |
no |
Call Number |
Admin @ si @ SCS2021 |
Serial |
3668 |
Permanent link to this record |
|
|
|
Author |
Javad Zolfaghari Bengar; Bogdan Raducanu; Joost Van de Weijer |
Title |
When Deep Learners Change Their Mind: Learning Dynamics for Active Learning |
Type |
Conference Article |
Year |
2021 |
Publication |
19th International Conference on Computer Analysis of Images and Patterns |
Abbreviated Journal |
|
Volume |
13052 |
Issue |
1 |
Pages |
403-413 |
Keywords |
|
Abstract |
Active learning aims to select samples to be annotated that yield the largest performance improvement for the learning algorithm. Many methods approach this problem by measuring the informativeness of samples, based on the certainty of the network predictions for those samples. However, it is well known that neural networks are overly confident about their predictions and are therefore an untrustworthy source for assessing sample informativeness. In this paper, we propose a new informativeness-based active learning method. Our measure is derived from the learning dynamics of a neural network. More precisely, we track the label assignment of the unlabeled data pool during the training of the algorithm. We capture the learning dynamics with a metric called label-dispersion, which is low when the network consistently assigns the same label to a sample during training and high when the assigned label changes frequently. We show that label-dispersion is a promising predictor of the uncertainty of the network, and show on two benchmark datasets that an active learning algorithm based on label-dispersion obtains excellent results. |
Address |
September 2021 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
CAIP |
Notes |
LAMP |
Approved |
no |
Call Number |
Admin @ si @ ZRV2021 |
Serial |
3673 |
Permanent link to this record |
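The label-dispersion metric described in the abstract above can be sketched as follows (an illustrative mode-based reconstruction, not necessarily the authors' exact formulation; the `label_dispersion` helper is an assumption):

```python
from collections import Counter

def label_dispersion(label_history):
    """Dispersion of the labels a network assigns to one unlabeled sample across
    training epochs: 0 when the prediction never changes, approaching 1 when the
    assigned label changes frequently."""
    counts = Counter(label_history)
    most_common = counts.most_common(1)[0][1]
    return 1.0 - most_common / len(label_history)

print(label_dispersion(["cat"] * 10))        # 0.0 -> stable prediction, uninformative
print(label_dispersion(["cat", "dog"] * 5))  # 0.5 -> unstable, worth annotating
```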
|
|
|
Author |
Ruben Tito; Dimosthenis Karatzas; Ernest Valveny |
Title |
Document Collection Visual Question Answering |
Type |
Conference Article |
Year |
2021 |
Publication |
16th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
Volume |
12822 |
Issue |
|
Pages |
778-792 |
Keywords |
Document collection; Visual Question Answering |
Abstract |
Current tasks and methods in Document Understanding aim to process documents as single elements. However, documents are usually organized in collections (historical records, purchase invoices) that provide context useful for their interpretation. To address this problem, we introduce Document Collection Visual Question Answering (DocCVQA), a new dataset and related task, where questions are posed over a whole collection of document images and the goal is not only to provide the answer to the given question, but also to retrieve the set of documents that contain the information needed to infer the answer. Along with the dataset, we propose a new evaluation metric and baselines which provide further insights into the new dataset and task. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICDAR |
Notes |
DAG; 600.121 |
Approved |
no |
Call Number |
Admin @ si @ TKV2021 |
Serial |
3622 |
Permanent link to this record |
|
|
|
Author |
Albert Suso; Pau Riba; Oriol Ramos Terrades; Josep Llados |
Title |
A Self-supervised Inverse Graphics Approach for Sketch Parametrization |
Type |
Conference Article |
Year |
2021 |
Publication |
16th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
Volume |
12916 |
Issue |
|
Pages |
28-42 |
Keywords |
|
Abstract |
The study of neural generative models of handwritten text and human sketches is a hot topic in the computer vision field. The landmark SketchRNN provided a breakthrough by sequentially generating sketches as a sequence of waypoints, and more recent articles have managed to generate fully vector sketches by coding the strokes as Bézier curves. However, previous attempts with this approach all require ground truth consisting of the sequence of points that make up each stroke, which seriously limits the datasets on which the model can be trained. In this work, we present a self-supervised end-to-end inverse graphics approach that learns to embed each image into its best fit of Bézier curves. The self-supervised nature of the training process allows us to train the model on a wider range of datasets, and also to perform better after-training predictions by applying an overfitting process on the input binary image. We report qualitative and quantitative evaluations on the MNIST and the Quick, Draw! datasets. |
Address |
Lausanne; Switzerland; September 2021 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICDAR |
Notes |
DAG; 600.121 |
Approved |
no |
Call Number |
Admin @ si @ SRR2021 |
Serial |
3675 |
Permanent link to this record |
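The sketch parametrization in this record fits Bézier curves to strokes. A minimal sketch of evaluating a cubic Bézier curve from its four control points (standard Bernstein form, not the paper's model):

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1] using the
    Bernstein basis: B(t) = (1-t)^3 p0 + 3(1-t)^2 t p1 + 3(1-t) t^2 p2 + t^3 p3."""
    u = 1.0 - t
    return tuple(
        u**3 * a + 3 * u**2 * t * b + 3 * u * t**2 * c + t**3 * d
        for a, b, c, d in zip(p0, p1, p2, p3)
    )

# The curve interpolates its endpoints exactly.
print(cubic_bezier((0, 0), (1, 2), (3, 2), (4, 0), 0.0))  # (0.0, 0.0)
print(cubic_bezier((0, 0), (1, 2), (3, 2), (4, 0), 1.0))  # (4.0, 0.0)
```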
|
|
|
Author |
Pau Torras; Mohamed Ali Souibgui; Jialuo Chen; Alicia Fornes |
Title |
A Transcription Is All You Need: Learning to Align through Attention |
Type |
Conference Article |
Year |
2021 |
Publication |
14th IAPR International Workshop on Graphics Recognition |
Abbreviated Journal |
|
Volume |
12916 |
Issue |
|
Pages |
141–146 |
Keywords |
|
Abstract |
Historical ciphered manuscripts are a type of document where graphical symbols are used to encrypt their content instead of regular text. Nowadays, expert transcriptions can be found in libraries alongside the corresponding manuscript images. However, those transcriptions are not aligned, so they are barely usable for training deep learning-based recognition methods. To solve this issue, we propose a method to align each symbol in the transcript of an image with its visual representation by using an attention-based Sequence to Sequence (Seq2Seq) model. The core idea is that, by learning to recognise the symbol sequence within a cipher line image, the model also identifies each symbol's position implicitly through an attention mechanism. Thus, the resulting symbol segmentation can later be used for training algorithms. The experimental evaluation shows that this method is promising, especially taking into account the small size of the cipher dataset. |
Address |
Virtual; September 2021 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
GREC |
Notes |
DAG; 602.230; 600.140; 600.121 |
Approved |
no |
Call Number |
Admin @ si @ TSC2021 |
Serial |
3619 |
Permanent link to this record |
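The alignment idea in this record reads symbol positions off the attention weights of a Seq2Seq model. A toy sketch of that step (the attention matrix and the `align_from_attention` helper are illustrative assumptions, not the paper's code):

```python
def align_from_attention(attention):
    """For each decoded symbol (one row of attention weights over image
    positions), take the position with maximal weight as its estimated location."""
    return [max(range(len(row)), key=row.__getitem__) for row in attention]

attn = [
    [0.7, 0.2, 0.1],  # symbol 0 attends mostly to position 0
    [0.1, 0.8, 0.1],  # symbol 1 -> position 1
    [0.1, 0.3, 0.6],  # symbol 2 -> position 2
]
print(align_from_attention(attn))  # [0, 1, 2]
```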
|
|
|
Author |
Sanket Biswas; Pau Riba; Josep Llados; Umapada Pal |
Title |
Graph-Based Deep Generative Modelling for Document Layout Generation |
Type |
Conference Article |
Year |
2021 |
Publication |
16th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
Volume |
12917 |
Issue |
|
Pages |
525-537 |
Keywords |
|
Abstract |
One of the major prerequisites for any deep learning approach is the availability of large-scale training data. When dealing with scanned document images in real-world scenarios, the principal information of their content is stored in the layout itself. In this work, we propose an automated deep generative model using Graph Neural Networks (GNNs) to generate synthetic data with highly variable and plausible document layouts that can be used to train document interpretation systems, especially in digital mailroom applications. It is also the first graph-based approach for the document layout generation task evaluated on administrative document images, in this case invoices. |
Address |
Lausanne; Switzerland; September 2021 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
DAG; 600.121; 600.140; 110.312 |
Approved |
no |
Call Number |
Admin @ si @ BRL2021 |
Serial |
3676 |
Permanent link to this record |
|
|
|
Author |
Bartlomiej Twardowski; Pawel Zawistowski; Szymon Zaborowski |
Title |
Metric Learning for Session-Based Recommendations |
Type |
Conference Article |
Year |
2021 |
Publication |
43rd edition of the annual BCS-IRSG European Conference on Information Retrieval |
Abbreviated Journal |
|
Volume |
12656 |
Issue |
|
Pages |
650-665 |
Keywords |
Session-based recommendations; Deep metric learning; Learning to rank |
Abstract |
Session-based recommenders, used for making predictions out of users’ uninterrupted sequences of actions, are attractive for many applications. For this task, we propose using metric learning, where a common embedding space for sessions and items is created, and a distance measures the dissimilarity between the provided sequence of users’ events and the next action. We discuss and compare metric learning approaches to commonly used learning-to-rank methods, with which some synergies exist. We propose a simple architecture for problem analysis and demonstrate that neither extensively big nor deep architectures are necessary in order to outperform existing methods. Experimental results against strong baselines on four datasets are provided, together with an ablation study. |
Address |
Virtual; March 2021 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ECIR |
Notes |
LAMP; 600.120 |
Approved |
no |
Call Number |
Admin @ si @ TZZ2021 |
Serial |
3586 |
Permanent link to this record |
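The record above embeds sessions and items in a common metric space and scores the next action by distance. A toy sketch of distance-based ranking (the embeddings and helper names are illustrative, not the paper's architecture):

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_items(session_emb, item_embs):
    """Rank catalogue items by distance to the session embedding;
    the closest item is the most likely next action."""
    return sorted(item_embs, key=lambda item: euclidean(session_emb, item_embs[item]))

items = {"a": (0.0, 1.0), "b": (1.0, 0.0), "c": (0.9, 0.1)}
print(rank_items((1.0, 0.0), items))  # ['b', 'c', 'a']
```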
|
|
|
Author |
Tomas Sixta; Julio C. S. Jacques Junior; Pau Buch Cardona; Eduard Vazquez; Sergio Escalera |
Title |
FairFace Challenge at ECCV 2020: Analyzing Bias in Face Recognition |
Type |
Conference Article |
Year |
2020 |
Publication |
ECCV Workshops |
Abbreviated Journal |
|
Volume |
12540 |
Issue |
|
Pages |
463-481 |
Keywords |
|
Abstract |
This work summarizes the 2020 ChaLearn Looking at People Fair Face Recognition and Analysis Challenge and provides a description of the top-winning solutions and an analysis of the results. The aim of the challenge was to evaluate accuracy and bias with respect to gender and skin colour of submitted algorithms on the task of 1:1 face verification in the presence of other confounding attributes. Participants were evaluated using an in-the-wild dataset based on reannotated IJB-C, further enriched with 12.5K new images and additional labels. The dataset is not balanced, which simulates a real-world scenario where AI-based models supposed to present fair outcomes are trained and evaluated on imbalanced data. The challenge attracted 151 participants, who made more than 1.8K submissions in total. The final phase of the challenge attracted 36 active teams, out of which 10 exceeded 0.999 AUC-ROC while achieving very low scores in the proposed bias metrics. Common strategies among the participants were face pre-processing, homogenization of data distributions, the use of bias-aware loss functions and ensemble models. The analysis of the top-10 teams shows higher false positive rates (and lower false negative rates) for females with dark skin tone, as well as the potential of eyeglasses and young age to increase the false positive rates. |
Address |
Virtual; August 2020 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ECCVW |
Notes |
HUPBA |
Approved |
no |
Call Number |
Admin @ si @ SJB2020 |
Serial |
3499 |
Permanent link to this record |
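The bias analysis in this record compares false positive rates across demographic groups. A minimal sketch of computing per-group FPR for 1:1 verification pairs (the data layout and the `fpr_by_group` helper are illustrative assumptions, not the challenge's evaluation code):

```python
def fpr_by_group(pairs):
    """False positive rate per demographic group for 1:1 face verification.
    Each pair is (group, same_identity: bool, predicted_match: bool);
    FPR = false positives / all negative (different-identity) pairs."""
    stats = {}
    for group, same, pred in pairs:
        fp, neg = stats.get(group, (0, 0))
        if not same:  # only different-identity pairs contribute to FPR
            stats[group] = (fp + (1 if pred else 0), neg + 1)
    return {g: fp / neg for g, (fp, neg) in stats.items() if neg}

pairs = [
    ("A", False, True), ("A", False, False),   # group A: 1 FP out of 2 negatives
    ("B", False, False), ("B", False, False),  # group B: 0 FP out of 2 negatives
    ("A", True, True),                         # positive pairs do not affect FPR
]
print(fpr_by_group(pairs))  # {'A': 0.5, 'B': 0.0}
```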