Publicacions CVC -- Query Results

<< 1 2 3 4 5 6 7 8 9 10 >>

Details

Records
Author	Debora Gil; Aura Hernandez-Sabate; Julien Enconniere; Saryani Asmayawati; Pau Folch; Juan Borrego-Carazo; Miquel Angel Piera
Title	E-Pilots: A System to Predict Hard Landing During the Approach Phase of Commercial Flights			Type	Journal Article
Year	2022	Publication	IEEE Access	Abbreviated Journal	ACCESS
Volume	10	Issue		Pages	7489-7503
Keywords
Abstract	More than half of all commercial aircraft operation accidents could have been prevented by executing a go-around. Making timely decision to execute a go-around manoeuvre can potentially reduce overall aviation industry accident rate. In this paper, we describe a cockpit-deployable machine learning system to support flight crew go-around decision-making based on the prediction of a hard landing event. This work presents a hybrid approach for hard landing prediction that uses features modelling temporal dependencies of aircraft variables as inputs to a neural network. Based on a large dataset of 58177 commercial flights, the results show that our approach has 85% of average sensitivity with 74% of average specificity at the go-around point. It follows that our approach is a cockpit-deployable recommendation system that outperforms existing approaches.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	IAM; 600.139; 600.118; 600.145			Approved	no
Call Number	Admin @ si @ GHE2022			Serial	3721
Permanent link to this record



Author	Julio C. S. Jacques Junior; Yagmur Gucluturk; Marc Perez; Umut Guçlu; Carlos Andujar; Xavier Baro; Hugo Jair Escalante; Isabelle Guyon; Marcel A. J. van Gerven; Rob van Lier; Sergio Escalera
Title	First Impressions: A Survey on Vision-Based Apparent Personality Trait Analysis			Type	Journal Article
Year	2022	Publication	IEEE Transactions on Affective Computing	Abbreviated Journal	TAC
Volume	13	Issue	1	Pages	75-95
Keywords	Personality computing; first impressions; person perception; big-five; subjective bias; computer vision; machine learning; nonverbal signals; facial expression; gesture; speech analysis; multi-modal recognition
Abstract	Personality analysis has been widely studied in psychology, neuropsychology, and signal processing fields, among others. From the past few years, it also became an attractive research area in visual computing. From the computational point of view, by far speech and text have been the most considered cues of information for analyzing personality. However, recently there has been an increasing interest from the computer vision community in analyzing personality from visual data. Recent computer vision approaches are able to accurately analyze human faces, body postures and behaviors, and use these information to infer apparent personality traits. Because of the overwhelming research interest in this topic, and of the potential impact that this sort of methods could have in society, we present in this paper an up-to-date review of existing vision-based approaches for apparent personality trait recognition. We describe seminal and cutting edge works on the subject, discussing and comparing their distinctive features and limitations. Future venues of research in the field are identified and discussed. Furthermore, aspects on the subjectivity in data labeling/evaluation, as well as current datasets and challenges organized to push the research on the field are reviewed.
Address	1 Jan.-March 2022
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HuPBA			Approved	no
Call Number	Admin @ si @ JGP2022			Serial	3724
Permanent link to this record



Author	Pau Riba; Lutz Goldmann; Oriol Ramos Terrades; Diede Rusticus; Alicia Fornes; Josep Llados
Title	Table detection in business document images by message passing networks			Type	Journal Article
Year	2022	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	127	Issue		Pages	108641
Keywords
Abstract	Tabular structures in business documents offer a complementary dimension to the raw textual data. For instance, there is information about the relationships among pieces of information. Nowadays, digital mailroom applications have become a key service for workflow automation. Therefore, the detection and interpretation of tables is crucial. With the recent advances in information extraction, table detection and recognition has gained interest in document image analysis, in particular, with the absence of rule lines and unknown information about rows and columns. However, business documents usually contain sensitive contents limiting the amount of public benchmarking datasets. In this paper, we propose a graph-based approach for detecting tables in document images which do not require the raw content of the document. Hence, the sensitive content can be previously removed and, instead of using the raw image or textual content, we propose a purely structural approach to keep sensitive data anonymous. Our framework uses graph neural networks (GNNs) to describe the local repetitive structures that constitute a table. In particular, our main application domain are business documents. We have carefully validated our approach in two invoice datasets and a modern document benchmark. Our experiments demonstrate that tables can be detected by purely structural approaches.
Address	July 2022
Corporate Author				Thesis
Publisher	Elsevier	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.162; 600.121			Approved	no
Call Number	Admin @ si @ RGR2022			Serial	3729
Permanent link to this record



Author	Mohamed Ali Souibgui; Sanket Biswas; Sana Khamekhem Jemni; Yousri Kessentini; Alicia Fornes; Josep Llados; Umapada Pal
Title	DocEnTr: An End-to-End Document Image Enhancement Transformer			Type	Conference Article
Year	2022	Publication	26th International Conference on Pattern Recognition	Abbreviated Journal
Volume		Issue		Pages	1699-1705
Keywords	Degradation; Head; Optical character recognition; Self-supervised learning; Benchmark testing; Transformers; Magnetic heads
Abstract	Document images can be affected by many degradation scenarios, which cause recognition and processing difficulties. In this age of digitization, it is important to denoise them for proper usage. To address this challenge, we present a new encoder-decoder architecture based on vision transformers to enhance both machine-printed and handwritten document images, in an end-to-end fashion. The encoder operates directly on the pixel patches with their positional information without the use of any convolutional layers, while the decoder reconstructs a clean image from the encoded patches. Conducted experiments show a superiority of the proposed model compared to the state-of the-art methods on several DIBCO benchmarks. Code and models will be publicly available at: https://github.com/dali92002/DocEnTR
Address	August 21-25, 2022 , Montréal Québec
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICPR
Notes	DAG; 600.121; 600.162; 602.230; 600.140			Approved	no
Call Number	Admin @ si @ SBJ2022			Serial	3730
Permanent link to this record



Author	Giacomo Magnifico; Beata Megyesi; Mohamed Ali Souibgui; Jialuo Chen; Alicia Fornes
Title	Lost in Transcription of Graphic Signs in Ciphers			Type	Conference Article
Year	2022	Publication	International Conference on Historical Cryptology (HistoCrypt 2022)	Abbreviated Journal
Volume		Issue		Pages	153-158
Keywords	transcription of ciphers; hand-written text recognition of symbols; graphic signs
Abstract	Hand-written Text Recognition techniques with the aim to automatically identify and transcribe hand-written text have been applied to historical sources including ciphers. In this paper, we compare the performance of two machine learning architectures, an unsupervised method based on clustering and a deep learning method with few-shot learning. Both models are tested on seen and unseen data from historical ciphers with different symbol sets consisting of various types of graphic signs. We compare the models and highlight their differences in performance, with their advantages and shortcomings.
Address	Amsterdam, Netherlands, June 20-22, 2022
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	HystoCrypt
Notes	DAG; 600.121; 600.162; 602.230; 600.140			Approved	no
Call Number	Admin @ si @ MBS2022			Serial	3731
Permanent link to this record



Author	Giuseppe De Gregorio; Sanket Biswas; Mohamed Ali Souibgui; Asma Bensalah; Josep Llados; Alicia Fornes; Angelo Marcelli
Title	A Few Shot Multi-representation Approach for N-Gram Spotting in Historical Manuscripts			Type	Conference Article
Year	2022	Publication	Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition (ICFHR2022)	Abbreviated Journal
Volume	13639	Issue		Pages	3-12
Keywords	N-gram spotting; Few-shot learning; Multimodal understanding; Historical handwritten collections
Abstract	Despite recent advances in automatic text recognition, the performance remains moderate when it comes to historical manuscripts. This is mainly because of the scarcity of available labelled data to train the data-hungry Handwritten Text Recognition (HTR) models. The Keyword Spotting System (KWS) provides a valid alternative to HTR due to the reduction in error rate, but it is usually limited to a closed reference vocabulary. In this paper, we propose a few-shot learning paradigm for spotting sequences of a few characters (N-gram) that requires a small amount of labelled training data. We exhibit that recognition of important n-grams could reduce the system’s dependency on vocabulary. In this case, an out-of-vocabulary (OOV) word in an input handwritten line image could be a sequence of n-grams that belong to the lexicon. An extensive experimental evaluation of our proposed multi-representation approach was carried out on a subset of Bentham’s historical manuscript collections to obtain some really promising results in this direction.
Address	December 04 – 07, 2022; Hyderabad, India
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICFHR
Notes	DAG; 600.121; 600.162; 602.230; 600.140			Approved	no
Call Number	Admin @ si @ GBS2022			Serial	3733
Permanent link to this record



Author	Arnau Baro; Carles Badal; Pau Torras; Alicia Fornes
Title	Handwritten Historical Music Recognition through Sequence-to-Sequence with Attention Mechanism			Type	Conference Article
Year	2022	Publication	3rd International Workshop on Reading Music Systems (WoRMS2021)	Abbreviated Journal
Volume		Issue		Pages	55-59
Keywords	Optical Music Recognition; Digits; Image Classification
Abstract	Despite decades of research in Optical Music Recognition (OMR), the recognition of old handwritten music scores remains a challenge because of the variabilities in the handwriting styles, paper degradation, lack of standard notation, etc. Therefore, the research in OMR systems adapted to the particularities of old manuscripts is crucial to accelerate the conversion of music scores existing in archives into digital libraries, fostering the dissemination and preservation of our music heritage. In this paper we explore the adaptation of sequence-to-sequence models with attention mechanism (used in translation and handwritten text recognition) and the generation of specific synthetic data for recognizing old music scores. The experimental validation demonstrates that our approach is promising, especially when compared with long short-term memory neural networks.
Address	July 23, 2021, Alicante (Spain)
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	WoRMS
Notes	DAG; 600.121; 600.162; 602.230; 600.140			Approved	no
Call Number	Admin @ si @ BBT2022			Serial	3734
Permanent link to this record



Author	Pau Torras; Arnau Baro; Alicia Fornes; Lei Kang
Title	Improving Handwritten Music Recognition through Language Model Integration			Type	Conference Article
Year	2022	Publication	4th International Workshop on Reading Music Systems (WoRMS2022)	Abbreviated Journal
Volume		Issue		Pages	42-46
Keywords	optical music recognition; historical sources; diversity; music theory; digital humanities
Abstract	Handwritten Music Recognition, especially in the historical domain, is an inherently challenging endeavour; paper degradation artefacts and the ambiguous nature of handwriting make recognising such scores an error-prone process, even for the current state-of-the-art Sequence to Sequence models. In this work we propose a way of reducing the production of statistically implausible output sequences by fusing a Language Model into a recognition Sequence to Sequence model. The idea is leveraging visually-conditioned and context-conditioned output distributions in order to automatically find and correct any mistakes that would otherwise break context significantly. We have found this approach to improve recognition results to 25.15 SER (%) from a previous best of 31.79 SER (%) in the literature.
Address	November 18, 2022
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	WoRMS
Notes	DAG; 600.121; 600.162; 602.230			Approved	no
Call Number	Admin @ si @ TBF2022			Serial	3735
Permanent link to this record



Author	Mohamed Ali Souibgui; Alicia Fornes; Yousri Kessentini; Beata Megyesi
Title	Few shots are all you need: A progressive learning approach for low resource handwritten text recognition			Type	Journal Article
Year	2022	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
Volume	160	Issue		Pages	43-49
Keywords
Abstract	Handwritten text recognition in low resource scenarios, such as manuscripts with rare alphabets, is a challenging problem. In this paper, we propose a few-shot learning-based handwriting recognition approach that significantly reduces the human annotation process, by requiring only a few images of each alphabet symbols. The method consists of detecting all the symbols of a given alphabet in a textline image and decoding the obtained similarity scores to the final sequence of transcribed symbols. Our model is first pretrained on synthetic line images generated from an alphabet, which could differ from the alphabet of the target domain. A second training step is then applied to reduce the gap between the source and the target data. Since this retraining would require annotation of thousands of handwritten symbols together with their bounding boxes, we propose to avoid such human effort through an unsupervised progressive learning approach that automatically assigns pseudo-labels to the unlabeled data. The evaluation on different datasets shows that our model can lead to competitive results with a significant reduction in human effort. The code will be publicly available in the following repository: https://github.com/dali92002/HTRbyMatching
Address
Corporate Author				Thesis
Publisher	Elsevier	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.121; 600.162; 602.230			Approved	no
Call Number	Admin @ si @ SFK2022			Serial	3736
Permanent link to this record



Author	Joana Maria Pujadas-Mora; Alicia Fornes; Oriol Ramos Terrades; Josep Llados; Jialuo Chen; Miquel Valls-Figols; Anna Cabre
Title	The Barcelona Historical Marriage Database and the Baix Llobregat Demographic Database. From Algorithms for Handwriting Recognition to Individual-Level Demographic and Socioeconomic Data			Type	Journal
Year	2022	Publication	Historical Life Course Studies	Abbreviated Journal	HLCS
Volume	12	Issue		Pages	99-132
Keywords	Individual demographic databases; Computer vision, Record linkage; Social mobility; Inequality; Migration; Word spotting; Handwriting recognition; Local censuses; Marriage Licences
Abstract	The Barcelona Historical Marriage Database (BHMD) gathers records of the more than 600,000 marriages celebrated in the Diocese of Barcelona and their taxation registered in Barcelona Cathedral's so-called Marriage Licenses Books for the long period 1451–1905 and the BALL Demographic Database brings together the individual information recorded in the population registers, censuses and fiscal censuses of the main municipalities of the county of Baix Llobregat (Barcelona). In this ongoing collection 263,786 individual observations have been assembled, dating from the period between 1828 and 1965 by December 2020. The two databases started as part of different interdisciplinary research projects at the crossroads of Historical Demography and Computer Vision. Their construction uses artificial intelligence and computer vision methods as Handwriting Recognition to reduce the time of execution. However, its current state still requires some human intervention which explains the implemented crowdsourcing and game sourcing experiences. Moreover, knowledge graph techniques have allowed the application of advanced record linkage to link the same individuals and families across time and space. Moreover, we will discuss the main research lines using both databases developed so far in historical demography.
Address	June 23, 2022
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.121; 600.162; 602.230; 600.140			Approved	no
Call Number	Admin @ si @ PFR2022			Serial	3737
Permanent link to this record



Author	Asma Bensalah; Alicia Fornes; Cristina Carmona_Duarte; Josep Llados
Title	Easing Automatic Neurorehabilitation via Classification and Smoothness Analysis			Type	Conference Article
Year	2022	Publication	Intertwining Graphonomics with Human Movements. 20th International Conference of the International Graphonomics Society, IGS 2022	Abbreviated Journal
Volume	13424	Issue		Pages	336-348
Keywords	Neurorehabilitation; Upper-lim; Movement classification; Movement smoothness; Deep learning; Jerk
Abstract	Assessing the quality of movements for post-stroke patients during the rehabilitation phase is vital given that there is no standard stroke rehabilitation plan for all the patients. In fact, it depends basically on the patient’s functional independence and its progress along the rehabilitation sessions. To tackle this challenge and make neurorehabilitation more agile, we propose an automatic assessment pipeline that starts by recognising patients’ movements by means of a shallow deep learning architecture, then measuring the movement quality using jerk measure and related measures. A particularity of this work is that the dataset used is clinically relevant, since it represents movements inspired from Fugl-Meyer a well common upper-limb clinical stroke assessment scale for stroke patients. We show that it is possible to detect the contrast between healthy and patients movements in terms of smoothness, besides achieving conclusions about the patients’ progress during the rehabilitation sessions that correspond to the clinicians’ findings about each case.
Address	June 7-9, 2022, Las Palmas de Gran Canaria, Spain
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	IGS
Notes	DAG; 600.121; 600.162; 602.230; 600.140			Approved	no
Call Number	Admin @ si @ BFC2022			Serial	3738
Permanent link to this record



Author	Alicia Fornes; Asma Bensalah; Cristina Carmona_Duarte; Jialuo Chen; Miguel A. Ferrer; Andreas Fischer; Josep Llados; Cristina Martin; Eloy Opisso; Rejean Plamondon; Anna Scius-Bertrand; Josep Maria Tormos
Title	The RPM3D Project: 3D Kinematics for Remote Patient Monitoring			Type	Conference Article
Year	2022	Publication	Intertwining Graphonomics with Human Movements. 20th International Conference of the International Graphonomics Society, IGS 2022	Abbreviated Journal
Volume	13424	Issue		Pages	217-226
Keywords	Healthcare applications; Kinematic; Theory of Rapid Human Movements; Human activity recognition; Stroke rehabilitation; 3D kinematics
Abstract	This project explores the feasibility of remote patient monitoring based on the analysis of 3D movements captured with smartwatches. We base our analysis on the Kinematic Theory of Rapid Human Movement. We have validated our research in a real case scenario for stroke rehabilitation at the Guttmann Institute (https://www.guttmann.com/en/) (neurorehabilitation hospital), showing promising results. Our work could have a great impact in remote healthcare applications, improving the medical efficiency and reducing the healthcare costs. Future steps include more clinical validation, developing multi-modal analysis architectures (analysing data from sensors, images, audio, etc.), and exploring the application of our technology to monitor other neurodegenerative diseases.
Address	June 7-9, 2022, Las Palmas de Gran Canaria, Spain
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	IGS
Notes	DAG; 600.121; 600.162; 602.230; 600.140			Approved	no
Call Number	Admin @ si @ FBC2022			Serial	3739
Permanent link to this record



Author	Arnau Baro; Pau Riba; Alicia Fornes
Title	Musigraph: Optical Music Recognition Through Object Detection and Graph Neural Network			Type	Conference Article
Year	2022	Publication	Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition (ICFHR2022)	Abbreviated Journal
Volume	13639	Issue		Pages	171-184
Keywords	Object detection; Optical music recognition; Graph neural network
Abstract	During the last decades, the performance of optical music recognition has been increasingly improving. However, and despite the 2-dimensional nature of music notation (e.g. notes have rhythm and pitch), most works treat musical scores as a sequence of symbols in one dimension, which make their recognition still a challenge. Thus, in this work we explore the use of graph neural networks for musical score recognition. First, because graphs are suited for n-dimensional representations, and second, because the combination of graphs with deep learning has shown a great performance in similar applications. Our methodology consists of: First, we will detect each isolated/atomic symbols (those that can not be decomposed in more graphical primitives) and the primitives that form a musical symbol. Then, we will build the graph taking as root node the notehead and as leaves those primitives or symbols that modify the note’s rhythm (stem, beam, flag) or pitch (flat, sharp, natural). Finally, the graph is translated into a human-readable character sequence for a final transcription and evaluation. Our method has been tested on more than five thousand measures, showing promising results.
Address	December 04 – 07, 2022; Hyderabad, India
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICFHR
Notes	DAG; 600.162; 600.140; 602.230			Approved	no
Call Number	Admin @ si @ BRF2022b			Serial	3740
Permanent link to this record



Author	Carlos Boned Riera; Oriol Ramos Terrades
Title	Discriminative Neural Variational Model for Unbalanced Classification Tasks in Knowledge Graph			Type	Conference Article
Year	2022	Publication	26th International Conference on Pattern Recognition	Abbreviated Journal
Volume		Issue		Pages	2186-2191
Keywords	Measurement; Couplings; Semantics; Ear; Benchmark testing; Data models; Pattern recognition
Abstract	Nowadays the paradigm of link discovery problems has shown significant improvements on Knowledge Graphs. However, method performances are harmed by the unbalanced nature of this classification problem, since many methods are easily biased to not find proper links. In this paper we present a discriminative neural variational auto-encoder model, called DNVAE from now on, in which we have introduced latent variables to serve as embedding vectors. As a result, the learnt generative model approximate better the underlying distribution and, at the same time, it better differentiate the type of relations in the knowledge graph. We have evaluated this approach on benchmark knowledge graph and Census records. Results in this last data set are quite impressive since we reach the highest possible score in the evaluation metrics. However, further experiments are still needed to deeper evaluate the performance of the method in more challenging tasks.
Address	Montreal; Quebec; Canada; August 2022
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICPR
Notes	DAG; 600.121; 600.162			Approved	no
Call Number	Admin @ si @ BoR2022			Serial	3741
Permanent link to this record



Author	Penny Tarling; Mauricio Cantor; Albert Clapes; Sergio Escalera
Title	Deep learning with self-supervision and uncertainty regularization to count fish in underwater images			Type	Journal Article
Year	2022	Publication	PloS One	Abbreviated Journal	Plos
Volume	17	Issue	5	Pages	e0267759
Keywords
Abstract	Effective conservation actions require effective population monitoring. However, accurately counting animals in the wild to inform conservation decision-making is difficult. Monitoring populations through image sampling has made data collection cheaper, wide-reaching and less intrusive but created a need to process and analyse this data efficiently. Counting animals from such data is challenging, particularly when densely packed in noisy images. Attempting this manually is slow and expensive, while traditional computer vision methods are limited in their generalisability. Deep learning is the state-of-the-art method for many computer vision tasks, but it has yet to be properly explored to count animals. To this end, we employ deep learning, with a density-based regression approach, to count fish in low-resolution sonar images. We introduce a large dataset of sonar videos, deployed to record wild Lebranche mullet schools (Mugil liza), with a subset of 500 labelled images. We utilise abundant unlabelled data in a self-supervised task to improve the supervised counting task. For the first time in this context, by introducing uncertainty quantification, we improve model training and provide an accompanying measure of prediction uncertainty for more informed biological decision-making. Finally, we demonstrate the generalisability of our proposed counting framework through testing it on a recent benchmark dataset of high-resolution annotated underwater images from varying habitats (DeepFish). From experiments on both contrasting datasets, we demonstrate our network outperforms the few other deep learning models implemented for solving this task. By providing an open-source framework along with training data, our study puts forth an efficient deep learning template for crowd counting aquatic animals thereby contributing effective methods to assess natural populations from the ever-increasing visual data.
Address
Corporate Author				Thesis
Publisher	Public Library of Science	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HuPBA			Approved	no
Call Number	Admin @ si @ TCC2022			Serial	3743
Permanent link to this record