Publicacions CVC -- Query Results

	Publicacions CVC Home \| Show All \| Simple Search \| Advanced Search \| Add Record \| Import	Login Quick Search: Field: contains: ...
	1–15 of 140 records found matching your query (RSS \| history):

Search & Display Options

Select All Deselect All

<< 1 2 3 4 5 6 7 8 9 10 >>

List View

Citations

Details

	Records
	Author	Ana Garcia Rodriguez; Yael Tudela; Henry Cordova; S. Carballal; I. Ordas; L. Moreira; E. Vaquero; O. Ortiz; L. Rivero; F. Javier Sanchez; Miriam Cuatrecasas; Maria Pellise; Jorge Bernal; Gloria Fernandez Esparrach
	Title	First in Vivo Computer-Aided Diagnosis of Colorectal Polyps using White Light Endoscopy			Type	Journal Article
	Year	2022	Publication	Endoscopy	Abbreviated Journal	END
	Volume	54	Issue		Pages
	Keywords
	Abstract
	Address	2022/04/14
	Corporate Author				Thesis
	Publisher	Georg Thieme Verlag KG	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	ISE			Approved	no
	Call Number	Admin @ si @ GTC2022a			Serial	3746
Permanent link to this record



	Author	Y. Mori; M.Misawa; Jorge Bernal; M. Bretthauer; S.Kudo; A. Rastogi; Gloria Fernandez Esparrach
	Title	Artificial Intelligence for Disease Diagnosis-the Gold Standard Challenge			Type	Journal Article
	Year	2022	Publication	Gastrointestinal Endoscopy	Abbreviated Journal
	Volume	96	Issue	2	Pages	370-372
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	ISE			Approved	no
	Call Number	Admin @ si @ MMB2022			Serial	3701
Permanent link to this record



	Author	Utkarsh Porwal; Alicia Fornes; Faisal Shafait (eds)
	Title	Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition. 18th International Conference, ICFHR 2022			Type	Book Whole
	Year	2022	Publication	Frontiers in Handwriting Recognition.	Abbreviated Journal
	Volume	13639	Issue		Pages
	Keywords
	Abstract
	Address	ICFHR 2022, Hyderabad, India, December 4–7, 2022
	Corporate Author				Thesis
	Publisher	Springer	Place of Publication		Editor	Utkarsh Porwal; Alicia Fornes; Faisal Shafait
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-3-031-21648-0	Medium
	Area		Expedition		Conference	ICFHR
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ PFS2022			Serial	3809
Permanent link to this record



	Author	Juan Borrego-Carazo; Carles Sanchez; David Castells; Jordi Carrabina; Debora Gil
	Title	A benchmark for the evaluation of computational methods for bronchoscopic navigation			Type	Journal Article
	Year	2022	Publication	International Journal of Computer Assisted Radiology and Surgery	Abbreviated Journal	IJCARS
	Volume	17	Issue	1	Pages
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	IAM			Approved	no
	Call Number	Admin @ si @ BSC2022			Serial	3832
Permanent link to this record



	Author	Antoni Rosell; Sonia Baeza; S. Garcia-Reina; JL. Mate; Ignasi Guasch; I. Nogueira; I. Garcia-Olive; Guillermo Torres; Carles Sanchez; Debora Gil
	Title	EP01.05-001 Radiomics to Increase the Effectiveness of Lung Cancer Screening Programs. Radiolung Preliminary Results			Type	Journal Article
	Year	2022	Publication	Journal of Thoracic Oncology	Abbreviated Journal	JTO
	Volume	17	Issue	9	Pages	S182
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	IAM			Approved	no
	Call Number	Admin @ si @ RBG2022b			Serial	3834
Permanent link to this record



	Author	Antoni Rosell; Sonia Baeza; S. Garcia-Reina; JL. Mate; Ignasi Guasch; I. Nogueira; I. Garcia-Olive; Guillermo Torres; Carles Sanchez; Debora Gil
	Title	Radiomics to increase the effectiveness of lung cancer screening programs. Radiolung preliminary results.			Type	Journal Article
	Year	2022	Publication	European Respiratory Journal	Abbreviated Journal	ERJ
	Volume	60	Issue	66	Pages
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	IAM			Approved	no
	Call Number	Admin @ si @ RBG2022c			Serial	3835
Permanent link to this record



	Author	Akhil Gurram
	Title	Monocular Depth Estimation for Autonomous Driving			Type	Book Whole
	Year	2022	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	3D geometric information is essential for on-board perception in autonomous driving and driver assistance. Autonomous vehicles (AVs) are equipped with calibrated sensor suites. As part of these suites, we can find LiDARs, which are expensive active sensors in charge of providing the 3D geometric information. Depending on the operational conditions for the AV, calibrated stereo rigs may be also sufficient for obtaining 3D geometric information, being these rigs less expensive and easier to install than LiDARs. However, ensuring a proper maintenance and calibration of these types of sensors is not trivial. Accordingly, there is an increasing interest on performing monocular depth estimation (MDE) to obtain 3D geometric information on-board. MDE is very appealing since it allows for appearance and depth being on direct pixelwise correspondence without further calibration. Moreover, a set of single cameras with MDE capabilities would still be a cheap solution for on-board perception, relatively easy to integrate and maintain in an AV. Best MDE models are based on Convolutional Neural Networks (CNNs) trained in a supervised manner, i.e., assuming pixelwise ground truth (GT). Accordingly, the overall goal of this PhD is to study methods for improving CNN-based MDE accuracy under different training settings. More specifically, this PhD addresses different research questions that are described below. When we started to work in this PhD, state-of-theart methods for MDE were already based on CNNs. In fact, a promising line of work consisted in using image-based semantic supervision (i.e., pixel-level class labels) while training CNNs for MDE using LiDAR-based supervision (i.e., depth). It was common practice to assume that the same raw training data are complemented by both types of supervision, i.e., with depth and semantic labels. However, in practice, it was more common to find heterogeneous datasets with either only depth supervision or only semantic supervision. Therefore, our first work was to research if we could train CNNs for MDE by leveraging depth and semantic information from heterogeneous datasets. We show that this is indeed possible, and we surpassed the state-of-the-art results on MDE at the time we did this research. To achieve our results, we proposed a particular CNN architecture and a new training protocol. After this research, it was clear that the upper-bound setting to train CNN-based MDE models consists in using LiDAR data as supervision. However, it would be cheaper and more scalable if we would be able to train such models from monocular sequences. Obviously, this is far more challenging, but worth to research. Training MDE models using monocular sequences is possible by relying on structure-from-motion (SfM) principles to generate self-supervision. Nevertheless, problems of camouflaged objects, visibility changes, static-camera intervals, textureless areas, and scale ambiguity, diminish the usefulness of such self-supervision. To alleviate these problems, we perform MDE by virtual-world supervision and real-world SfM self-supervision. We call our proposalMonoDEVSNet. We compensate the SfM self-supervision limitations by leveraging virtual-world images with accurate semantic and depth supervision, as well as addressing the virtual-to-real domain gap. MonoDEVSNet outperformed previous MDE CNNs trained on monocular and even stereo sequences. We have publicly released MonoDEVSNet at <https://github.com/HMRC-AEL/MonoDEVSNet>. Finally, since MDE is performed to produce 3D information for being used in downstream tasks related to on-board perception. We also address the question of whether the standard metrics for MDE assessment are a good indicator for future MDE-based driving-related perception tasks. By using 3D object detection on point clouds as proxy of on-board perception, we conclude that, indeed, MDE evaluation metrics give rise to a ranking of methods which reflects relatively well the 3D object detection results we may expect.
	Address	March, 2022
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	IMPRIMA	Place of Publication		Editor	Antonio Lopez;Onay Urfalioglu
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-124793-0-0	Medium
	Area		Expedition		Conference
	Notes	ADAS			Approved	no
	Call Number	Admin @ si @ Gur2022			Serial	3712
Permanent link to this record



	Author	Joakim Bruslund Haurum; Meysam Madadi; Sergio Escalera; Thomas B. Moeslund
	Title	Multi-scale hybrid vision transformer and Sinkhorn tokenizer for sewer defect classification			Type	Journal Article
	Year	2022	Publication	Automation in Construction	Abbreviated Journal	AC
	Volume	144	Issue		Pages	104614
	Keywords	Sewer Defect Classification; Vision Transformers; Sinkhorn-Knopp; Convolutional Neural Networks; Closed-Circuit Television; Sewer Inspection
	Abstract	A crucial part of image classification consists of capturing non-local spatial semantics of image content. This paper describes the multi-scale hybrid vision transformer (MSHViT), an extension of the classical convolutional neural network (CNN) backbone, for multi-label sewer defect classification. To better model spatial semantics in the images, features are aggregated at different scales non-locally through the use of a lightweight vision transformer, and a smaller set of tokens was produced through a novel Sinkhorn clustering-based tokenizer using distinct cluster centers. The proposed MSHViT and Sinkhorn tokenizer were evaluated on the Sewer-ML multi-label sewer defect classification dataset, showing consistent performance improvements of up to 2.53 percentage points.
	Address	Dec 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HuPBA			Approved	no
	Call Number	Admin @ si @ BME2022c			Serial	3780
Permanent link to this record



	Author	Danna Xue; Fei Yang; Pei Wang; Luis Herranz; Jinqiu Sun; Yu Zhu; Yanning Zhang
	Title	SlimSeg: Slimmable Semantic Segmentation with Boundary Supervision			Type	Conference Article
	Year	2022	Publication	30th ACM International Conference on Multimedia	Abbreviated Journal
	Volume		Issue		Pages	6539-6548
	Keywords
	Abstract	Accurate semantic segmentation models typically require significant computational resources, inhibiting their use in practical applications. Recent works rely on well-crafted lightweight models to achieve fast inference. However, these models cannot flexibly adapt to varying accuracy and efficiency requirements. In this paper, we propose a simple but effective slimmable semantic segmentation (SlimSeg) method, which can be executed at different capacities during inference depending on the desired accuracy-efficiency tradeoff. More specifically, we employ parametrized channel slimming by stepwise downward knowledge distillation during training. Motivated by the observation that the differences between segmentation results of each submodel are mainly near the semantic borders, we introduce an additional boundary guided semantic segmentation loss to further improve the performance of each submodel. We show that our proposed SlimSeg with various mainstream networks can produce flexible models that provide dynamic adjustment of computational cost and better performance than independent models. Extensive experiments on semantic segmentation benchmarks, Cityscapes and CamVid, demonstrate the generalization ability of our framework.
	Address	Lisboa, Portugal, October 2022
	Corporate Author				Thesis
	Publisher	Association for Computing Machinery	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-1-4503-9203-7	Medium
	Area		Expedition		Conference	MM
	Notes	MACO; 600.161; 601.400			Approved	no
	Call Number	Admin @ si @ XYW2022			Serial	3758
Permanent link to this record



	Author	Javad Zolfaghari Bengar; Joost Van de Weijer; Laura Lopez-Fuentes; Bogdan Raducanu
	Title	Class-Balanced Active Learning for Image Classification			Type	Conference Article
	Year	2022	Publication	Winter Conference on Applications of Computer Vision	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Active learning aims to reduce the labeling effort that is required to train algorithms by learning an acquisition function selecting the most relevant data for which a label should be requested from a large unlabeled data pool. Active learning is generally studied on balanced datasets where an equal amount of images per class is available. However, real-world datasets suffer from severe imbalanced classes, the so called long-tail distribution. We argue that this further complicates the active learning process, since the imbalanced data pool can result in suboptimal classifiers. To address this problem in the context of active learning, we proposed a general optimization framework that explicitly takes class-balancing into account. Results on three datasets showed that the method is general (it can be combined with most existing active learning algorithms) and can be effectively applied to boost the performance of both informative and representative-based active learning methods. In addition, we showed that also on balanced datasets our method 1 generally results in a performance gain.
	Address	Virtual; Waikoloa; Hawai; USA; January 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WACV
	Notes	LAMP; 602.200; 600.147; 600.120			Approved	no
	Call Number	Admin @ si @ ZWL2022			Serial	3703
Permanent link to this record



	Author	Michael Teutsch; Angel Sappa; Riad I. Hammoud
	Title	Cross-Spectral Image Processing			Type	Book Chapter
	Year	2022	Publication	Computer Vision in the Infrared Spectrum. Synthesis Lectures on Computer Vision	Abbreviated Journal
	Volume		Issue		Pages	23-34
	Keywords
	Abstract	Although this book is on IR computer vision and its main focus lies on IR image and video processing and analysis, a special attention is dedicated to cross-spectral image processing due to the increasing number of publications and applications in this domain. In these cross-spectral frameworks, IR information is used together with information from other spectral bands to tackle some specific problems by developing more robust solutions. Tasks considered for cross-spectral processing are for instance dehazing, segmentation, vegetation index estimation, or face recognition. This increasing number of applications is motivated by cross- and multi-spectral camera setups available already on the market like for example smartphones, remote sensing multispectral cameras, or multi-spectral cameras for automotive systems or drones. In this chapter, different cross-spectral image processing techniques will be reviewed together with possible applications. Initially, image registration approaches for the cross-spectral case are reviewed: the registration stage is the first image processing task, which is needed to align images acquired by different sensors within the same reference coordinate system. Then, recent cross-spectral image colorization approaches, which are intended to colorize infrared images for different applications are presented. Finally, the cross-spectral image enhancement problem is tackled by including guided super resolution techniques, image dehazing approaches, cross-spectral filtering and edge detection. Figure 3.1 illustrates cross-spectral image processing stages as well as their possible connections. Table 3.1 presents some of the available public cross-spectral datasets generally used as reference data to evaluate cross-spectral image registration, colorization, enhancement, or exploitation results.
	Address
	Corporate Author				Thesis
	Publisher	Springer	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	SLCV
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-3-031-00698-2	Medium
	Area		Expedition		Conference
	Notes	MSIAU; MACO			Approved	no
	Call Number	Admin @ si @ TSH2022b			Serial	3805
Permanent link to this record



	Author	Vacit Oguz Yazici
	Title	Towards Smart Fashion: Visual Recognition of Products and Attributes			Type	Book Whole
	Year	2022	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Artificial intelligence is innovating the fashion industry by proposing new applications and solutions to the problems encountered by researchers and engineers working in the industry. In this thesis, we address three of these problems. In the first part of the thesis, we tackle the problem of multi-label image classification which is very related to fashion attribute recognition. In the second part of the thesis, we address two problems that are specific to fashion. Firstly, we address the problem of main product detection which is the task of associating correct image parts (e.g. bounding boxes) with the fashion product being sold. Secondly, we address the problem of color naming for multicolored fashion items. The task of multi-label image classification consists in assigning various concepts such as objects or attributes to images. Usually, there are dependencies that can be learned between the concepts to capture label correlations (chair and table classes are more likely to co-exist than chair and giraffe). If we treat the multi-label image classification problem as an orderless set prediction problem, we can exploit recurrent neural networks (RNN) to capture label correlations. However, RNNs are trained to predict ordered sequences of tokens, so if the order of the predicted sequence is different than the order of the ground truth sequence, there will be penalization although the predictions are correct. Therefore, in the first part of the thesis, we propose an orderless loss function which will order the labels in the ground truth sequence dynamically in a way that the minimum loss is achieved. This results in a significant improvement of RNN models on multi-label image classification over the previous methods. However, RNNs suffer from long term dependencies when the cardinality of set grows bigger. The decoding process might stop early if the current hidden state cannot find any object and outputs the termination token. This would cause the remaining classes not to be predicted and lower recall metric. Transformers can be used to avoid the long term dependency problem exploiting their selfattention modules that process sequential data simultaneously. Consequently, we propose a novel transformer model for multi-label image classification which surpasses the state-of-the-art results by a large margin. In the second part of thesis, we focus on two fashion-specific problems. Main product detection is the task of associating image parts with the fashion product that is being sold, generally using associated textual metadata (product title or description). Normally, in fashion e-commerces, products are represented by multiple images where a person wears the product along with other fashion items. If all the fashion items in the images are marked with bounding boxes, we can use the textual metadata to decide which item is the main product. The initial work treated each of these images independently, discarding the fact that they all belong to the same product. In this thesis, we represent the bounding boxes from all the images as nodes in a fully connected graph. This allows the algorithm to learn relations between the nodes during training and take the entire context into account for the final decision. Our algorithm results in a significant improvement of the state-ofthe-art. Moreover, we address the problem of color naming for multicolored fashion items, which is a challenging task due to the external factors such as illumination changes or objects that act as clutter. In the context of multi-label classification, the vaguely defined lines between the classes in the color space cause ambiguity. For example, a shade of blue which is very close to green might cause the model to incorrectly predict the color blue and green at the same time. Based on this, models trained for color naming are expected to recognize the colors and their quantities in both single colored and multicolored fashion items. Therefore, in this thesis, we propose a novel architecture with an additional head that explicitly estimates the number of colors in fashion items. This removes the ambiguity problem and results in better color naming performance.
	Address	January 2022
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	IMPRIMA	Place of Publication		Editor	Joost Van de Weijer;Arnau Ramisa
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-122714-6-1	Medium
	Area		Expedition		Conference
	Notes	LAMP			Approved	no
	Call Number	Admin @ si @ Ogu2022			Serial	3631
Permanent link to this record



	Author	Asma Bensalah; Alicia Fornes; Cristina Carmona_Duarte; Josep Llados
	Title	Easing Automatic Neurorehabilitation via Classification and Smoothness Analysis			Type	Conference Article
	Year	2022	Publication	Intertwining Graphonomics with Human Movements. 20th International Conference of the International Graphonomics Society, IGS 2022	Abbreviated Journal
	Volume	13424	Issue		Pages	336-348
	Keywords	Neurorehabilitation; Upper-lim; Movement classification; Movement smoothness; Deep learning; Jerk
	Abstract	Assessing the quality of movements for post-stroke patients during the rehabilitation phase is vital given that there is no standard stroke rehabilitation plan for all the patients. In fact, it depends basically on the patient’s functional independence and its progress along the rehabilitation sessions. To tackle this challenge and make neurorehabilitation more agile, we propose an automatic assessment pipeline that starts by recognising patients’ movements by means of a shallow deep learning architecture, then measuring the movement quality using jerk measure and related measures. A particularity of this work is that the dataset used is clinically relevant, since it represents movements inspired from Fugl-Meyer a well common upper-limb clinical stroke assessment scale for stroke patients. We show that it is possible to detect the contrast between healthy and patients movements in terms of smoothness, besides achieving conclusions about the patients’ progress during the rehabilitation sessions that correspond to the clinicians’ findings about each case.
	Address	June 7-9, 2022, Las Palmas de Gran Canaria, Spain
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	IGS
	Notes	DAG; 600.121; 600.162; 602.230; 600.140			Approved	no
	Call Number	Admin @ si @ BFC2022			Serial	3738
Permanent link to this record



	Author	Michael Teutsch; Angel Sappa; Riad I. Hammoud
	Title	Detection, Classification, and Tracking			Type	Book Chapter
	Year	2022	Publication	Computer Vision in the Infrared Spectrum. Synthesis Lectures on Computer Vision	Abbreviated Journal
	Volume		Issue		Pages	35-58
	Keywords
	Abstract	Automatic image and video exploitation or content analysis is a technique to extract higher-level information from a scene such as objects, behavior, (inter-)actions, environment, or even weather conditions. The relevant information is assumed to be contained in the two-dimensional signal provided in an image (width and height in pixels) or the three-dimensional signal provided in a video (width, height, and time). But also intermediate-level information such as object classes [196], locations [197], or motion [198] can help applications to fulfill certain tasks such as intelligent compression [199], video summarization [200], or video retrieval [201]. Usually, videos with their temporal dimension are a richer source of data compared to single images [202] and thus certain video content can be extracted from videos only such as object motion or object behavior. Often, machine learning or nowadays deep learning techniques are utilized to model prior knowledge about object or scene appearance using labeled training samples [203, 204]. After a learning phase, these models are then applied in real world applications, which is called inference.
	Address
	Corporate Author				Thesis
	Publisher	Springer	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	SLCV
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-3-031-00698-2	Medium
	Area		Expedition		Conference
	Notes	MSIAU; MACO			Approved	no
	Call Number	Admin @ si @ TSH2022c			Serial	3806
Permanent link to this record



	Author	Victoria Ruiz; Angel Sanchez; Jose F. Velez; Bogdan Raducanu
	Title	Waste Classification with Small Datasets and Limited Resources			Type	Book Chapter
	Year	2022	Publication	ICT Applications for Smart Cities. Intelligent Systems Reference Library	Abbreviated Journal
	Volume	224	Issue		Pages	185-203
	Keywords
	Abstract	Automatic waste recycling has become a very important societal challenge nowadays, raising people’s awareness for a cleaner environment and a more sustainable lifestyle. With the transition to Smart Cities, and thanks to advanced ICT solutions, this problem has received a new impulse. The waste recycling focus has shifted from general waste treating facilities to an individual responsibility, where each person should become aware of selective waste separation. The surge of the mobile devices, accompanied by a significant increase in computation power, has potentiated and facilitated this individual role. An automated image-based waste classification mechanism can help with a more efficient recycling and a reduction of contamination from residuals. Despite the good results achieved with the deep learning methodologies for this task, the Achille’s heel is that they require large neural networks which need significant computational resources for training and therefore are not suitable for mobile devices. To circumvent this apparently intractable problem, we will rely on knowledge distillation in order to transfer the network’s knowledge from a larger network (called ‘teacher’) to a smaller, more compact one, (referred as ‘student’) and thus making it possible the task of image classification on a device with limited resources. For evaluation, we considered as ‘teachers’ large architectures such as InceptionResNet or DenseNet and as ‘students’, several configurations of the MobileNets. We used the publicly available TrashNet dataset to demonstrate that the distillation process does not significantly affect system’s performance (e.g. classification accuracy) of the student network.
	Address	September 2022
	Corporate Author				Thesis
	Publisher	Springer	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	ISRL
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-3-031-06306-0	Medium
	Area		Expedition		Conference
	Notes	LAMP			Approved	no
	Call Number	Admin @ si @			Serial	3813
Permanent link to this record