Publicacions CVC -- Query Results

	166–180 of 3396 records found matching your query:

	Records
	Author	Razieh Rastgoo; Kourosh Kiani; Sergio Escalera
	Title	A Non-Anatomical Graph Structure for isolated hand gesture separation in continuous gesture sequences			Type	Miscellaneous
	Year	2022	Publication	Arxiv	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Continuous Hand Gesture Recognition (CHGR) has been extensively studied by researchers in the last few decades. Recently, one model has been presented to deal with the challenge of the boundary detection of isolated gestures in a continuous gesture video [17]. To enhance the model performance and also replace the handcrafted feature extractor in the model presented in [17], we propose a GCN model and combine it with the stacked Bi-LSTM and Attention modules to push the temporal information in the video stream. Considering the breakthroughs of GCN models for skeleton modality, we propose a two-layer GCN model to empower the 3D hand skeleton features. Finally, the class probabilities of each isolated gesture are fed to the post-processing module, borrowed from [17]. Furthermore, we replace the anatomical graph structure with some non-anatomical graph structures. Due to the lack of a large dataset, including both the continuous gesture sequences and the corresponding isolated gestures, three public datasets in Dynamic Hand Gesture Recognition (DHGR), RKS-PERSIANSIGN, and ASLVID, are used for evaluation. Experimental results show the superiority of the proposed model in dealing with isolated gesture boundary detection in continuous gesture sequences.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HuPBA; no menciona			Approved	no
	Call Number	Admin @ si @ RKE2022d			Serial	3828

	Author	Marco Cotogni; Fei Yang; Claudio Cusano; Andrew Bagdanov; Joost Van de Weijer
	Title	Gated Class-Attention with Cascaded Feature Drift Compensation for Exemplar-free Continual Learning of Vision Transformers			Type	Miscellaneous
	Year	2022	Publication	Arxiv	Abbreviated Journal
	Volume		Issue		Pages
	Keywords	Marco Cotogni, Fei Yang, Claudio Cusano, Andrew D. Bagdanov, Joost van de Weijer
	Abstract	We propose a new method for exemplar-free class incremental training of ViTs. The main challenge of exemplar-free continual learning is maintaining plasticity of the learner without causing catastrophic forgetting of previously learned tasks. This is often achieved via exemplar replay, which can help recalibrate previous task classifiers to the feature drift which occurs when learning new tasks. Exemplar replay, however, comes at the cost of retaining samples from previous tasks, which for many applications may not be possible. To address the problem of continual ViT training, we first propose gated class-attention to minimize the drift in the final ViT transformer block. This mask-based gating is applied to the class-attention mechanism of the last transformer block and strongly regulates the weights crucial for previous tasks. Importantly, gated class-attention does not require the task-ID during inference, which distinguishes it from other parameter isolation methods. Secondly, we propose a new method of feature drift compensation that accommodates feature drift in the backbone when learning new tasks. The combination of gated class-attention and cascaded feature drift compensation allows for plasticity towards new tasks while limiting forgetting of previous ones. Extensive experiments performed on CIFAR-100, Tiny-ImageNet and ImageNet100 demonstrate that our exemplar-free method obtains competitive results when compared to rehearsal-based ViT methods.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	LAMP; no proj			Approved	no
	Call Number	Admin @ si @ CYC2022			Serial	3827

	Author	Souhail Bakkali; Zuheng Ming; Mickael Coustaty; Marçal Rusiñol; Oriol Ramos Terrades
	Title	VLCDoC: Vision-Language Contrastive Pre-Training Model for Cross-Modal Document Classification			Type	Journal Article
	Year	2023	Publication	Pattern Recognition	Abbreviated Journal	PR
	Volume	139	Issue		Pages	109419
	Keywords
	Abstract	Multimodal learning from document data has achieved great success lately as it allows pre-training semantically meaningful features as a prior into a learnable downstream approach. In this paper, we approach the document classification problem by learning cross-modal representations through language and vision cues, considering intra- and inter-modality relationships. Instead of merging features from different modalities into a common representation space, the proposed method exploits high-level interactions and learns relevant semantic information from effective attention flows within and across modalities. The proposed learning objective is devised between intra- and inter-modality alignment tasks, where the similarity distribution per task is computed by contracting positive sample pairs while simultaneously contrasting negative ones in the common feature representation space. Extensive experiments on public document classification datasets demonstrate the effectiveness and the generalization capacity of our model on both low-scale and large-scale datasets.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	ISSN 0031-3203	ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.140; 600.121			Approved	no
	Call Number	Admin @ si @ BMC2023			Serial	3826

	Author	Ruben Tito; Dimosthenis Karatzas; Ernest Valveny
	Title	Hierarchical multimodal transformers for Multi-Page DocVQA			Type	Journal Article
	Year	2023	Publication	Pattern Recognition	Abbreviated Journal	PR
	Volume	144	Issue		Pages	109834
	Keywords
	Abstract	Document Visual Question Answering (DocVQA) refers to the task of answering questions from document images. Existing work on DocVQA only considers single-page documents. However, in real scenarios documents are mostly composed of multiple pages that should be processed together. In this work we extend DocVQA to the multi-page scenario. For that, we first create a new dataset, MP-DocVQA, where questions are posed over multi-page documents instead of single pages. Second, we propose a new hierarchical method, Hi-VT5, based on the T5 architecture, that overcomes the limitations of current methods to process long multi-page documents. The proposed method is based on a hierarchical transformer architecture where the encoder summarizes the most relevant information of every page and the decoder then takes this summarized information to generate the final answer. Through extensive experimentation, we demonstrate that our method is able, in a single stage, to answer the questions and provide the page that contains the relevant information to find the answer, which can be used as a kind of explainability measure.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	ISSN 0031-3203	ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.155; 600.121			Approved	no
	Call Number	Admin @ si @ TKV2023			Serial	3825

	Author	Razieh Rastgoo; Kourosh Kiani; Sergio Escalera
	Title	Word separation in continuous sign language using isolated signs and post-processing			Type	Miscellaneous
	Year	2022	Publication	Arxiv	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Continuous Sign Language Recognition (CSLR) is a long-standing challenge in Computer Vision due to the difficulties in detecting the explicit boundaries between the words in a sign sentence. To deal with this challenge, we propose a two-stage model. In the first stage, the predictor model, which includes a combination of CNN, SVD, and LSTM, is trained with the isolated signs. In the second stage, we apply a post-processing algorithm to the Softmax outputs obtained from the first part of the model in order to separate the isolated signs in the continuous signs. Due to the lack of a large dataset, including both the sign sequences and the corresponding isolated signs, two public datasets in Isolated Sign Language Recognition (ISLR), RKS-PERSIANSIGN and ASLVID, are used for evaluation. Results on the continuous sign videos confirm the efficiency of the proposed model in dealing with isolated sign boundary detection.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HUPBA; no menciona			Approved	no
	Call Number	Admin @ si @ RKE2022b			Serial	3824

	Author	Javier Selva; Anders S. Johansen; Sergio Escalera; Kamal Nasrollahi; Thomas B. Moeslund; Albert Clapes
	Title	Video transformers: A survey			Type	Journal Article
	Year	2023	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
	Volume	45	Issue	11	Pages	12922-12943
	Keywords	Artificial Intelligence; Computer Vision; Self-Attention; Transformers; Video Representations
	Abstract	Transformer models have shown great success handling long-range interactions, making them a promising tool for modeling video. However, they lack inductive biases and scale quadratically with input length. These limitations are further exacerbated when dealing with the high dimensionality introduced by the temporal dimension. While there are surveys analyzing the advances of Transformers for vision, none focus on an in-depth analysis of video-specific designs. In this survey, we analyze the main contributions and trends of works leveraging Transformers to model video. Specifically, we delve into how videos are handled at the input level first. Then, we study the architectural changes made to deal with video more efficiently, reduce redundancy, re-introduce useful inductive biases, and capture long-term temporal dynamics. In addition, we provide an overview of different training regimes and explore effective self-supervised learning strategies for video. Finally, we conduct a performance comparison on the most common benchmark for Video Transformers (i.e., action classification), finding them to outperform 3D ConvNets even with less computational complexity.
	Address	1 Nov. 2023
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HUPBA; no menciona			Approved	no
	Call Number	Admin @ si @ SJE2023			Serial	3823

	Author	Arya Farkhondeh; Cristina Palmero; Simone Scardapane; Sergio Escalera
	Title	Towards Self-Supervised Gaze Estimation			Type	Miscellaneous
	Year	2022	Publication	Arxiv	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Recent joint embedding-based self-supervised methods have surpassed standard supervised approaches on various image recognition tasks such as image classification. These self-supervised methods aim at maximizing agreement between features extracted from two differently transformed views of the same image, which results in learning an invariant representation with respect to appearance and geometric image transformations. However, the effectiveness of these approaches remains unclear in the context of gaze estimation, a structured regression task that requires equivariance under geometric transformations (e.g., rotations, horizontal flip). In this work, we propose SwAT, an equivariant version of the online clustering-based self-supervised approach SwAV, to learn more informative representations for gaze estimation. We demonstrate that SwAT, with ResNet-50 and supported with uncurated unlabeled face images, outperforms state-of-the-art gaze estimation methods and supervised baselines in various experiments. In particular, we achieve up to 57% and 25% improvements in cross-dataset and within-dataset evaluation tasks on existing benchmarks (ETH-XGaze, Gaze360, and MPIIFaceGaze).
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HUPBA; no menciona			Approved	no
	Call Number	Admin @ si @ FPS2022			Serial	3822

	Author	Ruben Ballester; Xavier Arnal Clemente; Carles Casacuberta; Meysam Madadi; Ciprian Corneanu
	Title	Towards explaining the generalization gap in neural networks using topological data analysis			Type	Miscellaneous
	Year	2022	Publication	Arxiv	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Understanding how neural networks generalize on unseen data is crucial for designing more robust and reliable models. In this paper, we study the generalization gap of neural networks using methods from topological data analysis. For this purpose, we compute homological persistence diagrams of weighted graphs constructed from neuron activation correlations after a training phase, aiming to capture patterns that are linked to the generalization capacity of the network. We compare the usefulness of different numerical summaries from persistence diagrams and show that a combination of some of them can accurately predict and partially explain the generalization gap without the need of a test set. Evaluation on two computer vision recognition tasks (CIFAR10 and SVHN) shows competitive generalization gap prediction when compared against state-of-the-art methods.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HUPBA; no menciona			Approved	no
	Call Number	Admin @ si @ BAC2022			Serial	3821

	Author	Simone Zini; Alex Gomez-Villa; Marco Buzzelli; Bartlomiej Twardowski; Andrew D. Bagdanov; Joost Van de Weijer
	Title	Planckian Jitter: countering the color-crippling effects of color jitter on self-supervised training			Type	Conference Article
	Year	2023	Publication	11th International Conference on Learning Representations	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Several recent works on self-supervised learning are trained by mapping different augmentations of the same image to the same feature representation. The data augmentations used are of crucial importance to the quality of learned feature representations. In this paper, we analyze how the color jitter traditionally used in data augmentation negatively impacts the quality of the color features in learned feature representations. To address this problem, we propose a more realistic, physics-based color data augmentation – which we call Planckian Jitter – that creates realistic variations in chromaticity and produces a model robust to illumination changes that can be commonly observed in real life, while maintaining the ability to discriminate image content based on color information. Experiments confirm that such a representation is complementary to the representations learned with the currently-used color jitter augmentation and that a simple concatenation leads to significant performance gains on a wide range of downstream datasets. In addition, we present a color sensitivity analysis that documents the impact of different training methods on model neurons and shows that the performance of the learned features is robust with respect to illuminant variations.
	Address	1–5 May 2023, Kigali, Rwanda
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICLR
	Notes	LAMP; 600.147; 611.008; 5300006			Approved	no
	Call Number	Admin @ si @ ZGB2023			Serial	3820

	Author	Saiping Zhang; Luis Herranz; Marta Mrak; Marc Gorriz Blanch; Shuai Wan; Fuzheng Yang
	Title	PeQuENet: Perceptual Quality Enhancement of Compressed Video with Adaptation- and Attention-based Network			Type	Miscellaneous
	Year	2022	Publication	Arxiv	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	In this paper we propose a generative adversarial network (GAN) framework to enhance the perceptual quality of compressed videos. Our framework includes attention and adaptation to different quantization parameters (QPs) in a single model. The attention module exploits global receptive fields that can capture and align long-range correlations between consecutive frames, which can be beneficial for enhancing perceptual quality of videos. The frame to be enhanced is fed into the deep network together with its neighboring frames, and in the first stage features at different depths are extracted. Then extracted features are fed into attention blocks to explore global temporal correlations, followed by a series of upsampling and convolution layers. Finally, the resulting features are processed by the QP-conditional adaptation module which leverages the corresponding QP information. In this way, a single model can be used to enhance adaptively to various QPs without requiring multiple models specific for every QP value, while having similar performance. Experimental results demonstrate the superior performance of the proposed PeQuENet compared with the state-of-the-art compressed video quality enhancement algorithms.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	MACO; no proj			Approved	no
	Call Number	Admin @ si @ ZHM2022b			Serial	3819

	Author	Shiqi Yang; Yaxing Wang; Kai Wang; Shangling Jui; Joost Van de Weijer
	Title	One Ring to Bring Them All: Towards Open-Set Recognition under Domain Shift			Type	Miscellaneous
	Year	2022	Publication	Arxiv	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	In this paper, we investigate model adaptation under domain and category shift, where the final goal is to achieve (SF-UNDA), which addresses the situation where there exist both domain and category shifts between source and target domains. Under the SF-UNDA setting, the model cannot access source data anymore during target adaptation, which aims to address data privacy concerns. We propose a novel training scheme to learn a ( +1)-way classifier to predict the source classes and the unknown class, where samples of only known source categories are available for training. Furthermore, for target adaptation, we simply adopt a weighted entropy minimization to adapt the source pretrained model to the unlabeled target domain without source data. In experiments, we show: After source training, the resulting source model can get excellent performance for ; After target adaptation, our method surpasses current UNDA approaches which demand source data during adaptation. The versatility to several different tasks strongly proves the efficacy and generalization ability of our method. When augmented with a closed-set domain adaptation approach during target adaptation, our source-free method further outperforms the current state-of-the-art UNDA method by 2.5%, 7.2% and 13% on Office-31, Office-Home and VisDA respectively.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	LAMP; no proj			Approved	no
	Call Number	Admin @ si @ YWW2022c			Serial	3818

	Author	Ali Furkan Biten; Ruben Tito; Lluis Gomez; Ernest Valveny; Dimosthenis Karatzas
	Title	OCR-IDL: OCR Annotations for Industry Document Library Dataset			Type	Conference Article
	Year	2022	Publication	ECCV Workshop on Text in Everything	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Pretraining has proven successful in Document Intelligence tasks where a deluge of documents is used to pretrain the models, only later to be finetuned on downstream tasks. One of the problems of the pretraining approaches is the inconsistent usage of pretraining data with different OCR engines, leading to incomparable results between models. In other words, it is not obvious whether the performance gain is coming from diverse usage of amount of data and distinct OCR engines or from the proposed models. To remedy the problem, we make public the OCR annotations for IDL documents using a commercial OCR engine, given its superior performance over open source OCR models. The contributed dataset (OCR-IDL) has an estimated monetary value over 20K US$. It is our hope that OCR-IDL can be a starting point for future works on Document Intelligence. All of our data and its collection process with the annotations can be found in this https URL.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ECCV
	Notes	DAG; no proj			Approved	no
	Call Number	Admin @ si @ BTG2022			Serial	3817

	Author	Shiqi Yang; Yaxing Wang; Kai Wang; Shangling Jui; Joost Van de Weijer
	Title	Local Prediction Aggregation: A Frustratingly Easy Source-free Domain Adaptation Method			Type	Miscellaneous
	Year	2022	Publication	Arxiv	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	We propose a simple but effective source-free domain adaptation (SFDA) method. Treating SFDA as an unsupervised clustering problem and following the intuition that local neighbors in feature space should have more similar predictions than other features, we propose to optimize an objective of prediction consistency. This objective encourages local neighborhood features in feature space to have similar predictions while features farther away in feature space have dissimilar predictions, leading to efficient feature clustering and cluster assignment simultaneously. For efficient training, we seek to optimize an upper-bound of the objective resulting in two simple terms. Furthermore, we relate popular existing methods in domain adaptation, source-free domain adaptation and contrastive learning via the perspective of discriminability and diversity. The experimental results prove the superiority of our method, and our method can be adopted as a simple but strong baseline for future research in SFDA. Our method can be also adapted to source-free open-set and partial-set DA which further shows the generalization ability of our method. Code is available in this https URL.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	LAMP; 600.147			Approved	no
	Call Number	Admin @ si @ YWW2022b			Serial	3815

	Author	Swathikiran Sudhakaran; Sergio Escalera; Oswald Lanz
	Title	Gate-Shift-Fuse for Video Action Recognition			Type	Journal Article
	Year	2023	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
	Volume	45	Issue	9	Pages	10913-10928
	Keywords	Action Recognition; Video Classification; Spatial Gating; Channel Fusion
	Abstract	Convolutional Neural Networks are the de facto models for image recognition. However, 3D CNNs, the straightforward extension of 2D CNNs for video recognition, have not achieved the same success on standard action recognition benchmarks. One of the main reasons for this reduced performance of 3D CNNs is the increased computational complexity, requiring large-scale annotated datasets to train them at scale. 3D kernel factorization approaches have been proposed to reduce the complexity of 3D CNNs. Existing kernel factorization approaches follow hand-designed and hard-wired techniques. In this paper we propose Gate-Shift-Fuse (GSF), a novel spatio-temporal feature extraction module which controls interactions in spatio-temporal decomposition and learns to adaptively route features through time and combine them in a data-dependent manner. GSF leverages grouped spatial gating to decompose the input tensor and channel weighting to fuse the decomposed tensors. GSF can be inserted into existing 2D CNNs to convert them into an efficient and high-performing spatio-temporal feature extractor, with negligible parameter and compute overhead. We perform an extensive analysis of GSF using two popular 2D CNN families and achieve state-of-the-art or competitive performance on five standard action recognition benchmarks.
	Address	1 Sept. 2023
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HUPBA; no menciona			Approved	no
	Call Number	Admin @ si @ SEL2023			Serial	3814

	Author	Victoria Ruiz; Angel Sanchez; Jose F. Velez; Bogdan Raducanu
	Title	Waste Classification with Small Datasets and Limited Resources			Type	Book Chapter
	Year	2022	Publication	ICT Applications for Smart Cities. Intelligent Systems Reference Library	Abbreviated Journal
	Volume	224	Issue		Pages	185-203
	Keywords
	Abstract	Automatic waste recycling has become a very important societal challenge nowadays, raising people’s awareness for a cleaner environment and a more sustainable lifestyle. With the transition to Smart Cities, and thanks to advanced ICT solutions, this problem has received a new impulse. The waste recycling focus has shifted from general waste treating facilities to an individual responsibility, where each person should become aware of selective waste separation. The surge of mobile devices, accompanied by a significant increase in computation power, has potentiated and facilitated this individual role. An automated image-based waste classification mechanism can help with more efficient recycling and a reduction of contamination from residuals. Despite the good results achieved with deep learning methodologies for this task, their Achilles’ heel is that they require large neural networks which need significant computational resources for training and therefore are not suitable for mobile devices. To circumvent this apparently intractable problem, we rely on knowledge distillation in order to transfer the network’s knowledge from a larger network (called the ‘teacher’) to a smaller, more compact one (referred to as the ‘student’), thus making image classification possible on a device with limited resources. For evaluation, we considered as ‘teachers’ large architectures such as InceptionResNet or DenseNet and as ‘students’ several configurations of the MobileNets. We used the publicly available TrashNet dataset to demonstrate that the distillation process does not significantly affect the performance (e.g. classification accuracy) of the student network.
	Address	September 2022
	Corporate Author				Thesis
	Publisher	Springer	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	ISRL
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-3-031-06306-0	Medium
	Area		Expedition		Conference
	Notes	LAMP			Approved	no
	Call Number	Admin @ si @			Serial	3813