Distilling from professors: Enhancing the knowledge distillation of teachers

Cited 9 time in webofscience Cited 0 time in scopus
  • Hit : 241
  • Download : 0
Knowledge distillation (KD) is a successful technique for transferring knowledge from one machine learning model to another model. Specifically, the idea of KD has been widely used for various tasks such as model compression and knowledge transfer between different models. However, existing studies in KD have overlooked the possibility that dark knowledge (i.e., soft targets) obtained from a complex and large model (a.k.a., a teacher model) may be either incorrect or insufficient. Such knowledge can hinder the effective learning of another small model (a.k.a., a student model). In this paper, we propose the professor model, which refines the soft target from the teacher model to improve KD. The professor model aims to achieve two goals; 1) improving the prediction accuracy and 2) capturing the inter-class correlation of the soft target from the teacher model. We first design the professor model by reformulating a conditional adversarial autoencoder (CAAE). Then, we devise two KD strategies using both teacher and professor models. Our empirical study demonstrates that the professor model effectively improves KD in three benchmark datasets: CIFAR100, TinyImagenet, and ILSVRC2015. Moreover, our comprehensive analysis shows that the professor model is much more effective than employing the stronger teacher model, in which parameters are greater than the sum of the teacher's and professor's parameters. Since the proposed model is model-agnostic, our model can be combined with any KD algorithm and consistently improves various KD techniques. (c) 2021 Elsevier Inc. All rights reserved.
Publisher
ELSEVIER SCIENCE INC
Issue Date
2021-10
Language
English
Article Type
Article
Citation

INFORMATION SCIENCES, v.576, pp.743 - 755

ISSN
0020-0255
DOI
10.1016/j.ins.2021.08.020
URI
http://hdl.handle.net/10203/297164
Appears in Collection
AI-Journal Papers(저널논문)
Files in This Item
There are no files associated with this item.
This item is cited by other documents in WoS
⊙ Detail Information in WoSⓡ Click to see webofscience_button
⊙ Cited 9 items in WoS Click to see citing articles in records_button

qr_code

  • mendeley

    citeulike


rss_1.0 rss_2.0 atom_1.0