Weight Decay Scheduling and Knowledge Distillation for Active Learning

Although convolutional neural networks perform extremely well on numerous computer vision tasks, a considerable amount of labeled data is required to ensure a good outcome. Data labeling is labor-intensive, and in some cases the labeling budget may be limited. Active learning is a technique that can reduce the amount of labeling required: the neural network itself selects the unlabeled samples that are most helpful for learning and requests their labels from a human annotator. Most existing active learning methods have focused on acquisition functions for effectively selecting informative samples. In this paper, however, we focus on the data-incremental nature of active learning and propose a method for properly tuning the weight decay as the amount of labeled data increases. We also demonstrate that performance can be improved by knowledge distillation using a low-performance teacher model trained in the previous acquisition step. In addition, we present a novel perspective on weight decay, which provides a regularization effect by limiting the number of effective parameters and channels in the convolutional filters. We validate our methods on the MNIST, CIFAR-10, and CIFAR-100 datasets using convolutional neural networks of various sizes.
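The abstract does not specify the exact weight decay schedule or distillation loss, so the sketch below is only illustrative of the two ideas it describes: a weight decay that is retuned as the labeled pool grows (here assumed to scale inversely with the pool size) and a temperature-scaled KL distillation term that uses the model from the previous acquisition step as the teacher. The function and parameter names (acquisition_step, base_wd, kd_weight, temperature) are placeholders, not taken from the paper.

# Minimal sketch of one active-learning acquisition step with weight decay
# scheduling and distillation from the previous-step model. The inverse
# schedule and the KD weighting below are assumptions, not the paper's
# exact formulation.
import copy
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader

def acquisition_step(model, labeled_set, base_wd, base_size,
                     prev_model=None, kd_weight=0.5, temperature=4.0,
                     epochs=10, lr=0.1, device="cpu"):
    """Train `model` on the current labeled pool.

    base_wd    -- weight decay tuned for a pool of `base_size` labeled samples
    prev_model -- model from the previous acquisition step (teacher), may be None
    """
    # Assumed schedule: shrink the weight decay as the labeled pool grows.
    wd = base_wd * base_size / len(labeled_set)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr,
                                momentum=0.9, weight_decay=wd)
    loader = DataLoader(labeled_set, batch_size=128, shuffle=True)

    if prev_model is not None:
        prev_model = copy.deepcopy(prev_model).to(device).eval()

    model.to(device).train()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            logits = model(x)
            loss = F.cross_entropy(logits, y)
            if prev_model is not None:
                with torch.no_grad():
                    teacher_logits = prev_model(x)
                # Soft-target KL term; even a low-performance teacher from the
                # previous step can provide a useful regularization signal.
                kd = F.kl_div(
                    F.log_softmax(logits / temperature, dim=1),
                    F.softmax(teacher_logits / temperature, dim=1),
                    reduction="batchmean") * temperature ** 2
                loss = (1 - kd_weight) * loss + kd_weight * kd
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model

In an active learning loop, this function would be called once per labeling round, with the newly returned model kept as `prev_model` for the next round.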
Publisher
Springer Verlag
Issue Date
2020-08-23
Language
English
Citation

16th European Conference on Computer Vision, ECCV 2020, pp. 431-447

ISSN
0302-9743
DOI
10.1007/978-3-030-58574-7_26
URI
http://hdl.handle.net/10203/278495
Appears in Collection
EE-Conference Papers