Data valuation without training of a model (Korean title, translated: "An algorithm for evaluating the value of data without training")

Many recent works on understanding deep learning try to quantify how much individual data instances influence the optimization and generalization of a model, either by analyzing the behavior of the model during training or by measuring the performance gap of the model when the instance is removed from the dataset. Such approaches reveal the characteristics and importance of individual instances, which may provide useful information for diagnosing and improving deep learning. However, most existing works on data valuation require actually training a model, which often incurs a high computational cost. In this paper, we provide a training-free data valuation score, called the complexity-gap score, which is a data-centric score that quantifies the influence of individual instances on the generalization of two-layer overparameterized neural networks. The proposed score can quantify the irregularity of instances and measure how much each data instance contributes to the total movement of the network parameters during training. We theoretically analyze and empirically demonstrate the effectiveness of the complexity-gap score in finding ‘irregular or mislabeled’ data instances, and also provide applications of the score in analyzing datasets and diagnosing training dynamics.
Advisors
Chung, Hyewon (정혜원)
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2023
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2023.2, [v, 35 p.]

Keywords

Data valuation; Generalization error bounds; Complexity-gap score; Data pruning; Training dynamics

URI
http://hdl.handle.net/10203/309423
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1032867&flag=dissertation
Appears in Collection
EE-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
