Development and application of efficient data pruning techniques in deep learning

dc.contributor.advisor: 정혜원
dc.contributor.author: Choi, Hoyong
dc.contributor.author: 최호용
dc.date.accessioned: 2024-08-08T19:31:34Z
dc.date.available: 2024-08-08T19:31:34Z
dc.date.issued: 2024
dc.identifier.uri: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1100046&flag=dissertation
dc.identifier.uri: http://hdl.handle.net/10203/322146
dc.description: Doctoral thesis - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2024.2, [iv, 68 p.]
dc.description.abstract: Recent advances in deep learning have brought innovation to many fields. However, because the technology has evolved toward using more data and larger models to improve performance, the required computational cost has grown rapidly. Consequently, efficient learning techniques, and data pruning in particular, are becoming increasingly important. Existing data pruning methods nevertheless suffer from two key issues: they require training on the entire dataset before data can be selected, and their performance varies with the data selection ratio. This research proposes methodologies that address both issues. For the first, we propose the 'CG-score' (Complexity Gap score), which captures data characteristics without any training, and show that data selection based on this score performs comparably to existing methods. Using the Neural Tangent Kernel, which mathematically approximates the learning process without directly training a deep network, we quantify the characteristics of the data from the training data alone. For the second, we propose 'BWS' (Best Window Selection), which sorts the data by a difficulty score and adjusts the selected window according to the selection ratio. We theoretically verify that shifting the selection window with the ratio enables optimal data selection, and empirically confirm that this approach outperforms existing methods across all selection ratios. (An illustrative sketch of the window-selection idea appears after the metadata fields below.)
dc.language: eng
dc.publisher: Korea Advanced Institute of Science and Technology (KAIST)
dc.subject: 딥러닝; 효율적 학습기법; 데이터 프루닝; 데이터 선별; 뉴럴 탄젠트 커널
dc.subject: Deep learning; Efficient learning; Data pruning; Data subset selection; Neural tangent kernel
dc.title: Development and application of efficient data pruning techniques in deep learning
dc.title.alternative: 딥러닝을 위한 효과적인 데이터 프루닝 기법의 개발과 적용
dc.type: Thesis (Ph.D.)
dc.identifier.CNRN: 325007
dc.description.department: Korea Advanced Institute of Science and Technology: School of Electrical Engineering
dc.contributor.alternativeauthor: Chung, Hye Won
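
The BWS method described in the abstract sorts samples by a difficulty score and shifts the selected window according to the selection ratio. The following is a minimal Python sketch of that idea under stated assumptions: the function name, the linear rule for placing the window, and the random stand-in difficulty scores are all hypothetical and do not reproduce the thesis implementation, which selects the best window by its own criterion.

    import numpy as np

    def select_window(difficulty_scores, selection_ratio):
        """Keep a contiguous window of samples after sorting by difficulty.

        The window size is fixed by the selection ratio; its starting position
        shifts with the ratio, so small ratios stay near the easy end of the
        sorted data and large ratios extend toward harder samples. The linear
        placement rule below is purely illustrative, not the BWS criterion.
        """
        n = len(difficulty_scores)
        k = max(1, int(round(selection_ratio * n)))    # number of samples to keep
        order = np.argsort(difficulty_scores)          # indices sorted easy -> hard
        start = int(round(selection_ratio * (n - k)))  # hypothetical placement rule
        return order[start:start + k]

    # Toy usage: keep 40% of a 10-sample dataset using random stand-in scores.
    scores = np.random.rand(10)
    subset_indices = select_window(scores, selection_ratio=0.4)
    print(subset_indices)

In practice one would evaluate each candidate window position (for example, with a cheap proxy) and keep the best one, as the name "best window selection" suggests; the sketch fixes the position with a single heuristic only to keep the example short.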
Appears in Collection: EE-Theses_Ph.D. (doctoral theses)
Files in This Item: There are no files associated with this item.
