Bayesian weight decay for deep convolutional neural networks : approximation and generalization

DC Field / Value / Language
dc.contributor.advisor: Jo, Sungho
dc.contributor.advisor: 조성호
dc.contributor.author: Park, Jung-Guk
dc.date.accessioned: 2021-05-12T19:40:11Z
dc.date.available: 2021-05-12T19:40:11Z
dc.date.issued: 2020
dc.identifier.uri: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=909372&flag=dissertation (en_US)
dc.identifier.uri: http://hdl.handle.net/10203/284154
dc.description: Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): School of Computing, 2020.2, [iv, 59 p.]
dc.description.abstract: This study determines the weight-decay parameter value of a deep convolutional neural network (CNN) so that the network yields good generalization. Although weight decay is theoretically related to generalization error, choosing its value is known to be challenging. Deep CNNs are widely used in vision applications, and guaranteeing their classification accuracy on unseen data is important. Obtaining such a CNN generally requires numerical trials with different weight-decay values, but the larger the CNN architecture, the higher the computational cost of these trials. To address this problem, this study derives an analytical form for the decay parameter through a proposed objective function in conjunction with Bayesian probability distributions. For computational efficiency, a novel method is suggested that approximates this form using a small amount of information from the Hessian matrix. Under general conditions, the approximate form is guaranteed by a provable bound and is obtained by a proposed algorithm with discretized information, whose time complexity is linear in the depth and width of the CNN. The bound shows that the proposed learning scheme is consistent. The generalization error of a CNN trained by the proposed algorithm is also analyzed with statistical learning theory, and the analysis of computational complexity quantifies the rate of efficiency. By reducing the computational cost of determining the decay value, the approximation allows fast investigation of deep CNNs that yield small generalization error. Experimental results with different deep CNNs show that the underlying assumption is suitable for real-world image datasets.
In addition, the method achieves a remarkable reduction in time complexity while maintaining good classification accuracy when applied to deeper classification networks, more complex training methods, and/or objective functions with high computational cost. An advantage of the proposed method is that it can be applied to any deep classification network trained with a loss function satisfying mild conditions.
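The record does not reproduce the thesis's algorithm, but the abstract's core idea, setting the weight-decay parameter from a small amount of Hessian information under a Bayesian treatment, can be illustrated with a classical MacKay-style evidence-framework update. This is a sketch of that classical method, not the thesis's proposed algorithm: the function name and inputs are illustrative, and the diagonal of the data-term Hessian stands in for the full eigenvalue spectrum.

```python
def update_weight_decay(weights, hessian_diag, alpha, n_iters=50):
    """Iterate the MacKay evidence-framework fixed point for the
    weight-decay (prior precision) parameter alpha.

    weights      -- flat list of network weights
    hessian_diag -- diagonal of the data-term Hessian, one entry per
                    weight, used as a cheap proxy for its eigenvalues
    alpha        -- initial guess for the weight-decay parameter
    """
    w_sq = sum(w * w for w in weights)  # ||w||^2
    for _ in range(n_iters):
        # gamma: effective number of well-determined parameters,
        # each term h/(h + alpha) is close to 1 when the data pins
        # the weight down (h >> alpha) and close to 0 otherwise
        gamma = sum(h / (h + alpha) for h in hessian_diag)
        # re-estimate alpha so the prior "explains" the remaining
        # (poorly determined) weight magnitude
        alpha = gamma / w_sq
    return alpha
```

Because each update needs only the Hessian diagonal, the cost per iteration is linear in the number of weights, which is in the same spirit as the linear-in-depth-and-width complexity the abstract claims for the proposed algorithm.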
dc.language: eng
dc.publisher: Korea Advanced Institute of Science and Technology (KAIST)
dc.subject: Bayesian method; convolutional neural networks; computational complexity; inverse Hessian matrix; regularization; weight decay
dc.subject: 베이지언 기법; 계산 복잡도; 회선 신경망; 역 헤시안 행렬; 학습 규제; 가중치 감소
dc.title: Bayesian weight decay for deep convolutional neural networks
dc.title.alternative: 심층 회선 신경망의 베이지언 가중치 감쇠 : 근사화와 일반화
dc.type: Thesis (Ph.D.)
dc.identifier.CNRN: 325007
dc.description.department: Korea Advanced Institute of Science and Technology (KAIST): School of Computing
dc.contributor.alternativeauthor: 박정국
dc.title.subtitle: approximation and generalization
Appears in Collection
CS-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
