Generalization of deep neural networks via discovering flatter loss surfaces

Achieving generalization is one of the core problems in DNNs (Deep Neural Networks). DNNs have an extremely large number of parameters, resulting in high model complexity. Any well-conditioned training problem can therefore be fit with a DNN, but the high model complexity makes the solution underdetermined: a DNN admits too many solutions for the target training problem. To reduce the solution space of this underdetermined system, numerous regularization concepts have been proposed. In this work, the flat minima theory is adopted as a constraint on the optimization problem. The concept of flat minima was first described in [19, 18]. In this thesis, we give more concrete theoretical explanations of why flat minima work better. A classical viewpoint of generalization is robustness of the output with respect to input perturbations. We analyze the flatness of loss surfaces through the lens of robustness to input perturbations and advocate that gradient descent should be guided toward flatter regions of the loss surface to achieve generalization. In doing so, we show the relation between learning rate and generalization. Furthermore, we develop a method that can discover flatter minima to improve the optimization of DNNs. Whereas optimizing deep neural networks with stochastic gradient descent has shown great performance in practice, the rule for setting the step size (i.e., learning rate) of gradient descent is not well studied. Although some intriguing learning rate rules such as ADAM [26] have since been developed, they concentrate on improving convergence, not on improving generalization. Recently, the improved generalization property of flat minima was revisited, and this line of research guides us toward promising solutions to many current optimization problems. We suggest a learning rate rule for escaping sharp regions of loss surfaces and propose a learning rate scheduling concept called the peak learning stage. Based on the peak learning stage, we propose an adaptive, per-parameter version of learning rate scheduling called Adapeak. Finally, we demonstrate the capability of our approach through numerous experiments. To verify our theories experimentally, we perform extensive perturbation analyses on both the input space and the weight space. DNNs are extremely high-dimensional models, so it is hard to observe the flatness of their weight space directly. Therefore, we evaluate subspaces of the high-dimensional loss surface and propose effective methods for selecting such subspaces to estimate the generalization capability of a DNN model.
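
The abstract describes estimating flatness by perturbing weights within selected subspaces of the loss surface. Below is a minimal sketch of one common way to do this, not the thesis's exact procedure: sample random unit directions in weight space and measure the average loss increase at a fixed perturbation radius, with smaller increases indicating a flatter minimum. The tiny logistic-regression model, the synthetic data, and the names `flatness`, `radius`, and `n_dirs` are all illustrative assumptions.

```python
# Sketch: estimate flatness of a minimum via random-direction weight perturbations.
# Assumption: a toy logistic-regression loss stands in for a DNN's training loss.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary classification data.
X = rng.normal(size=(200, 10))
true_w = rng.normal(size=10)
y = (X @ true_w + 0.1 * rng.normal(size=200) > 0).astype(float)

def loss(w):
    """Mean cross-entropy loss of weights w on (X, y)."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    eps = 1e-12
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def grad(w):
    """Gradient of the mean cross-entropy loss."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return X.T @ (p - y) / len(y)

# Plain gradient descent to reach a minimum.
w = np.zeros(10)
for _ in range(2000):
    w -= 0.5 * grad(w)

def flatness(w, radius=0.5, n_dirs=100):
    """Average loss increase over random unit directions at a fixed radius.
    Smaller values indicate a flatter region of the loss surface."""
    base = loss(w)
    increases = []
    for _ in range(n_dirs):
        d = rng.normal(size=w.shape)
        d /= np.linalg.norm(d)  # each direction spans a random 1-D subspace
        increases.append(loss(w + radius * d) - base)
    return float(np.mean(increases))

print(f"training loss at minimum : {loss(w):.4f}")
print(f"flatness (avg loss rise) : {flatness(w):.4f}")
```

In practice the same measurement would be applied to a trained DNN's loss, and the choice and normalization of the sampled subspace directions matters; the thesis's own subspace-selection methods are not reproduced here.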
Advisors
Kim, Junmo (김준모)
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2019
Identifier
325007
Language
eng
Description

Doctoral dissertation (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2019.2, [v, 60 p.]

Keywords

Deep learning; learning rate; generalization; loss surfaces

URI
http://hdl.handle.net/10203/265163
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=842392&flag=dissertation
Appears in Collection
EE-Theses_Ph.D. (Doctoral Theses)
Files in This Item
There are no files associated with this item.
