DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 최재식 | - |
dc.contributor.author | Paik, Inyoung | - |
dc.contributor.author | 백인영 | - |
dc.date.accessioned | 2024-07-25T19:30:48Z | - |
dc.date.available | 2024-07-25T19:30:48Z | - |
dc.date.issued | 2023 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1045739&flag=dissertation | en_US |
dc.identifier.uri | http://hdl.handle.net/10203/320551 | - |
dc.description | Master's thesis - 한국과학기술원 : 김재철AI대학원, 2023.8, [iv, 26 p.] | - |
dc.description.abstract | Deep neural networks that employ batch normalization and ReLU-like activation functions suffer from instability in the early stage of training due to the large gradients induced by a temporary gradient explosion. In this study, we analyze the occurrence and mitigation of the gradient explosion both theoretically and empirically, and discover that the correlation between activations plays a key role in preventing the gradient explosion from persisting throughout training. Finally, based on our observations, we propose an improved adaptive learning rate algorithm to effectively control the training instability. (An illustrative sketch of the phenomenon follows the table below.) | - |
dc.language | eng | - |
dc.publisher | 한국과학기술원 | - |
dc.subject | 심층학습; 기울기 폭발; 학습 불안정성; WarmUp; LARS | - |
dc.subject | Deep learning; Gradient explosion; Training instability; WarmUp; LARS | - |
dc.title | (The) disharmony between batch normalization and ReLU causes the gradient explosion, but is offset by the correlation between activations | - |
dc.title.alternative | 배치 정규화와 정류 선형 유닛 간의 부조화로 인한 기울기 폭발과 입력 신호 간의 상관관계로 인한 상쇄 | - |
dc.type | Thesis (Master) | - |
dc.identifier.CNRN | 325007 | - |
dc.description.department | 한국과학기술원 : 김재철AI대학원 | - |
dc.contributor.alternativeauthor | Choi, Jaesik | - |
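
The abstract above describes a temporary gradient explosion at initialization in networks that combine batch normalization with ReLU-like activations. As a rough illustration only (this is not the thesis's code; the depth, width, loss, and input data here are arbitrary assumptions), a minimal PyTorch sketch can show how per-layer gradient norms behave in a freshly initialized BN + ReLU MLP:

```python
# Illustrative sketch (not the thesis's experimental setup): inspect
# per-layer gradient norms at initialization in a deep BN + ReLU MLP.
# depth, width, batch size, and the scalar loss are arbitrary choices.
import torch
import torch.nn as nn

depth, width, batch = 20, 256, 64

# Stack of Linear -> BatchNorm1d -> ReLU blocks.
layers = []
for _ in range(depth):
    layers += [nn.Linear(width, width), nn.BatchNorm1d(width), nn.ReLU()]
net = nn.Sequential(*layers)

x = torch.randn(batch, width)      # random input, for illustration only
loss = net(x).pow(2).mean()        # arbitrary scalar loss
loss.backward()

# Print the weight-gradient norm of each Linear layer, input to output.
for i, m in enumerate(net):
    if isinstance(m, nn.Linear):
        print(f"layer {i:3d}: ||grad|| = {m.weight.grad.norm().item():.3e}")
```

In a sketch like this, gradient norms typically grow toward the earlier layers at initialization, consistent with the abstract's claim that the instability is concentrated in the early stage of training and with the use of remedies such as WarmUp or layer-wise adaptive learning rates (e.g., LARS) listed among the subject keywords.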