Dying ReLU and Initialization: Theory and Numerical Examples

Cited 107 times in Web of Science; cited 0 times in Scopus
DC Field | Value | Language
dc.contributor.author | Lu, Lu | ko
dc.contributor.author | Shin, Yeonjong | ko
dc.contributor.author | Su, Yanhui | ko
dc.contributor.author | Karniadakis, George Em | ko
dc.date.accessioned | 2022-07-06T02:00:31Z | -
dc.date.available | 2022-07-06T02:00:31Z | -
dc.date.created | 2022-07-06 | -
dc.date.issued | 2020-11 | -
dc.identifier.citation | COMMUNICATIONS IN COMPUTATIONAL PHYSICS, v.28, no.5, pp.1671-1706 | -
dc.identifier.issn | 1815-2406 | -
dc.identifier.uri | http://hdl.handle.net/10203/297250 | -
dc.description.abstract | The dying ReLU refers to the problem of ReLU neurons becoming inactive and outputting 0 for every input. There are many empirical and heuristic explanations of why ReLU neurons die, but little is known theoretically. In this paper, we rigorously prove that a deep ReLU network will eventually die in probability as the depth goes to infinity. Several methods have been proposed to alleviate the dying ReLU; perhaps one of the simplest treatments is to modify the initialization procedure. A common way of initializing weights and biases uses symmetric probability distributions, which suffers from the dying ReLU. We thus propose a new initialization procedure, namely, a randomized asymmetric initialization. We show that the new initialization can effectively prevent the dying ReLU. All parameters required for the new initialization are theoretically designed. Numerical examples are provided to demonstrate the effectiveness of the new initialization procedure. (An illustrative sketch of the born-dead phenomenon appears after the table below.) | -
dc.language | English | -
dc.publisher | GLOBAL SCIENCE PRESS | -
dc.title | Dying ReLU and Initialization: Theory and Numerical Examples | -
dc.type | Article | -
dc.identifier.wosid | 000592624200003 | -
dc.identifier.scopusid | 2-s2.0-85097467094 | -
dc.type.rims | ART | -
dc.citation.volume | 28 | -
dc.citation.issue | 5 | -
dc.citation.beginningpage | 1671 | -
dc.citation.endingpage | 1706 | -
dc.citation.publicationname | COMMUNICATIONS IN COMPUTATIONAL PHYSICS | -
dc.identifier.doi | 10.4208/cicp.OA-2020-0165 | -
dc.contributor.localauthor | Shin, Yeonjong | -
dc.contributor.nonIdAuthor | Lu, Lu | -
dc.contributor.nonIdAuthor | Su, Yanhui | -
dc.contributor.nonIdAuthor | Karniadakis, George Em | -
dc.description.isOpenAccess | N | -
dc.type.journalArticle | Article | -
dc.subject.keywordAuthor | Neural network | -
dc.subject.keywordAuthor | Dying ReLU | -
dc.subject.keywordAuthor | Vanishing/Exploding gradient | -
dc.subject.keywordAuthor | Randomized asymmetric initialization | -
dc.subject.keywordPlus | DEEP NEURAL-NETWORKS | -
dc.subject.keywordPlus | ERROR | -
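The abstract states that a deep ReLU network initialized from symmetric distributions dies in probability as the depth grows. The following is a minimal, hypothetical sketch (not taken from the paper) of how one might observe this "born dead" behaviour numerically: it draws narrow ReLU networks with symmetric He-normal weights and zero biases at several depths, and estimates how often the final hidden layer is identically zero on a set of probe inputs. The widths, input range, trial counts, and the zero-bias choice are illustrative assumptions, not the paper's experimental setup or its proposed initialization.

```python
import numpy as np

def born_dead(depth, width=2, n_inputs=200, rng=None):
    """Return True if a freshly initialized ReLU net's last hidden layer
    is identically zero on all probe inputs (a 'born dead' network)."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.uniform(-1.0, 1.0, size=(n_inputs, 1))   # scalar probe inputs on [-1, 1]
    h, fan_in = x, 1
    for _ in range(depth):
        # Symmetric He-normal weights, zero biases (a common default setup).
        W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, width))
        b = np.zeros(width)
        h = np.maximum(h @ W + b, 0.0)                # ReLU layer
        fan_in = width
    return np.allclose(h, 0.0)

rng = np.random.default_rng(0)
trials = 200
for depth in (2, 5, 10, 20, 40):
    dead = sum(born_dead(depth, rng=rng) for _ in range(trials))
    print(f"depth {depth:3d}: born-dead fraction ~ {dead / trials:.2f}")
```

For fixed small width, the estimated born-dead fraction in this sketch tends to grow with depth, which is consistent with the depth-to-infinity result stated in the abstract and is the behaviour the proposed randomized asymmetric initialization is designed to prevent.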
Appears in Collection
MA-Journal Papers (Journal Papers)
Files in This Item
There are no files associated with this item.