TRAINABILITY OF ReLU NETWORKS AND DATA-DEPENDENT INITIALIZATION

DC Field: Value (Language)
dc.contributor.author: Shin, Yeonjong (ko)
dc.contributor.author: Karniadakis, George Em (ko)
dc.date.accessioned: 2022-07-06T02:00:35Z
dc.date.available: 2022-07-06T02:00:35Z
dc.date.created: 2022-07-06
dc.date.issued: 2020
dc.identifier.citation: Journal of Machine Learning for Modeling and Computing, v.1, no.1, pp.39-74
dc.identifier.issn: 2689-3967
dc.identifier.uri: http://hdl.handle.net/10203/297251
dc.description.abstract: In this paper we study the trainability of rectified linear unit (ReLU) networks at initialization. A ReLU neuron is said to be dead if it only outputs a constant for any input. Two death states of neurons are introduced: tentative and permanent death. A network is then said to be trainable if the number of permanently dead neurons is sufficiently small for a learning task. We refer to the probability of a randomly initialized network being trainable as trainability. We show that a network being trainable is a necessary condition for successful training, and the trainability serves as an upper bound on training success rates. In order to quantify the trainability, we study the probability distribution of the number of active neurons at initialization. In many applications, overspecified or overparameterized neural networks are successfully employed and shown to be trained effectively. With the notion of trainability, we show that overparameterization is both a necessary and a sufficient condition for achieving a zero training loss. Furthermore, we propose a data-dependent initialization method in an overparameterized setting. Numerical examples are provided to demonstrate the effectiveness of the method and our theoretical findings.
dc.language: English
dc.publisher: BEGELL HOUSE Inc.
dc.title: TRAINABILITY OF ReLU NETWORKS AND DATA-DEPENDENT INITIALIZATION
dc.type: Article
dc.type.rims: ART
dc.citation.volume: 1
dc.citation.issue: 1
dc.citation.beginningpage: 39
dc.citation.endingpage: 74
dc.citation.publicationname: Journal of Machine Learning for Modeling and Computing
dc.identifier.doi: 10.1615/.2020034126
dc.contributor.localauthor: Shin, Yeonjong
dc.contributor.nonIdAuthor: Karniadakis, George Em
dc.description.isOpenAccess: N
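
The abstract above defines a ReLU neuron as dead if it outputs a constant for every input, and studies the number of active neurons at random initialization. As an illustration only (this is not the authors' code, and all names and sizes below are hypothetical), a minimal NumPy sketch of counting neurons in one hidden ReLU layer that never activate on any training input:

```python
# Hypothetical sketch: count "dead" ReLU neurons at random initialization.
# A hidden neuron is treated as dead here if its ReLU output is zero for
# every training input, i.e., it is constant over the training set.
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hidden, n_samples = 2, 100, 500
X = rng.uniform(-1.0, 1.0, size=(n_samples, n_in))  # training inputs

# He-style random initialization with zero biases (a common choice).
W = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_hidden))
b = np.zeros(n_hidden)

pre = X @ W + b                 # pre-activations, shape (n_samples, n_hidden)
active = (pre > 0).any(axis=0)  # neuron fires on at least one training input
n_dead = int((~active).sum())

print(f"dead neurons at initialization: {n_dead} / {n_hidden}")
```

A "tentatively" vs. "permanently" dead distinction, as introduced in the paper, would additionally depend on how training can revive a neuron; the check above only captures inactivity over the given data at initialization.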
Appears in Collection
MA-Journal Papers (Journal Papers)
Files in This Item
There are no files associated with this item.
