On the Angular Update and Hyperparameter Tuning of a Scale-Invariant Network

Abstract
Modern deep neural networks are equipped with normalization layers, such as batch normalization or layer normalization, to enhance and stabilize training dynamics. If a network contains such normalization layers, the optimization objective is invariant to the scale of the network parameters, so the network's output depends only on the direction of the weights, not their scale. We first examine good hyperparameter combinations (learning rate, weight decay, number of data samples, and batch size) on such a scale-invariant network and observe that the setups leading to good performance share a common feature: they exhibit similar degrees of angular update during one epoch. Using a stochastic differential equation, we analyze the angular update and show how each hyperparameter affects it. From this relationship, we derive a simple hyperparameter tuning method and apply it to efficient hyperparameter search.
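
The central quantity in the abstract, the angular update, is simply the angle between consecutive weight vectors during training. The sketch below is not the paper's code; it uses a hypothetical scale-invariant toy objective (a loss that depends only on w/||w||, mimicking a layer followed by normalization) and tracks the per-step angular update under SGD with weight decay. The learning rate, weight decay, and noise scale are illustrative assumptions.

```python
# Minimal sketch (assumptions, not the paper's code): measure the per-step
# "angular update" of a scale-invariant parameter trained with SGD + weight decay.
import numpy as np

rng = np.random.default_rng(0)
d = 256
target = rng.normal(size=d)
target /= np.linalg.norm(target)

def grad_scale_invariant(w):
    """Gradient of L(w) = -<target, w/||w||>, a loss invariant to the scale of w."""
    n = np.linalg.norm(w)
    u = w / n
    # The gradient lies in the tangent space of the sphere and shrinks as 1/||w||.
    return -(target - np.dot(target, u) * u) / n

def angular_update(w_old, w_new):
    """Angle (in radians) between consecutive weight vectors."""
    c = np.dot(w_old, w_new) / (np.linalg.norm(w_old) * np.linalg.norm(w_new))
    return np.arccos(np.clip(c, -1.0, 1.0))

lr, wd = 0.1, 1e-2                 # hypothetical learning rate and weight decay
w = rng.normal(size=d)
angles = []
for step in range(2000):
    noise = 0.5 * rng.normal(size=d)          # crude stand-in for minibatch gradient noise
    g = grad_scale_invariant(w) + noise / np.linalg.norm(w)
    w_new = w - lr * (g + wd * w)             # SGD step with weight decay
    angles.append(angular_update(w, w_new))
    w = w_new

print(f"mean per-step angular update over last 500 steps: {np.mean(angles[-500:]):.4f} rad")
```

Rerunning this toy loop with different (lr, wd) pairs shows how the weight norm and the typical angular update settle to different levels, which is the kind of dependence the paper analyzes via a stochastic differential equation.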
Publisher
Springer Verlag
Issue Date
2022-10
Language
English
Citation

European Conference on Computer Vision, ECCV 2022, pp. 121-136

ISSN
0302-9743
DOI
10.1007/978-3-031-19775-8_8
URI
http://hdl.handle.net/10203/300177
Appears in Collection
EE-Conference Papers(학술회의논문)