DC Field | Value | Language |
---|---|---|
dc.contributor.author | Song, Minhak | ko |
dc.contributor.author | Yun, Chulhee | ko |
dc.date.accessioned | 2024-02-04T15:00:21Z | - |
dc.date.available | 2024-02-04T15:00:21Z | - |
dc.date.created | 2024-02-04 | - |
dc.date.issued | 2023-12-13 | - |
dc.identifier.citation | 37th Annual Conference on Neural Information Processing Systems | - |
dc.identifier.uri | http://hdl.handle.net/10203/317998 | - |
dc.description.abstract | Cohen et al. (2021) empirically study the evolution of the largest eigenvalue of the loss Hessian, also known as sharpness, along the gradient descent (GD) trajectory and observe a phenomenon called the Edge of Stability (EoS). The sharpness increases at the early phase of training (referred to as progressive sharpening), and eventually saturates close to the threshold of 2/(step size). In this paper, we start by demonstrating through empirical studies that when the EoS phenomenon occurs, different GD trajectories (after a proper reparameterization) align on a specific bifurcation diagram independent of initialization. We then rigorously prove this trajectory alignment phenomenon for a two-layer fully-connected linear network and a single-neuron nonlinear network trained with a single data point. Our trajectory alignment analysis establishes both progressive sharpening and EoS phenomena, encompassing and extending recent findings in the literature. | - |
dc.language | English | - |
dc.publisher | Neural Information Processing Systems | - |
dc.title | Trajectory Alignment: Understanding the Edge of Stability Phenomenon via Bifurcation Theory | - |
dc.type | Conference | - |
dc.type.rims | CONF | - |
dc.citation.publicationname | 37th Annual Conference on Neural Information Processing Systems | - |
dc.identifier.conferencecountry | US | - |
dc.identifier.conferencelocation | New Orleans, LA | - |
dc.contributor.localauthor | Yun, Chulhee | - |
dc.contributor.nonIdAuthor | Song, Minhak | - |