SGD on Neural Networks Learns Functions of Increasing Complexity
Preetum Nakkiran, Gal Kaplun, Dimitris Kalimeris, Tristan Yang,
Benjamin L. Edelman, Fred Zhang, Boaz Barak
Harvard University
Abstract
We perform an experimental study of the dynamics of Stochastic Gradient Descent ...