The paper examines the limitations of classical gradient descent analysis in deep learning and offers a new account of its dynamics. Conventional analysis predicts stable training only when the sharpness of the loss landscape (the largest eigenvalue of the Hessian) stays below 2/η, where η is the step size. The paper highlights the phenomenon of training at the edge of stability: sharpness rises to this threshold and hovers there, and the loss oscillates over short timescales yet still decreases consistently over longer ones, challenging conventional optimization theory.
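To make the 2/η threshold concrete, here is a minimal NumPy sketch (an illustrative example, not code from the paper): plain gradient descent on a one-dimensional quadratic f(x) = (a/2)x², whose sharpness is exactly a. With step size η = 0.1, the classical threshold 2/η = 20 separates convergence from divergence.

```python
import numpy as np

def run_gd(sharpness, eta=0.1, steps=20, x0=1.0):
    """Gradient descent on f(x) = (sharpness/2) * x^2.

    The Hessian of f is the constant `sharpness`, so classical
    analysis predicts convergence only when sharpness < 2/eta.
    """
    x = x0
    for _ in range(steps):
        x -= eta * sharpness * x  # gradient of (a/2) x^2 is a*x
    return x

eta = 0.1
for a in [1.0, 19.0, 21.0]:  # well below, just below, and above 2/eta = 20
    print(f"sharpness={a:5.1f}  final |x| = {abs(run_gd(a, eta)):.3e}")
```

Just below the threshold (a = 19) the iterates oscillate in sign yet shrink; just above it (a = 21) they oscillate and diverge, which is the instability that conventional theory would predict at the edge of stability.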
Tags: gradient-descent, deep-learning, optimization, stability, sharpness