Subhaditya's KB

Learning Rate Decay

Sep 18, 2024 · 1 min read

temp
deeplearning


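The scale of the loss landscape changes as training progresses
Reduce the step size near optima so updates do not overshoot
Multiplicative factor: $\alpha_{i + 1} = d \cdot \alpha_{i}$ with $0 < d < 1$, so $\alpha_{i} = d^{i} \cdot \alpha_{0}$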
Cosine Learning Rate Decay
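
A minimal sketch of the two schedules named above, in plain Python. The multiplicative rule follows the formula in this note. The cosine form is an assumption, since this note only links to it: it is the commonly used half-cosine anneal from $\alpha_0$ down to a floor. The names `exponential_decay`, `cosine_decay`, `total_steps`, and `alpha_min` are hypothetical and chosen for illustration only.

```python
import math


def exponential_decay(alpha0: float, d: float, steps: int) -> list[float]:
    """Return [alpha_0, ..., alpha_{steps-1}] where alpha_{i+1} = d * alpha_i."""
    rates = []
    alpha = alpha0
    for _ in range(steps):
        rates.append(alpha)
        alpha *= d  # apply the decay factor once per step
    return rates


def cosine_decay(alpha0: float, step: int, total_steps: int,
                 alpha_min: float = 0.0) -> float:
    """Assumed half-cosine anneal from alpha0 (step 0) to alpha_min (total_steps)."""
    # Clamp so steps past the end of the schedule stay at alpha_min.
    progress = min(step, total_steps) / total_steps
    return alpha_min + 0.5 * (alpha0 - alpha_min) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(exponential_decay(alpha0=0.1, d=0.5, steps=4))
    print([cosine_decay(0.1, s, total_steps=4) for s in range(5)])
```

With $d$ close to 1 (for example 0.99) the multiplicative schedule shrinks slowly. The cosine schedule instead stays near $\alpha_0$ early on and drops fastest in the middle of training.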


Backlinks

Learning Rate Scheduling
No bias decay
Optimization
_Index_of_KB
