Learning Rate Decay
deeplearning
Scale of loss landscape changes
Reduce step size near optima
Factor: $\alpha_{i+1} = d \cdot \alpha_i$
Cosine Learning Rate Decay …
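The two schedules named above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `factor_decay` and `cosine_decay` are hypothetical, and because the source does not spell out the cosine rule, the sketch assumes the commonly used form $\alpha_t = \alpha_{\min} + \tfrac{1}{2}(\alpha_{\max} - \alpha_{\min})(1 + \cos(\pi t / T))$.

```python
import math


def factor_decay(lr0: float, d: float, steps: int) -> list[float]:
    """Return [alpha_0, ..., alpha_steps] using alpha_{i+1} = d * alpha_i."""
    lrs = [lr0]
    for _ in range(steps):
        # Multiply the previous learning rate by the decay factor d.
        lrs.append(d * lrs[-1])
    return lrs


def cosine_decay(lr_max: float, lr_min: float, total_steps: int) -> list[float]:
    """Return a cosine schedule from lr_max (t = 0) down to lr_min (t = total_steps).

    Assumed form (not given in the source):
        lr_min + 0.5 * (lr_max - lr_min) * (1 + cos(pi * t / total_steps))
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    return [
        lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * t / total_steps))
        for t in range(total_steps + 1)
    ]


if __name__ == "__main__":
    # Example values are arbitrary and chosen only for illustration.
    print(factor_decay(0.1, 0.5, 3))
    print(cosine_decay(0.1, 0.0, 4))
```

With $0 < d < 1$ the factor rule shrinks the step size geometrically, while the cosine rule decays slowly at first, fastest around the midpoint, and slowly again as it approaches the minimum rate.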