No Bias Decay / No Learning Rate Decay tricks
- Weight decay is equivalent to Lp regularization (here L2): applying it to all parameters drives their values towards 0.
- Trick: apply the regularization only to the weights; leave the bias terms and Batch Normalization layers alone.
- LARS (Layer-wise Adaptive Rate Scaling)
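The no-bias-decay idea can be sketched as follows. This is a minimal illustration, assuming a plain SGD update and hypothetical parameter names (`fc.weight`, `fc.bias`, `bn.gamma`); in practice a framework's optimizer parameter groups would be used instead.

```python
def sgd_step(params, grads, lr=0.1, weight_decay=1e-4):
    """One plain-SGD step. The L2 penalty (weight decay) is applied only
    to 'weight' tensors; biases and BatchNorm parameters are left alone."""
    updated = {}
    for name, values in params.items():
        # gradient of (wd/2) * v^2 is wd * v, added only for weights
        decay = weight_decay if name.endswith("weight") else 0.0
        updated[name] = [v - lr * (g + decay * v)
                         for v, g in zip(values, grads[name])]
    return updated

# Toy parameters: a weight matrix row, a bias, and a BatchNorm scale (gamma).
params = {"fc.weight": [1.0, 1.0], "fc.bias": [1.0, 1.0], "bn.gamma": [1.0, 1.0]}
grads = {name: [0.0, 0.0] for name in params}  # zero grads isolate the decay term

out = sgd_step(params, grads)
# only fc.weight shrinks toward 0; fc.bias and bn.gamma are unchanged
```

With zero gradients, the update shows the decay term in isolation: the weight entries shrink slightly toward 0 each step, while the bias and BN parameters stay fixed.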