Adam
- Widely used first-order optimizer in supervised deep learning
- Combines RMSProp + momentum
- Corrects bias in exponentially weighted averages
- Can struggle with a very large number of parameters → over-smooths the gradient
- \begin{align} & s_n = \rho_1 s_{n-1} + (1-\rho_1) g_n \\ & r_n = \rho_2 r_{n-1} + (1-\rho_2) g_n \odot g_n \\ & \hat{s}_n = \frac{s_n}{1-\rho_1^n}, \quad \hat{r}_n = \frac{r_n}{1-\rho_2^n} \\ & \Theta_{n+1} = \Theta_n - \alpha \frac{\hat{s}_n}{\epsilon + \sqrt{\hat{r}_n}} \end{align}
- $s_n$ and $r_n$ are exponentially weighted estimates of the first and second moments of the gradient; dividing by $1-\rho_1^n$ and $1-\rho_2^n$ corrects their bias toward zero in early steps
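The update above can be sketched as a single NumPy step. This is a minimal illustration, not a production implementation; the function name `adam_step` and the default hyperparameters (standard choices $\alpha=10^{-3}$, $\rho_1=0.9$, $\rho_2=0.999$, $\epsilon=10^{-8}$) are assumptions for the sketch.

```python
import numpy as np

def adam_step(theta, grad, s, r, n, alpha=1e-3, rho1=0.9, rho2=0.999, eps=1e-8):
    """One Adam update; n is the 1-based step count used for bias correction."""
    # Exponentially weighted first and second moment estimates
    s = rho1 * s + (1 - rho1) * grad
    r = rho2 * r + (1 - rho2) * grad * grad
    # Bias correction: both moments start at 0, so early averages are scaled up
    s_hat = s / (1 - rho1 ** n)
    r_hat = r / (1 - rho2 ** n)
    # Parameter update with per-coordinate adaptive step size
    theta = theta - alpha * s_hat / (eps + np.sqrt(r_hat))
    return theta, s, r

# Toy usage: minimize f(x) = x^2, whose gradient is 2x
theta = np.array([1.0])
s = np.zeros_like(theta)
r = np.zeros_like(theta)
for n in range(1, 1001):
    grad = 2 * theta
    theta, s, r = adam_step(theta, grad, s, r, n)
```

Note that because $\hat{s}_n / \sqrt{\hat{r}_n}$ is roughly the sign of the gradient when gradients are consistent, the effective step size is close to $\alpha$ per step regardless of gradient magnitude.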