He Initialization

  • He initialization scales the randomly drawn weights so as to bring the variance of each layer's outputs to approximately one
  • However, Kumar (2017) proves mathematically that for the ReLU activation function, the best weight initialization strategy is to initialize the weights randomly with the following variance (a short code sketch follows this list):
    • \begin{equation} \sigma^{2} = 2/N \end{equation}
    • where N is the number of inputs to the layer
  • For sigmoid-based activation functions, Kumar derives a larger optimal variance:
    • \begin{equation} \sigma^{2} \approx 12.8/N \end{equation}
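
A minimal sketch of the ReLU rule above, assuming NumPy and a fully connected layer; the function name `he_init` and the layer sizes are illustrative, not from the source:

```python
import numpy as np

def he_init(n_in, n_out, seed=None):
    """Draw an (n_in, n_out) weight matrix from N(0, 2/n_in),
    the He/Kumar variance for ReLU layers."""
    rng = np.random.default_rng(seed)
    std = np.sqrt(2.0 / n_in)  # sigma^2 = 2/N, with N = number of inputs
    # For sigmoid layers, Kumar's analysis would substitute ~12.8/N here.
    return rng.normal(0.0, std, size=(n_in, n_out))

# Illustrative usage: weights for a 784 -> 256 fully connected layer.
W = he_init(784, 256, seed=0)
print(W.var())  # close to 2/784 ≈ 0.00255
```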