Layer Normalization

  • Intended for RNNs and similar models.
  • Mean and variance are calculated independently for each element of the batch by aggregating over the feature dimensions.
  • (Compare with Batch Normalization, which aggregates over the batch dimension for each feature.)
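
A minimal sketch of the difference in aggregation axis, in plain Python. The function names (`normalize`, `layer_norm`, `batch_norm`), the `eps` value, and the toy batch are assumptions for illustration; the learnable gain and bias that usually follow the normalization are left out.

```python
import math


def normalize(values, eps=1e-5):
    """Shift and scale a list of numbers to roughly zero mean, unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def layer_norm(batch, eps=1e-5):
    # One mean/variance per sample (row), aggregated over its features.
    return [normalize(row, eps) for row in batch]


def batch_norm(batch, eps=1e-5):
    # One mean/variance per feature (column), aggregated over the batch.
    columns = [normalize(list(col), eps) for col in zip(*batch)]
    return [list(row) for row in zip(*columns)]


# Toy batch (hypothetical): 2 samples, 4 features; the second is 10x the first.
batch = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]]
ln = layer_norm(batch)
bn = batch_norm(batch)
```

In this toy batch, `layer_norm` gives both rows nearly the same output because each sample is normalized by its own statistics, so it does not depend on the rest of the batch. `batch_norm` instead compares the two samples feature by feature, so its result changes with the batch contents.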

Problem

  • From [Visualizing the Loss Landscape of Neural Nets](Visualizing the Loss Landscape of Neural Nets.md):
  • ![images/Pasted%20image%20 20230327130254.png](images/Pasted%20image%20 20230327130254.png)