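Position-wise Feed-Forward

$\mathrm{FFN}(x) = \max(0,\; xW_1 + b_1)\,W_2 + b_2$

The dense layers are applied along the last dimension (size 512), so every position is transformed independently with the same weights.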
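The formula can be sketched in NumPy to show what "applied along the last dimension" means in practice. This is a minimal illustration, not a reference implementation: the function name `position_wise_ffn`, the hidden width `d_ff = 2048`, the random initialization, and the example input shape are all assumptions made for the demo. Only the model width of 512 and the formula itself come from the text.

```python
import numpy as np

def position_wise_ffn(x, W1, b1, W2, b2):
    """Compute max(0, x W1 + b1) W2 + b2 along the last axis of x."""
    hidden = np.maximum(0.0, x @ W1 + b1)  # first dense layer followed by ReLU
    return hidden @ W2 + b2                # second dense layer, back to d_model

d_model = 512  # size of the last dimension, as stated in the text
d_ff = 2048    # hidden width: an assumption, not given in the text

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.02, size=(d_model, d_ff))  # illustrative random weights
b1 = np.zeros(d_ff)
W2 = rng.normal(scale=0.02, size=(d_ff, d_model))
b2 = np.zeros(d_model)

# Hypothetical input: 2 sequences, 10 positions each, 512 features per position.
x = rng.normal(size=(2, 10, d_model))
y = position_wise_ffn(x, W1, b1, W2, b2)
print(y.shape)
```

Because the matrix products only contract the last axis, passing a single position such as `x[0, 3]` through the same function gives the same values as the corresponding slice `y[0, 3]` of the full output.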