Gradient Descent

- Backprop
- Gradient direction and gradient magnitude (edge strength): $\|\nabla f\| = \sqrt{\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2}$
- Parameters $\theta$; minimize the loss function $L(\theta) = \sum_{n=1}^{N} l_n(\theta)$
- Variants (a minimal update-rule sketch follows below): simple gradient descent, SGD, mini-batch GD, SGD with momentum, Adagrad, Nesterov momentum, AdaDelta, RMSprop, Adam
- Implicit regularization
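As a point of reference for the update rules listed above, here is a minimal sketch comparing full-batch gradient descent with mini-batch SGD plus momentum on a loss of the form $L(\theta) = \sum_{n=1}^{N} l_n(\theta)$. The synthetic least-squares data, the learning rate `lr`, the batch size, and the momentum coefficient `beta` are illustrative assumptions, not values from the notes.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 200, 5
X = rng.normal(size=(N, d))               # synthetic inputs (assumed example)
theta_true = rng.normal(size=d)
y = X @ theta_true + 0.1 * rng.normal(size=N)

def grad_full(theta):
    # Gradient of the full loss L(theta) = (1/N) * sum_n (x_n . theta - y_n)^2
    return 2.0 / N * X.T @ (X @ theta - y)

def grad_minibatch(theta, idx):
    # Stochastic gradient estimate from a mini-batch of indices idx
    Xb, yb = X[idx], y[idx]
    return 2.0 / len(idx) * Xb.T @ (Xb @ theta - yb)

lr = 0.1

# Simple (full-batch) gradient descent: theta <- theta - lr * grad L(theta)
theta = np.zeros(d)
for _ in range(500):
    theta -= lr * grad_full(theta)

# Mini-batch SGD with heavy-ball momentum:
#   v <- beta * v + grad;  theta <- theta - lr * v
theta_m, v = np.zeros(d), np.zeros(d)
beta = 0.9
for _ in range(500):
    idx = rng.choice(N, size=32, replace=False)
    v = beta * v + grad_minibatch(theta_m, idx)
    theta_m -= lr * v

print("GD parameter error:           ", np.linalg.norm(theta - theta_true))
print("SGD+momentum parameter error: ", np.linalg.norm(theta_m - theta_true))
```

The other variants in the list (Adagrad, Nesterov momentum, AdaDelta, RMSprop, Adam) differ only in how they scale or accumulate the gradient before the parameter update, so they would replace the momentum lines in the loop above.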