Recipe for constructing loss functions

Using Maximum Likelihood
For training data $x_{i}, y_{i}$
- Choose a probability distribution $Pr(y ∣ θ)$ defined over the domain of the predictions $y$, with distribution parameters $θ$
- Choose an ML model $f[x, ϕ]$ that predicts these parameters, so that $θ = f[x, ϕ]$ and $Pr(y ∣ θ) = Pr(y ∣ f[x, ϕ])$
- Training → Find the parameters $\hat{ϕ}$ that minimize the Negative Log Likelihood over the training data $x_{i}, y_{i}$
- Inference → Either return the full distribution $Pr(y ∣ f[x, \hat{ϕ}])$ or the value of $y$ where this distribution is maximized
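The steps above can be sketched for one concrete case. This is a minimal illustration, not a general implementation: it assumes a Gaussian over $y$ with fixed standard deviation, a linear model for the mean, synthetic data, and plain gradient descent. All function names and hyperparameters here are made up for the example.

```python
import numpy as np

def model(x, phi):
    """f[x, phi]: a linear model that predicts the Gaussian mean (theta)."""
    return phi[0] + phi[1] * x

def negative_log_likelihood(phi, x, y, sigma=1.0):
    """Sum of -log Pr(y_i | f[x_i, phi]) for a Gaussian with fixed sigma."""
    mu = model(x, phi)
    return np.sum(0.5 * np.log(2.0 * np.pi * sigma**2)
                  + (y - mu) ** 2 / (2.0 * sigma**2))

def nll_gradient(phi, x, y, sigma=1.0):
    """Analytic gradient of the NLL with respect to phi."""
    residual = model(x, phi) - y
    return np.array([np.sum(residual), np.sum(residual * x)]) / sigma**2

def fit(x, y, lr=0.1, steps=1000):
    """Training: gradient descent on the mean NLL (assumed optimizer)."""
    phi = np.zeros(2)
    for _ in range(steps):
        phi -= lr * nll_gradient(phi, x, y) / len(x)
    return phi

# Synthetic training data (assumption, for illustration only).
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=200)
y = 1.5 + 0.8 * x + rng.normal(0.0, 0.3, size=200)

phi_hat = fit(x, y)

# Inference: for a Gaussian, the value of y that maximizes the
# predicted distribution is its mean, f[x, phi_hat].
y_pred = model(np.array([0.5]), phi_hat)
```

With a fixed-variance Gaussian, the terms of the NLL that depend on $ϕ$ are the squared residuals, so this fit should agree with ordinary least squares.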
If the data follows a distribution that has no standard loss associated with it, transform the data beforehand so that a distribution with a known loss fits
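One possible reading of this note, sketched under assumptions: if the targets are strictly positive with multiplicative noise, taking the log first makes a Gaussian (and hence a least-squares loss) a reasonable fit. The data and the `predict` helper below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 2.0, size=500)
# Hypothetical data: strictly positive targets with multiplicative noise.
y = np.exp(0.5 + 1.2 * x + rng.normal(0.0, 0.2, size=500))

# Transform beforehand: log(y) is roughly Gaussian around a linear function of x.
z = np.log(y)

# Minimizing a fixed-variance Gaussian NLL on z is least squares on z.
slope, intercept = np.polyfit(x, z, 1)

def predict(x_new):
    """Map the prediction back to the original scale of y."""
    return np.exp(intercept + slope * x_new)
```

Note that mapping back with `exp` gives the median of $y$ under this model rather than its mean.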

Subhaditya's KB