Loss for univariate regression

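Predict a single scalar output $y \in \mathbb{R}$
Model $y$ with a univariate Normal distribution whose mean is the network output $f [x, ϕ]$
Find parameters $\hat{ϕ}$ that minimize the loss $L [ϕ]$, the negative log-likelihood of the training data
The result is the least squares loss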
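
The step from the Normal distribution to least squares can be written out as follows. This is a sketch that assumes training pairs $\{x_i, y_i\}_{i=1}^{I}$ (the index notation is an assumption, it does not appear in these notes) and a variance $σ^{2}$ that is held fixed:

$$
\begin{aligned}
\hat{ϕ} &= \underset{ϕ}{\operatorname{argmin}} \left[ -\sum_{i=1}^{I} \log \left[ \frac{1}{\sqrt{2πσ^{2}}} \exp \left[ -\frac{(y_i - f [x_i, ϕ])^{2}}{2σ^{2}} \right] \right] \right] \\
&= \underset{ϕ}{\operatorname{argmin}} \left[ \sum_{i=1}^{I} \left( \frac{1}{2} \log [2πσ^{2}] + \frac{(y_i - f [x_i, ϕ])^{2}}{2σ^{2}} \right) \right] \\
&= \underset{ϕ}{\operatorname{argmin}} \left[ \sum_{i=1}^{I} (y_i - f [x_i, ϕ])^{2} \right]
\end{aligned}
$$

The last line drops the $\log$ term and the factor $1 / 2σ^{2}$ because neither depends on $ϕ$, so they do not change the position of the minimum.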

Inference

The network predicts the mean $μ = f [x, ϕ]$ of the Normal distribution over $y$
For a single best point estimate $\hat{y}$, take the maximum of the predicted distribution, which for a Normal distribution is its mean: $\hat{y} = \underset{y}{\operatorname{argmax}} [Pr (y ∣ f [x, \hat{ϕ}], σ^{2})] = f [x, \hat{ϕ}]$

Estimating the variance when it is constant everywhere

Homoscedastic
The least squares solution for $ϕ$ does not depend on the variance, so to estimate it we treat $σ^{2}$ as a learned parameter and minimize the negative log-likelihood with respect to both $ϕ$ and $σ^{2}$
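
A minimal sketch of the homoscedastic case, assuming the mean predictions have already been fitted. For fixed predictions, the negative log-likelihood is minimized over $σ^{2}$ by the mean squared residual. The function names and the data values below are hypothetical, chosen only for illustration.

```python
import math


def gaussian_nll(y, mu, var):
    """Sum of negative log-likelihoods of targets y under Normal(mu_i, var)."""
    return sum(
        0.5 * math.log(2 * math.pi * var) + (yi - mi) ** 2 / (2 * var)
        for yi, mi in zip(y, mu)
    )


def fit_variance(y, mu):
    """Minimizer of the NLL over var for fixed predictions: mean squared residual."""
    return sum((yi - mi) ** 2 for yi, mi in zip(y, mu)) / len(y)


# Hypothetical targets y_i and model predictions f[x_i, phi]
y = [1.0, 2.0, 3.0, 4.0]
mu = [1.1, 1.8, 3.3, 3.9]

var_hat = fit_variance(y, mu)
print("estimated variance:", var_hat)
print("NLL at estimate:", gaussian_nll(y, mu, var_hat))
```

Because the best $ϕ$ is the same for every value of $σ^{2}$, fitting the mean by least squares first and the variance afterwards reaches the same minimum as optimizing both jointly.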

Estimating the variance when it is not constant

Heteroscedastic
Train a network that computes both the mean and the variance as functions of the input
The variance must be positive, but the raw network output might not be. To guarantee positivity, pass that output through the squaring function
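
The heteroscedastic loss can be sketched as below, assuming the network produces two outputs per example: `f1` for the mean and `f2`, whose square gives the variance. The names `f1`, `f2`, and `heteroscedastic_nll` and the data values are hypothetical, and the small `eps` term is an added assumption to avoid dividing by zero. It is not part of the notes above.

```python
import math


def heteroscedastic_nll(y, f1, f2, eps=1e-6):
    """NLL with per-example mean f1_i and variance f2_i squared (plus eps)."""
    total = 0.0
    for yi, mean, raw in zip(y, f1, f2):
        var = raw ** 2 + eps  # squaring makes the variance positive
        total += 0.5 * math.log(2 * math.pi * var) + (yi - mean) ** 2 / (2 * var)
    return total


# Hypothetical targets and raw network outputs; f2 may be negative
y = [1.0, 2.0, 3.0]
f1 = [0.9, 2.2, 2.7]
f2 = [0.5, -0.3, 1.2]

loss = heteroscedastic_nll(y, f1, f2)
print("loss:", loss)
```

The sign of `f2` has no effect on the loss, since only its square enters the variance.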

Homoscedastic vs Heteroscedastic Regression