Bayesian Model Estimation

In contrast to the frequentist approach: an estimator such as the sample mean is sometimes a poor choice because it has high variance. For a real-valued distribution, it can give noticeably different results across repeated trials.
The task is to estimate $θ$ from the data
Bayesian Prior
Now there are two sources of information about the true distribution $p_{X} (θ)$:
- The likelihood $p_{\otimes_{i} X_{i}} (D ∣ θ)$ of $θ$: the empirical data
- The prior plausibility $h (θ)$
- Since these are independent sources, we can combine them by multiplication: $p_{\otimes_{i} X_{i}} (D ∣ θ) \, h (θ)$
  - High values → Candidate model $θ$ is a good estimate
  - Bayesian Posterior
  - Posterior Mean estimate
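
A minimal numerical sketch of the product "likelihood × prior" described above, evaluated on a grid of candidate values and normalised into a posterior. Everything specific here is an assumption for illustration and not part of the notes: a scalar parameter (the mean of a Gaussian with known $σ = 1$), a standard normal prior, three made-up data points, and the hypothetical helper name `grid_posterior`.

```python
import numpy as np

def grid_posterior(data, grid, prior, sigma=1.0):
    """Posterior density on an evenly spaced grid: likelihood * prior, normalised."""
    data = np.asarray(data, dtype=float)
    # Log-likelihood of i.i.d. Gaussian data for every candidate mean on the grid
    # (constants that do not depend on the candidate are dropped).
    log_lik = -0.5 * ((data[None, :] - grid[:, None]) ** 2).sum(axis=1) / sigma**2
    # Multiply likelihood by prior; subtracting the max avoids numerical underflow.
    unnorm = np.exp(log_lik - log_lik.max()) * prior
    step = grid[1] - grid[0]
    # Divide by the Riemann-sum integral so the result integrates to 1.
    return unnorm / (unnorm.sum() * step)

grid = np.linspace(-5.0, 5.0, 2001)   # candidate values of the parameter
prior = np.exp(-0.5 * grid**2)        # assumed N(0, 1) prior, unnormalised
data = [1.8, 2.4, 2.1]                # toy data, assumed for illustration

posterior = grid_posterior(data, grid, prior)
step = grid[1] - grid[0]
# Posterior mean estimate: integral of (parameter * posterior density).
posterior_mean = (grid * posterior).sum() * step
```

Grid values where `posterior` is high correspond to candidate models that both explain the data and are plausible under the prior. This grid approach only works for one or a few parameters.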

Advantages

Integrating over millions of parameters and computing predictions for each parameter setting → infeasible
How to encode or represent the Bayesian posterior when it is very high-dimensional?
- No closed form representation over weights
- Represent data with histograms and use Monte Carlo
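
One way to read the Monte Carlo idea: draw samples from the posterior and summarise them with a histogram instead of a closed-form expression. The sketch below uses a random-walk Metropolis sampler, which is one possible choice and not necessarily the method meant in the notes. The 1-D Gaussian model, the standard normal prior, the toy data and the function names are all assumptions for illustration.

```python
import numpy as np

def log_unnorm_posterior(mu, data, sigma=1.0):
    """log(likelihood * prior) up to an additive constant."""
    log_lik = -0.5 * np.sum((data - mu) ** 2) / sigma**2
    log_prior = -0.5 * mu**2          # assumed N(0, 1) prior
    return log_lik + log_prior

def metropolis(data, n_samples=20000, step=0.5, seed=0):
    """Random-walk Metropolis sampler for a scalar parameter."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    mu = 0.0
    log_p = log_unnorm_posterior(mu, data)
    samples = np.empty(n_samples)
    for i in range(n_samples):
        proposal = mu + step * rng.normal()
        log_p_new = log_unnorm_posterior(proposal, data)
        # Accept with probability min(1, p_new / p_old).
        if np.log(rng.uniform()) < log_p_new - log_p:
            mu, log_p = proposal, log_p_new
        samples[i] = mu
    return samples

samples = metropolis([1.8, 2.4, 2.1])   # toy data, assumed for illustration
kept = samples[2000:]                   # discard an initial burn-in segment
# The histogram is the sampled stand-in for the posterior density.
density, edges = np.histogram(kept, bins=50, density=True)
mc_posterior_mean = kept.mean()
```

Only the unnormalised product of likelihood and prior is needed, so the normalising integral is never computed. Scaling this to millions of weights remains hard in practice.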

(Figure: prior in green, posterior in red.)
The posterior mean estimate is obtained by integrating $\int_{\mathbb{R}} μ \, h (μ ∣ D) \, d μ$
Since this differs from the sample mean → the prior distribution really does influence the estimated model
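
To make the difference from the sample mean concrete, here is a small sketch under assumed conditions: Gaussian data with known noise variance and a Gaussian prior, for which the posterior mean has a standard closed form. The prior parameters, the toy data and the name `normal_posterior_mean` are illustrative assumptions. The second half repeats the experiment over many simulated trials to compare how much each estimate fluctuates.

```python
import numpy as np

def normal_posterior_mean(data, prior_mean=0.0, prior_var=1.0, noise_var=1.0):
    """Posterior mean for a Gaussian likelihood (known variance) and Gaussian prior."""
    data = np.asarray(data, dtype=float)
    n = data.size
    # Precisions (inverse variances) of prior and data add up.
    precision = 1.0 / prior_var + n / noise_var
    return (prior_mean / prior_var + data.sum() / noise_var) / precision

data = [1.8, 2.4, 2.1]                         # toy data, assumed for illustration
sample_mean = float(np.mean(data))             # about 2.1
posterior_mean = normal_posterior_mean(data)   # about 1.575, pulled toward the prior mean 0

# Repeat the experiment: 5000 trials of 3 points each from an assumed N(2, 1).
rng = np.random.default_rng(0)
trials = rng.normal(loc=2.0, scale=1.0, size=(5000, 3))
sample_means = trials.mean(axis=1)
posterior_means = np.array([normal_posterior_mean(t) for t in trials])

var_sample = sample_means.var()
var_posterior = posterior_means.var()
```

With these settings the posterior mean equals the sample mean scaled by $n / (n + 1)$, so it varies less across trials but is biased toward the prior mean. Whether that trade-off helps depends on how well the prior matches the true parameter.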