Bayesian inference#
Suppose that we fit a model with parameters \(\boldsymbol w\) to a dataset \(\boldsymbol D = (\boldsymbol X, \boldsymbol y)\). According to Bayes' formula, the posterior distribution is
\[
p(\boldsymbol w \vert \boldsymbol X, \boldsymbol y) \propto p(\boldsymbol y \vert \boldsymbol X, \boldsymbol w) p(\boldsymbol w).
\]
This relationship is often summarized as
\[
\mathrm{Posterior} = \frac{\mathrm{Likelihood}\times \mathrm{Prior}}{\mathrm{Evidence}}
\]
We are particularly interested in the posterior distribution because it allows us to make predictions.
Q. How can we calculate the evidence \(p(\boldsymbol y \vert \boldsymbol X)\)?
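On a discrete grid of parameter values, the evidence is simply the sum of likelihood times prior over the grid, and dividing by it normalizes the posterior. A minimal sketch, assuming a Bernoulli (coin-flip) likelihood with a uniform prior; the data and grid size are illustrative, not from the text:

```python
import numpy as np

# Illustrative coin-flip data and a grid over the heads probability w
y = np.array([1, 0, 1, 1, 0, 1, 1, 1])     # observed flips (assumed data)
w_grid = np.linspace(0.001, 0.999, 999)    # discretized parameter values

prior = np.ones_like(w_grid) / len(w_grid)                        # uniform p(w)
likelihood = w_grid ** y.sum() * (1 - w_grid) ** (len(y) - y.sum())  # p(y | w)

# Evidence p(y) = sum_w p(y | w) p(w): the normalizing constant
evidence = np.sum(likelihood * prior)

# Posterior p(w | y) = p(y | w) p(w) / p(y)
posterior = likelihood * prior / evidence
print(posterior.sum())               # ~1.0: a proper distribution
print(w_grid[np.argmax(posterior)])  # MAP estimate of w (0.75 here)
```

For a handful of parameters this grid approximation works; in higher dimensions the integral defining the evidence becomes intractable, which is one motivation for conjugate priors below.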
Conjugate distributions#
A likelihood and a prior distribution are called conjugate if the posterior belongs to the same family of distributions as the prior. For example, a Beta prior combined with a Bernoulli likelihood yields a Beta posterior. See also the cookbook, section 14.3.1.
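With a conjugate pair the posterior is available in closed form, with no integration needed. A minimal sketch of the Beta–Bernoulli update, where the posterior parameters are obtained by simple counting (the prior hyperparameters and data below are illustrative assumptions):

```python
import numpy as np
from scipy import stats

a, b = 2.0, 2.0                    # Beta(a, b) prior hyperparameters (assumed)
y = np.array([1, 0, 1, 1, 0, 1])   # Bernoulli observations (assumed data)

# Conjugate update: Beta(a, b) prior + Bernoulli likelihood
# gives a Beta(a + #ones, b + #zeros) posterior
heads, tails = y.sum(), len(y) - y.sum()
posterior = stats.beta(a + heads, b + tails)

print(posterior.mean())          # posterior mean of the success probability
print(posterior.interval(0.95))  # 95% credible interval
```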
Bayes rule#
Since
\[
p(\boldsymbol x , y) = p(\boldsymbol x \vert y) p(y) = p(y \vert \boldsymbol x) p(\boldsymbol x),
\]
we have
\[
p(y \vert \boldsymbol x) = \frac{p(\boldsymbol x \vert y) p(y)}{p(\boldsymbol x)} =
\frac{p(\boldsymbol x \vert y)p(y)}{\int p(\boldsymbol x \vert y) p(y)\,dy}.
\]
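When \(y\) is a discrete class label, the integral in the denominator becomes a sum over classes: this is how a generative classifier turns class-conditional densities \(p(\boldsymbol x \vert y)\) into class posteriors \(p(y \vert \boldsymbol x)\). A minimal sketch with Gaussian class-conditionals; the priors, densities, and test point are made-up assumptions for illustration:

```python
import numpy as np
from scipy import stats

priors = np.array([0.7, 0.3])          # p(y) for classes 0 and 1 (assumed)
conditionals = [stats.norm(0.0, 1.0),  # p(x | y = 0) (assumed)
                stats.norm(2.0, 1.5)]  # p(x | y = 1) (assumed)
x = 1.2                                # test point

# Numerator of Bayes' rule: p(x | y) p(y) for each class
joint = np.array([d.pdf(x) for d in conditionals]) * priors

# Denominator: p(x) = sum_y p(x | y) p(y)
evidence = joint.sum()

posterior = joint / evidence           # p(y | x)
print(posterior)                       # sums to 1 over the classes
```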