# Bayesian inference

Suppose that we fit a model with parameters \(\boldsymbol w\) to a dataset \(\boldsymbol D = (\boldsymbol X, \boldsymbol y)\). By Bayes' formula, the posterior distribution satisfies

\[ p(\boldsymbol w \vert \boldsymbol X, \boldsymbol y) \propto p(\boldsymbol y \vert \boldsymbol X, \boldsymbol w) p(\boldsymbol w). \]

This is also written as

\[ \mathrm{Posterior} = \frac{\mathrm{Likelihood}\times \mathrm{Prior}}{\mathrm{Evidence}} \]

We are particularly interested in the posterior distribution because it allows us to make predictions.
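Concretely, predictions for a new input \(\boldsymbol x^*\) are obtained by averaging the model over the posterior, which gives the posterior predictive distribution:

\[ p(y^* \vert \boldsymbol x^*, \boldsymbol D) = \int p(y^* \vert \boldsymbol x^*, \boldsymbol w)\, p(\boldsymbol w \vert \boldsymbol X, \boldsymbol y)\, d\boldsymbol w. \]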

Q. How can we calculate the evidence \(p(\boldsymbol y \vert \boldsymbol X)\)?
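The evidence is the likelihood marginalized over the prior, \(p(\boldsymbol y \vert \boldsymbol X) = \int p(\boldsymbol y \vert \boldsymbol X, \boldsymbol w)\, p(\boldsymbol w)\, d\boldsymbol w\), which rarely has a closed form. For a one-dimensional parameter it can be approximated by numerical integration. Below is a minimal sketch (an illustration, not part of the original notes) for a coin with unknown heads probability \(w\), a uniform prior, and made-up observations `y`:

```python
import numpy as np

# Observed coin flips: 1 = heads, 0 = tails (illustrative data)
y = np.array([1, 0, 1, 1, 0, 1, 1, 1])
k, n = y.sum(), len(y)

# Grid over the parameter w (probability of heads)
w = np.linspace(1e-6, 1 - 1e-6, 10_001)
dw = w[1] - w[0]

prior = np.ones_like(w)                # uniform prior p(w) on [0, 1]
likelihood = w**k * (1 - w)**(n - k)   # Bernoulli likelihood p(y | w)

# Evidence p(y) = ∫ p(y | w) p(w) dw, approximated by a Riemann sum
evidence = np.sum(likelihood * prior) * dw

# Posterior = likelihood × prior / evidence
posterior = likelihood * prior / evidence

print(f"evidence p(y) ≈ {evidence:.6f}")
# Posterior mean; with a uniform prior this should approach (k+1)/(n+2) = 0.7
print(f"posterior mean ≈ {np.sum(w * posterior) * dw:.4f}")
```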

## Conjugate distributions

Prior and likelihood distributions are called conjugate if the posterior belongs to the same family as the prior. For example, a Beta prior is conjugate to a Bernoulli likelihood: the posterior is again a Beta distribution. See also the cookbook, section 14.3.1.
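A short sketch of the Beta-Bernoulli case (illustrative numbers, not from the original text): with a Beta(\(a, b\)) prior and \(k\) successes in \(n\) Bernoulli trials, the conjugate update gives a Beta(\(a + k,\ b + n - k\)) posterior, so no numerical integration is needed.

```python
from scipy import stats

a, b = 2.0, 2.0    # Beta prior hyperparameters (assumed values)
k, n = 6, 8        # observed: k successes in n Bernoulli trials

# Conjugate update: the posterior stays in the Beta family
posterior = stats.beta(a + k, b + n - k)

print(f"prior:     Beta({a}, {b})")
print(f"posterior: Beta({a + k}, {b + n - k})")
print(f"posterior mean = {posterior.mean():.4f}")   # (a+k)/(a+b+n) = 8/12
```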

## Bayes rule

Since

\[ p(\boldsymbol x , y) = p(\boldsymbol x \vert y) p(y) = p(y \vert \boldsymbol x) p(\boldsymbol x), \]

we have

\[ p(y \vert \boldsymbol x) = \frac{p(\boldsymbol x \vert y) p(y)}{p(\boldsymbol x)} = \frac{p(\boldsymbol x \vert y)p(y)}{\int p(\boldsymbol x \vert y) p(y)\,dy}. \]
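For a discrete label \(y\) the integral in the denominator becomes a sum over classes, which makes the rule easy to apply directly. A minimal sketch with two classes (the prior and likelihood values are illustrative assumptions):

```python
import numpy as np

# Two classes with prior probabilities p(y)
prior = np.array([0.7, 0.3])

# Class-conditional likelihoods p(x | y) for one observed x
likelihood = np.array([0.2, 0.9])

# Evidence p(x) = sum over y of p(x | y) p(y)
evidence = np.sum(likelihood * prior)

# Bayes rule: p(y | x) = p(x | y) p(y) / p(x)
posterior = likelihood * prior / evidence

print(posterior)   # [0.3415 0.6585], sums to 1
```

Note how normalizing by the evidence turns the unnormalized products \(p(\boldsymbol x \vert y)\,p(y)\) into a proper probability distribution over \(y\).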