Independence and random vectors#

Independent events#

Events \(A\) and \(B\) are called independent if

\[ \mathbb P(A\cap B) = \mathbb P(A) \cdot \mathbb P(B). \]
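
A quick way to check the definition is to enumerate outcomes. A minimal sketch in Python with a fair die as the sample space; the events "even number" and "at most \(4\)" turn out to be independent:

```python
from fractions import Fraction

# Fair die: every outcome has probability 1/6
omega = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}      # "even number"
B = {1, 2, 3, 4}   # "at most 4"

P = lambda E: Fraction(len(E), len(omega))

print(P(A & B))      # 1/3
print(P(A) * P(B))   # 1/3, so A and B are independent
```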

Conditional probability#

The probability of the event \(A\) given that another event \(B\) has happened is called the conditional probability \(\mathbb P(A \vert B)\). If \(\mathbb P(B) > 0\), then

\[ \mathbb P(A \vert B) = \frac{\mathbb P(A \cap B)}{\mathbb P(B)}. \]
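
The same enumeration works for conditional probabilities. In the sketch below the events are chosen so that conditioning on \(B\) actually changes the probability of \(A\):

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}   # fair die
A = {2, 4, 6}                # "even number"
B = {4, 5, 6}                # "greater than 3"

P = lambda E: Fraction(len(E), len(omega))

print(P(A & B) / P(B))   # P(A | B) = 2/3
print(P(A))              # 1/2, so here knowing B changes the probability of A
```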

Note that \(\mathbb P(A\vert B) = \mathbb P(A)\) if the events \(A\) and \(B\) are independent. This is quite natural: for independent events, knowing whether \(B\) has happened does not affect \(\mathbb P(A)\).

Law of total probability#

If \(\Omega = A_1 \bigsqcup A_2 \bigsqcup \ldots \bigsqcup A_n\), i.e., the events \(A_1, \ldots, A_n\) are pairwise disjoint and cover the whole sample space, then

(72)#\[ \mathbb P(B) = \sum\limits_{k=1}^n \mathbb P(B\vert A_k) \mathbb P(A_k).\]
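
A minimal numeric sketch with made-up values of \(\mathbb P(A_k)\) and \(\mathbb P(B\vert A_k)\):

```python
import numpy as np

# Hypothetical partition A_1, A_2, A_3 with P(A_k) and P(B | A_k)
P_A = np.array([0.5, 0.3, 0.2])           # probabilities P(A_k), sum to 1
P_B_given_A = np.array([0.1, 0.4, 0.8])   # conditional probabilities P(B | A_k)

P_B = np.sum(P_B_given_A * P_A)           # law of total probability
print(P_B)                                # 0.33
```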

This law of total probability also holds after conditioning on some event \(C\):

\[ \mathbb P(B \vert C) = \sum\limits_{k=1}^n \mathbb P(B\vert A_k, C) \mathbb P(A_k\vert C). \]

Bayes’ rule#

From the equality

\[ \mathbb P(A\vert B) \mathbb P(B) = \mathbb P(A \cap B) = \mathbb P(B\vert A) \mathbb P(A) \]

one can deduce Bayes’ theorem:

\[ \mathbb P(A\vert B) = \frac{\mathbb P(B\vert A) \mathbb P(A)}{\mathbb P(B)} \]

Bayes’ theorem is often combined with the law of total probability (72):

\[ \mathbb P(A_k\vert B) = \frac{\mathbb P(B\vert A_k) \mathbb P(A_k)}{\sum\limits_{j=1}^n \mathbb P(B\vert A_j) \mathbb P(A_j)}. \]
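
Reusing the made-up numbers from the sketch above, the posterior probabilities \(\mathbb P(A_k\vert B)\) are obtained by normalizing the products \(\mathbb P(B\vert A_k)\,\mathbb P(A_k)\):

```python
import numpy as np

P_A = np.array([0.5, 0.3, 0.2])           # priors P(A_k)
P_B_given_A = np.array([0.1, 0.4, 0.8])   # likelihoods P(B | A_k)

posterior = P_B_given_A * P_A
posterior /= posterior.sum()              # the denominator is P(B) from (72)
print(posterior)                          # [0.1515... 0.3636... 0.4848...]
```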

Random vectors#

A random vector \(\boldsymbol \xi = (\xi_1, \ldots, \xi_n) \in \mathbb R^n\) is just a vector of random variables. It can be either discrete or continuous depending on the types of the \(\xi_k\). The pmf of a discrete random vector is a tensor

\[ p_{i_1i_2\ldots i_n} = \mathbb P(\xi_1 = i_1, \xi_2 = i_2, \ldots, \xi_n = i_n) \]

with the properties

\[ p_{i_1i_2\ldots i_n} \geqslant 0,\quad \sum\limits_{i_1,\ldots, i_n}p_{i_1i_2\ldots i_n} = 1. \]
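
For \(n = 2\) such a tensor is just a matrix of probabilities. A minimal sketch with a made-up \(2\times 2\) joint pmf, checking both properties and computing the marginal pmfs:

```python
import numpy as np

# Hypothetical joint pmf of (xi_1, xi_2): p[i, j] = P(xi_1 = i, xi_2 = j)
p = np.array([[0.125, 0.250],
              [0.250, 0.375]])

print((p >= 0).all(), p.sum())   # True 1.0 -- a valid pmf
print(p.sum(axis=1))             # marginal pmf of xi_1: [0.375 0.625]
print(p.sum(axis=0))             # marginal pmf of xi_2: [0.375 0.625]
```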

The pdf \(p_{\boldsymbol \xi}(x_1, \ldots, x_n)\) of a continuous random vector is often called the joint density of the random vector \(\boldsymbol \xi\):

\[ \mathbb P(\boldsymbol \xi \in A) = \int\limits_A p(x_1, \ldots, x_n)\,dx_1\ldots dx_n. \]

The joint density is a nonnegative function that integrates to \(1\).

The expectation of a random vector is calculated elementwise:

\[ \mathbb E \boldsymbol\xi = (\mathbb E\xi_1, \ldots, \mathbb E\xi_n). \]

The covariance matrix of a random vector \(\boldsymbol \xi = (\xi_1, \ldots, \xi_n)\) is the \(n\times n\) matrix \(\mathrm{cov}(\boldsymbol \xi , \boldsymbol \xi) = \mathbb E(\boldsymbol\xi - \mathbb E \boldsymbol \xi)(\boldsymbol\xi - \mathbb E \boldsymbol \xi)^\mathsf{T}\). In other words,

\[ \mathrm{cov}(\boldsymbol \xi , \boldsymbol \xi)_{ij} = \mathrm{cov}(\xi_i, \xi_j), \quad 1\leqslant i, j\leqslant n. \]
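
In practice both characteristics are estimated from a sample. A sketch in NumPy (the distribution and its parameters below are arbitrary, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# 10^5 samples of an arbitrary 3-dimensional random vector (rows are observations)
mean = [0.0, 1.0, 2.0]
cov = [[2.0, 1.0, 0.0],
       [1.0, 2.0, 0.0],
       [0.0, 0.0, 1.0]]
X = rng.multivariate_normal(mean, cov, size=100_000)

print(X.mean(axis=0))            # elementwise expectation, close to [0, 1, 2]
print(np.cov(X, rowvar=False))   # sample covariance matrix, close to cov
```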

Independent random variables#

Discrete random variables \(\xi\) and \(\eta\) are called independent if

(73)#\[ \mathbb P(\xi = x_i, \eta = y_j) = \mathbb P(\xi = x_i)\mathbb P(\eta = y_j) \text{ for all } i, j.\]

Similarly, two continuous random variables \(\xi\) and \(\eta\) are independent if their joint density \(p(x, y)\) equals the product of the marginal densities:

\[ p(x, y) = p_{\xi}(x) p_\eta(y). \]

If \(\xi\) and \(\eta\) are independent, their covariance is \(0\), and \(\mathbb V(\xi + \eta) = \mathbb V \xi + \mathbb V \eta\).

Random variables \(\xi_1, \ldots, \xi_n\) are called mutually independent if their joint pmf or pdf equals the product of the one-dimensional (marginal) ones. In this case the random vector \(\boldsymbol \xi = (\xi_1, \ldots, \xi_n)\) has mutually independent coordinates, and its covariance matrix is diagonal:

\[ \mathrm{cov}(\boldsymbol \xi, \boldsymbol \xi) = \mathrm{diag}\{\mathbb V\xi_1, \ldots, \mathbb V\xi_n\}. \]
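
A simulation sketch with three independent coordinates of arbitrarily chosen distributions; the sample covariance matrix comes out close to diagonal:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Independent coordinates: xi_1 ~ Uniform[0, 1], xi_2 ~ N(0, 4), xi_3 ~ Exp(1)
X = np.column_stack([rng.uniform(0.0, 1.0, n),
                     rng.normal(0.0, 2.0, n),
                     rng.exponential(1.0, n)])

# The sample covariance matrix is close to diag(1/12, 4, 1)
print(np.cov(X, rowvar=False).round(3))
```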

Multivariate normal distribution#

The multivariate normal (gaussian) distribution \(\mathcal{N}(\boldsymbol\mu, \boldsymbol\Sigma)\) is specified by its joint density

\[ p(\boldsymbol x) = \frac1{(2\pi)^{n/2}\sqrt{\det\boldsymbol\Sigma}}\exp\left(-\frac12(\boldsymbol x - \boldsymbol\mu)^T\boldsymbol\Sigma^{-1}(\boldsymbol x - \boldsymbol\mu)\right), \]

where \(\boldsymbol x, \boldsymbol \mu\in\mathbb{R}^n\) and \(\boldsymbol\Sigma\) is a symmetric positive definite matrix of shape \(n\times n\).

If a random vector \(\boldsymbol \xi \sim \mathcal{N}(\boldsymbol\mu, \boldsymbol\Sigma)\), then

\[ \mathbb E\boldsymbol \xi =\boldsymbol \mu, \quad \mathrm{cov}(\boldsymbol \xi, \boldsymbol \xi ) = \boldsymbol \Sigma. \]

Any linear transformation of a gaussian random vector is also gaussian: if \(\boldsymbol \xi \sim \mathcal{N}(\boldsymbol\mu, \boldsymbol\Sigma)\) and \(\boldsymbol \eta = \boldsymbol{A\xi} + \boldsymbol b\), then

\[ \boldsymbol \eta \sim \mathcal{N}(\boldsymbol{A\mu} + \boldsymbol b, \boldsymbol{A\Sigma A}^T). \]
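
A sketch using NumPy and SciPy (all numbers are arbitrary): evaluate the density of \(\mathcal N(\boldsymbol\mu, \boldsymbol\Sigma)\) at a point and verify the linear-transformation rule empirically.

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([0.0, 1.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])

# Density of N(mu, Sigma) at the origin
print(multivariate_normal(mean=mu, cov=Sigma).pdf([0.0, 0.0]))

# Linear transformation eta = A xi + b: check its parameters empirically
rng = np.random.default_rng(2)
xi = rng.multivariate_normal(mu, Sigma, size=200_000)
A = np.array([[1.0, 2.0],
              [0.0, 3.0]])
b = np.array([1.0, -1.0])
eta = xi @ A.T + b

print(eta.mean(axis=0))            # close to A @ mu + b = [3, 2]
print(np.cov(eta, rowvar=False))   # close to A @ Sigma @ A.T = [[8, 7.5], [7.5, 9]]
```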

Exercises#

  1. A family has two children, and at least one of them is a boy. What is the probability that both children are boys?

  2. Among the population, \(33.7\%\) have the first blood type, \(37.5\%\) the second, \(20.9\%\) the third, and \(7.9\%\) the fourth. A blood transfusion must take into account the blood types of the donor and the recipient:

    • a recipient with the fourth blood type can receive blood of any type;

    • recipients with the second or third blood type can receive blood of the same type or of the first type;

    • recipients with the first blood type can receive only blood of the first type.

    What is the probability that a transfusion is admissible for a randomly chosen donor-recipient pair?

  3. Suppose that \(5\) men out of \(100\) and \(25\) women out of \(10000\) are colorblind. A colorblind person is chosen at random. What is the probability that this person is a man?

  4. Show that the covariance matrix of any random vector is symmetric and positive semidefinite.

  5. Find the pdf of a gaussian random vector \(\boldsymbol \xi \in \mathbb R^n\) with mutually independent coordinates.