Random variables#
A random variable is a function \(\xi \colon \Omega \to \mathbb R\), where \(\Omega\) is the set of events or outcomes, also called the support of \(\xi\). Depending on their support \(\Omega\), all random variables of practical interest are divided into two big groups: discrete and continuous.
A lot of useful tools for working with discrete and continuous distributions can be found in scipy.stats.
Discrete distributions#
All discrete random variables have either finite or countable support:
A discrete random variable \(\xi\) is defined by its probability mass function (pmf):
Every pmf must satisfy the following conditions:
Cumulative distribution function (cdf) is
Measures of central tendency#
Let \(\xi\) be a discrete random variable.
Expectation (mean) of \(\xi\) is
Expectation is a linear operation: \(\mathbb E(a\xi + b\eta) = a \mathbb E \xi + b \mathbb E \eta\).
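Linearity of expectation can be verified directly from a joint pmf. Below is a small sketch with a made-up joint distribution of two discrete random variables (the table `P` is purely illustrative):

```python
import numpy as np

# A hypothetical joint pmf of xi and eta, each taking values 0, 1, 2;
# rows index the value of xi, columns the value of eta.
P = np.array([[0.10, 0.05, 0.05],
              [0.20, 0.10, 0.10],
              [0.15, 0.15, 0.10]])
vals = np.array([0, 1, 2])

E_xi = (vals * P.sum(axis=1)).sum()    # marginal expectation of xi
E_eta = (vals * P.sum(axis=0)).sum()   # marginal expectation of eta

# E(2*xi + 3*eta) computed directly from the joint pmf
E_lin = sum(P[i, j] * (2 * vals[i] + 3 * vals[j])
            for i in range(3) for j in range(3))

print(E_lin, 2 * E_xi + 3 * E_eta)  # the two numbers coincide
```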
Variance of \(\xi\) is
Note that \(\mathbb V\xi \geqslant 0\) for all \(\xi\).
Question
In which cases is the equality \(\mathbb V \xi = 0\) possible?
Is variance linear?
Does the equality \(\mathbb V(a\xi + b\eta) = a \mathbb V \xi + b \mathbb V \eta\) hold?
Answer
Almost never. The correct formula is
\[
\mathbb V(a\xi + b\eta) = a^2 \mathbb V \xi + b^2 \mathbb V \eta + 2ab\,\mathrm{cov}(\xi, \eta),
\]
where \(\mathrm{cov}(\xi, \eta) = \mathbb E(\xi - \mathbb E\xi)(\eta - \mathbb E\eta)\).
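A minimal illustration of why the covariance term matters, taking \(\eta = \xi\) with \(\xi \sim \mathrm{Bern}(0.3)\):

```python
from scipy.stats import bernoulli

p = 0.3
v = bernoulli.var(p)     # V(xi) = p * (1 - p) = 0.21

# Take eta = xi (a fully dependent pair). Then xi + eta = 2 * xi, so
# V(xi + eta) = 4 * V(xi), while the "naive" sum V(xi) + V(eta) = 2 * V(xi).
# The gap is exactly the covariance term: 2 * cov(xi, xi) = 2 * V(xi).
print(4 * v, 2 * v)
```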
Standard deviation is equal to the square root of variance: \(\mathrm{sd}(\xi) = \sqrt{\mathbb V\xi}\).
Median of \(\xi\) is defined as a number \(m\) for which
Mode of \(\xi\) is the point of maximum of its pmf:
Bernoulli distribution#
A Bernoulli trial has two outcomes: \(\Omega = \{0, 1\}\), where \(0\) = "failure" and \(1\) = "success",
A typical example of a Bernoulli trial is tossing a coin. If the coin is fair then \(p=\frac 12\).
Is a real coin fair?
According to a recent study, the probability that a tossed coin lands on the same side it started from is about \(0.508\).
A Bernoulli random variable \(\xi \sim \mathrm{Bern}(p)\) is the indicator of success:
If \(\xi \sim \mathrm{Bern}(p)\), then \(\mathbb E\xi = p\) and \(\mathbb V\xi = p(1-p)\).
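A quick scipy check of the mean \(\mathbb E\xi = p\) and variance \(\mathbb V\xi = p(1-p)\):

```python
from scipy.stats import bernoulli

p = 0.3
print(bernoulli.mean(p), bernoulli.var(p))  # p and p * (1 - p)
```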
In machine learning, the Bernoulli distribution models the output of a binary classifier.
Bernoulli sampling in scipy:
from scipy.stats import bernoulli
bernoulli.rvs(0.3, size=15)
array([1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1])
Binomial distribution#
The number of successes in \(n\) independent Bernoulli trials is called a binomial random variable: \(\xi \sim \mathrm{Bin}(n, p)\) if
The pmf of \(\xi \sim \mathrm{Bin}(n, p)\) is
By the binomial theorem, this is a valid probability distribution, since
If \(\xi \sim \mathrm{Bin}(n, p)\), then \(\mathbb E\xi = np\) and \(\mathbb V\xi = np(1-p)\).
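The binomial mean \(\mathbb E\xi = np\) and variance \(\mathbb V\xi = np(1-p)\) can be checked with scipy:

```python
from scipy.stats import binom

n, p = 10, 0.5
print(binom.mean(n, p), binom.var(n, p))  # n*p = 5.0 and n*p*(1-p) = 2.5
```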
from scipy.stats import binom
binom.rvs(10, 0.5, size=15)
array([4, 5, 4, 6, 6, 2, 7, 5, 4, 3, 6, 2, 4, 3, 5])
Geometric distribution#
The number of independent Bernoulli trials needed to obtain the first success is called a geometric random variable. Hence, \(\xi \sim \mathrm{Geom}(p)\) if
Note that the geometric distribution has countable support. Since \(\sum\limits_{k=1}^\infty q^{k-1} = \frac 1{1-q} = \frac 1p\) if \(0 < q < 1\), the probabilities \(p_k\) sum to \(1\).
If \(\xi \sim \mathrm{Geom}(p)\), then \(\mathbb E\xi = \frac 1p\) and \(\mathbb V\xi = \frac{1-p}{p^2}\).
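The formulas \(\mathbb E\xi = \frac 1p\) and \(\mathbb V\xi = \frac{1-p}{p^2}\) agree with scipy:

```python
from scipy.stats import geom

p = 0.1
print(geom.mean(p), geom.var(p))  # 1/p = 10 and (1-p)/p**2 = 90
```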
from scipy.stats import geom
print("High success probability:")
print(geom.rvs(0.8, size=15))
print("Low success probability:")
print(geom.rvs(0.1, size=15))
High success probability:
[1 2 1 1 1 1 1 1 1 2 1 1 1 1 2]
Low success probability:
[35 4 4 29 10 8 1 10 25 6 10 7 12 8 7]
Poisson distribution#
A random variable \(\xi\) has Poisson distribution \(\mathrm{Pois}(\lambda)\), \(\lambda > 0\), if
If \(\xi \sim \mathrm{Pois}(\lambda)\), then \(\mathbb E\xi = \mathbb V\xi = \lambda\).
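For the Poisson distribution the mean and variance coincide, which scipy confirms:

```python
from scipy.stats import poisson

lam = 10
print(poisson.mean(lam), poisson.var(lam))  # both equal lam
```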
from scipy.stats import poisson
poisson.rvs(10, size=15)
array([ 8, 13, 8, 6, 11, 3, 12, 7, 3, 9, 9, 8, 13, 14, 12])
In some cases the Poisson distribution can serve as an approximation to the binomial one.
Theorem (Poisson)
Let \(\xi \sim \mathrm{Bin}(n, p_n)\) and \(\lim\limits_{n \to \infty} np_n = \lambda > 0\). Then
In other words,
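The quality of this approximation for large \(n\) and small \(p\) is easy to see numerically; here \(n = 1000\) and \(p = 0.005\), so \(\lambda = np = 5\) (the particular values are chosen for illustration):

```python
from scipy.stats import binom, poisson

n, p = 1000, 0.005          # lambda = n * p = 5
for k in range(5):
    print(k, binom.pmf(k, n, p), poisson.pmf(k, n * p))
# the binomial and Poisson probabilities agree to about three decimal places
```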
Continuous distributions#
The support \(\Omega\) of a continuous random variable \(\xi\) is usually a subset of \(\mathbb R\). In this case \(\xi\) is specified by its probability density function (pdf) \(p_\xi(x)\) such that
Any pdf must be nonnegative and \(\int\limits_{-\infty}^\infty p_\xi(x)\, dx = 1\). Also,
and, consequently, the derivative of cdf is equal to pdf: \(F_\xi'(x) = p_\xi(x)\).
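The identity \(F_\xi'(x) = p_\xi(x)\) can be checked numerically, e.g. for the standard normal distribution (the point \(x = 0.7\) and step \(h\) are arbitrary):

```python
from scipy.stats import norm

# central finite difference of the N(0, 1) cdf at x = 0.7
x, h = 0.7, 1e-6
deriv = (norm.cdf(x + h) - norm.cdf(x - h)) / (2 * h)
print(deriv, norm.pdf(x))  # the two values agree to high precision
```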
Measures of central tendency#
Let \(\xi\) be a continuous random variable whose pdf is \(p_\xi(x)\).
Expectation (mean) of \(\xi\) is
Variance of \(\xi\) is
Standard deviation is equal to the square root of variance: \(\mathrm{sd}(\xi) = \sqrt{\mathbb V\xi}\).
Median of \(\xi\) is defined as a number \(m\) for which \(F_\xi(m) = \frac 12\).
Mode of \(\xi\) is the point of maximum of its pdf:
Uniform distribution#
A uniform random variable has constant pdf: \(\xi \sim U[a, b]\) if
If \(\xi \sim U[a,b]\), then \(\mathbb E\xi = \frac{a+b}2\) and \(\mathbb V\xi = \frac{(b-a)^2}{12}\).
In scipy, uniform random samples are drawn from the interval \([\mathrm{loc}, \mathrm{loc} + \mathrm{scale}]\):
from scipy.stats import uniform
uniform.rvs(loc=-5, scale=15, size=10)
array([-2.13506683, 0.89582651, 4.29858172, 3.61820416, 2.59606492,
6.7227537 , -1.90840452, 8.738038 , 3.13103127, 7.32456515])
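For the same interval \([-5, 10]\), the uniform mean \(\frac{a+b}2\) and variance \(\frac{(b-a)^2}{12}\) can be checked via scipy's loc/scale parametrization:

```python
from scipy.stats import uniform

a, b = -5, 10
u = uniform(loc=a, scale=b - a)   # U[a, b] with loc = a, scale = b - a
print(u.mean(), u.var())          # (a+b)/2 = 2.5 and (b-a)**2/12 = 18.75
```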
Normal distribution#
A random variable \(\xi\) has normal (or Gaussian) distribution \(\mathcal N(\mu, \sigma^2)\) if its pdf equals
The parameters of the normal distribution \(\mathcal N(\mu, \sigma^2)\) are its mean and variance: \(\mathbb E\xi = \mu\), \(\mathbb V\xi = \sigma^2\).
\(\mathcal N(0, 1)\) is called standard normal distribution.
from scipy.stats import norm
print("Samples from N(0,1):")
print(norm.rvs(size=5))
print("Samples from N(5, 10):")
print(norm.rvs(loc=5, scale=10, size=5))
print("Samples from N(0, 0.1):")
print(norm.rvs(scale=0.1, size=5))
Samples from N(0,1):
[-0.04244319 0.90050472 -0.51269944 3.10121726 0.87629146]
Samples from N(5, 10):
[ -1.44184941 -2.79935985 -25.54053636 -6.53836484 8.34693174]
Samples from N(0, 0.1):
[ 0.05221489 -0.12663539 -0.10321535 0.04337056 0.05200199]
Exponential distribution#
Exponential random variable \(\xi \sim \mathrm{Exp}(\lambda)\), \(\lambda > 0\), has pdf
If \(\xi \sim \mathrm{Exp}(\lambda)\), then \(\mathbb E\xi = \frac 1\lambda\) and \(\mathbb V\xi = \frac 1{\lambda^2}\).
scipy.stats.expon generates random samples from \(\mathrm{Exp}\big(\frac 1{\mathrm{scale}}\big)\) shifted by loc:
from scipy.stats import expon
expon.rvs(loc=5, scale=6, size=10)
array([13.42358255, 5.85005263, 38.40230661, 5.90709704, 9.20047533,
7.90582941, 6.25514 , 16.05320006, 17.45020631, 16.60125856])
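With no shift (loc = 0), the mean \(\frac 1\lambda\) and variance \(\frac 1{\lambda^2}\) follow directly from the scale parametrization:

```python
from scipy.stats import expon

lam = 0.5
d = expon(scale=1 / lam)   # Exp(lambda) with lambda = 0.5, no shift
print(d.mean(), d.var())   # 1/lam = 2.0 and 1/lam**2 = 4.0
```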
Gamma distribution#
A random variable \(\xi\) has gamma distribution \(\mathrm{Gamma}(\alpha, \beta)\), \(\alpha, \beta > 0\), if
where \(\Gamma(\alpha) = \int\limits_0^\infty x^{\alpha - 1}e^{-x}\,dx\).
Exercise
Show that \(\mathrm{Exp}(\lambda)\) is a special case of \(\mathrm{Gamma}(\alpha, \beta)\).
If \(\xi \sim \mathrm{Gamma}(\alpha, \beta)\), then \(\mathbb E\xi = \frac\alpha\beta\) and \(\mathbb V\xi = \frac\alpha{\beta^2}\).
from scipy.stats import gamma
a = 1
gamma(a).rvs(size=10)
array([0.79076424, 1.50102604, 0.59402698, 2.12058575, 2.50603945,
0.11652061, 0.80429701, 2.07803267, 0.6086475 , 0.56575866])
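In scipy's parametrization, gamma takes a shape parameter a \(= \alpha\) and a scale parameter; setting scale \(= 1/\beta\) matches the rate parametrization above. The mean \(\frac\alpha\beta\) and variance \(\frac\alpha{\beta^2}\) check out (the values of \(\alpha\) and \(\beta\) are arbitrary):

```python
from scipy.stats import gamma

alpha, beta_ = 2.0, 3.0
d = gamma(a=alpha, scale=1 / beta_)   # Gamma(alpha, beta) with rate beta
print(d.mean(), d.var())              # alpha/beta = 2/3 and alpha/beta**2 = 2/9
```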
Beta distribution#
A random variable \(\xi\) has beta distribution \(\mathrm{Beta}(\alpha, \beta)\), \(\alpha, \beta > 0\), if
where \(B(\alpha, \beta) = \int\limits_0^1 x^{\alpha - 1} (1-x)^{\beta - 1}\,dx\).
from scipy.stats import beta
beta.rvs(a=0.5, b=3, size=5)
array([2.95906460e-01, 2.31625931e-01, 2.02362660e-02, 4.77648785e-03,
7.09880652e-05])
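Although not stated above, the standard formulas \(\mathbb E\xi = \frac{\alpha}{\alpha+\beta}\) and \(\mathbb V\xi = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}\) hold for the beta distribution, and scipy agrees:

```python
from scipy.stats import beta

a, b = 0.5, 3
print(beta.mean(a, b), beta.var(a, b))
# mean a/(a+b) = 1/7; variance a*b/((a+b)**2 * (a+b+1))
```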
Student \(t\)-distribution#
A random variable \(\xi\) has Student \(t\)-distribution with \(\nu\) degrees of freedom if
This distribution is similar to \(\mathcal N(0,1)\), but for small values of \(\nu\) it has much heavier tails. Because of this, \(\mathbb V\xi\) does not exist if \(\nu \leqslant 2\), and for \(\nu \leqslant 1\) even \(\mathbb E\xi\) does not exist. Otherwise, \(\mathbb E\xi = 0\) and \(\mathbb V\xi = \frac{\nu}{\nu - 2}\).
from scipy.stats import t
print("Light tails:")
print(t(5).rvs(size=8))
print("Heavy tails:")
print(t(1).rvs(size=10))
print("Extremely heavy tails:")
print(t(0.2).rvs(size=4))
Light tails:
[ 0.22826629 1.39464637 -0.5742551 -0.33687555 1.31954787 2.3753305
-0.87740398 0.4348478 ]
Heavy tails:
[ 0.29722105 15.32458687 1.05746483 -0.46758395 -3.27275471 -1.32018701
0.38763233 0.16052448 -1.69824294 -0.07677401]
Extremely heavy tails:
[ 0.07609448 -1.54488847 -0.33322246 -5.03746408]
As \(\nu \to +\infty\), the \(t\)-distribution tends to \(\mathcal N(0,1)\):
nu = 1002
t(nu).mean(), t(nu).var()
(0.0, 1.002)
Laplace distribution#
The pdf of a Laplace random variable \(\xi \sim \mathrm{Laplace}(\mu, b)\), \(b > 0\), is
If \(\xi \sim \mathrm{Laplace}(\mu, b)\), then \(\mathbb E\xi = \mu\) and \(\mathbb V\xi = 2b^2\).
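The Laplace mean \(\mathbb E\xi = \mu\) and variance \(\mathbb V\xi = 2b^2\) can be verified via scipy's loc/scale parametrization:

```python
from scipy.stats import laplace

mu, b = 1.0, 2.0
d = laplace(loc=mu, scale=b)
print(d.mean(), d.var())   # mu = 1.0 and 2*b**2 = 8.0
```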
Exercises#
Choose all possible supports of the uniform distribution:
\(\Omega = \varnothing\)
\(\Omega = \{1, 2, \ldots, n\}\)
\(\Omega = \mathbb N\)
\(\Omega = [a, b)\)
\(\Omega = [0, +\infty)\)
\(\Omega = \mathbb R\)
Every day exactly \(1000\) people see company X's advertisement in search results. Yesterday \(50\) of them clicked on the ad. What is the probability that at least \(50\) people will click on the ad today?
It is known that, on average, about \(50\) users per day click on company X's ad in search results. The number of impressions is quite large and may vary from day to day. What is the probability that at least \(50\) clicks on the ad will be made today?
Show that the mean, median, and mode of \(\mathcal N(\mu, \sigma^2)\) are all equal to \(\mu\).
Find mode of \(\mathrm{Gamma}(\alpha, \beta)\).
Find mode of \(\mathrm{Beta}(\alpha, \beta)\).
Let \(\xi_\nu \sim \mathrm{Student}(\nu)\), \(\xi \sim \mathcal N(0, 1)\). Prove that \(\xi_\nu \stackrel{D}{\to} \xi\) as \(\nu \to +\infty\), i.e.,
\[ \lim\limits_{\nu \to +\infty}p_{\xi_\nu}(x) = p_\xi(x) \text{ for all } x\in \mathbb R. \]
n = 1000   # daily number of ad impressions
p = 0.05   # click probability estimated from yesterday: 50 / 1000
lam = 50   # average number of clicks per day
from scipy.stats import binom, poisson
xi = binom(n, p)
eta = poisson(lam)
print("Binomial answer:", 1 - xi.cdf(49))
print("Poisson answer:", 1 - eta.cdf(49))
Binomial answer: 0.5202589429126758
Poisson answer: 0.5188083154720433