Probability Space

Definition. A probability space is a measurable space \((\Omega,\mathcal{F})\) (where \(\mathcal{F}\) is a \(\sigma\)-algebra on \(\Omega\)) together with a probability measure \(P:\mathcal{F}\to[0,1]\). A measure \(\mu\) is called a probability measure if \(\mu(\Omega)=1\).

The whole space \(\Omega\) is the space of outcomes, and \(\mathcal{F}\subseteq \mathcal{P}(\Omega)\) is the space of events. An event with probability \(1\) is said to happen almost surely.
Like every measure, a probability measure is monotone, countably subadditive, and continuous from below and from above; continuity from above needs no extra finiteness hypothesis because \(P(\Omega)=1\).
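
As a concrete illustration, here is a minimal sketch of a finite probability space: a fair six-sided die with \(\mathcal{F}=\mathcal{P}(\Omega)\) and the uniform measure. The example and the names `omega`, `prob` and `events` are assumptions made for illustration. It checks normalisation, monotonicity and (finite) subadditivity by brute force.

```python
from fractions import Fraction
from itertools import chain, combinations

# Assumed example: a fair six-sided die.
omega = frozenset(range(1, 7))

def prob(event):
    """Uniform probability measure on subsets of omega."""
    return Fraction(len(event), len(omega))

# The sigma-algebra is the full power set of omega (64 events).
events = [frozenset(c) for c in chain.from_iterable(
    combinations(sorted(omega), r) for r in range(len(omega) + 1))]

# Normalisation: P(omega) = 1.
total_mass = prob(omega)

# Monotonicity: A subset of B implies P(A) <= P(B).
monotone = all(prob(a) <= prob(b) for a in events for b in events if a <= b)

# Subadditivity: P(A union B) <= P(A) + P(B).
subadditive = all(prob(a | b) <= prob(a) + prob(b)
                  for a in events for b in events)

print(total_mass, monotone, subadditive)
```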

Definition. A real-valued function \(X:\Omega \to \mathbb{R}\) is said to be a random variable if for every Borel set \(B\subset \mathbb{R}\) we have \(X^{-1}(B) = \{\omega:X(\omega) \in B\}\in \mathcal{F}\). In other words, it is a measurable function from the measurable space \((\Omega,\mathcal{F})\) to the Borel space \((\mathbb{R},\mathcal{B}(\mathbb{R}))\). We write \(X\in\mathcal{F}\) and say that \(X\) is \(\mathcal{F}\)-measurable.

For a Borel set \(A\subset \mathbb{R}\), the notation \(X\in A\) denotes the event \(X^{-1}(A) = \{\omega: X(\omega)\in A\}\), read as “\(X\) takes a value in \(A\)”.
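
Measurability is a real restriction when \(\mathcal{F}\) is smaller than the power set. The sketch below uses an assumed toy space with four outcomes and the \(\sigma\)-algebra generated by the partition \(\{1,2\},\{3,4\}\); the functions `preimage` and `is_measurable` are illustrative helpers. A function is measurable here exactly when it is constant on each block of the partition.

```python
# Assumed toy example: four outcomes and the sigma-algebra generated by
# the partition {1, 2}, {3, 4}.
omega = frozenset({1, 2, 3, 4})
sigma_algebra = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), omega}

def preimage(X, B):
    """The event {X in B} = X^{-1}(B) as a subset of omega."""
    return frozenset(w for w in omega if X[w] in B)

def is_measurable(X):
    """On a finite space it suffices to check X^{-1}({v}) for each value v,
    because every preimage is a finite union of such sets."""
    return all(preimage(X, {v}) in sigma_algebra for v in set(X.values()))

X = {1: 0.0, 2: 0.0, 3: 1.0, 4: 1.0}   # constant on {1, 2} and on {3, 4}
Y = {1: 0.0, 2: 1.0, 3: 1.0, 4: 1.0}   # separates outcome 1 from outcome 2

print(preimage(X, {1.0}))               # the event {X in {1.0}}
print(is_measurable(X), is_measurable(Y))
```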

A sequence \(X_n\) of random variables is said to converge almost surely (a.s.) to a random variable \(X\) iff \(P(X_n\to X) = P(\{\omega: X_n(\omega)\to X(\omega)\}) = 1\).
A sequence \(X_n\) is said to converge in probability to \(X\) iff for every \(\epsilon > 0\), \(\lim_{n\to \infty} P(|X_n - X| > \epsilon) = 0\). Almost sure convergence implies convergence in probability.
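
The converse implication fails in general. A standard counterexample (not taken from the text above) is the "typewriter" sequence on \(\Omega=[0,1)\) with Lebesgue measure: write \(n = 2^m + k\) with \(0\le k<2^m\) and let \(X_n\) be the indicator of \([k/2^m,(k+1)/2^m)\). The sketch below, with illustrative names `block`, `X` and `prob_exceeds`, shows that \(P(|X_n|>\epsilon)=2^{-m}\to 0\) while every fixed outcome is hit once in each block, so \(X_n(\omega)\) does not converge.

```python
from fractions import Fraction

def block(n):
    """Write n = 2**m + k with 0 <= k < 2**m and return (m, k)."""
    m = n.bit_length() - 1
    return m, n - 2**m

def X(n, w):
    """Indicator of the interval [k / 2**m, (k + 1) / 2**m) evaluated at w."""
    m, k = block(n)
    return 1 if Fraction(k, 2**m) <= w < Fraction(k + 1, 2**m) else 0

def prob_exceeds(n):
    """P(|X_n - 0| > eps) for any 0 < eps < 1: the length of the interval."""
    m, _ = block(n)
    return Fraction(1, 2**m)

# The probabilities shrink to 0, so X_n -> 0 in probability.
print([str(prob_exceeds(n)) for n in (1, 2, 4, 8, 16)])

# For a fixed outcome w, X_n(w) = 1 exactly once in every block
# 2**m <= n < 2**(m + 1), so X_n(w) does not converge to 0.
w = Fraction(1, 3)
hits_per_block = [sum(X(n, w) for n in range(2**m, 2**(m + 1)))
                  for m in range(6)]
print(hits_per_block)
```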

Definition. The distribution of a random variable \(X\) is the probability measure \(X_\ast P\) on \(\mathbb{R}\) induced by \(X\), defined by setting \(X_\ast P(A) = P(X\in A) = P(X^{-1}(A))\) for Borel sets \(A\in\mathcal{B}(\mathbb{R})\). This is simply the pushforward of \(P\) along \(X\) from \((\Omega,\mathcal{F})\) to \((\mathbb{R},\mathcal{B}(\mathbb{R}))\).
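
On a finite space the induced measure can be computed by summing the mass of each fibre \(X^{-1}(\{v\})\). The following sketch assumes two fair dice and takes \(X\) to be the sum of the faces; the helper `law` is an illustrative name for \(A\mapsto X_\ast P(A)\).

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Assumed example: two fair dice, uniform measure on 36 outcomes.
omega = list(product(range(1, 7), repeat=2))
P = {w: Fraction(1, 36) for w in omega}

def X(w):
    """Random variable: the sum of the two faces."""
    return w[0] + w[1]

# Induced measure X_* P: collect the mass of each fibre X^{-1}({v}).
distribution = defaultdict(Fraction)
for w, p in P.items():
    distribution[X(w)] += p

def law(A):
    """X_* P(A) = P(X in A) for a set A of real numbers."""
    return sum((p for v, p in distribution.items() if v in A), Fraction(0))

print(distribution[7])     # P(X = 7)
print(law({2, 3, 4}))      # P(X <= 4)
```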

Usually the distribution of a random variable \(X\) is described by giving its distribution function \(F:\mathbb{R}\to\mathbb{R}\) defined as \(F(x) = P(X\leq x)\), where \(P(X\leq x)\) is the probability that \(X\) takes a value less than or equal to \(x\), namely \(P(X\leq x) = P(X^{-1}((-\infty,x]))\).

The following holds for any distribution function \(F\):

1. \(F\) is nondecreasing.
2. \(\lim_{x\to\infty}F(x)=1\) and \(\lim_{x\to -\infty}F(x)=0\).
3. \(F\) is right continuous, meaning that \(\lim_{y\downarrow x} F(y) = F(x)\).
4. With \(F(x-) = \lim_{y\uparrow x}F(y)\), we have \(F(x-) = P(X < x)\).
5. \(P(X=x) = F(x) - F(x-)\).
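
These properties can be checked by hand on a discrete example. The sketch below assumes \(X\) is a fair six-sided die; `F_left` computes \(P(X<x)\) directly from the mass function rather than as a numerical limit, which is the identification stated in the list above.

```python
from fractions import Fraction

# Assumed example: X is a fair six-sided die.
pmf = {v: Fraction(1, 6) for v in range(1, 7)}

def F(x):
    """Distribution function F(x) = P(X <= x)."""
    return sum((p for v, p in pmf.items() if v <= x), Fraction(0))

def F_left(x):
    """Left limit F(x-), computed as P(X < x)."""
    return sum((p for v, p in pmf.items() if v < x), Fraction(0))

grid = [Fraction(k, 4) for k in range(-8, 40)]   # points from -2 to 9.75

nondecreasing = all(F(a) <= F(b) for a, b in zip(grid, grid[1:]))
limits = (F(grid[0]), F(grid[-1]))               # 0 far left, 1 far right
jumps = {v: F(v) - F_left(v) for v in pmf}       # P(X = v) = F(v) - F(v-)

# F jumps by 1/6 at each face and has no jump in between.
print(nondecreasing, limits, jumps[3], F(3.5) - F_left(3.5))
```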

In fact, if a function \(F:\mathbb{R}\to\mathbb{R}\) satisfies the first three of the above properties, it is the distribution function of some random variable: let \(\Omega = (0,1)\), \(\mathcal{F} = \mathcal{B}((0,1))\) and let \(P\) be the Lebesgue measure. For \(\omega \in (0,1)\) put \(X(\omega) = \sup\{y\in\mathbb{R}: F(y) < \omega\}\). Because \(F\) is right continuous, \(\{\omega: X(\omega)\leq x\} = \{\omega: \omega\leq F(x)\}\), and hence \(P(X\leq x) = F(x)\). This \(X\) is called the inverse of \(F\) and denoted \(F^{-1}\), even though \(F\) may not be 1-1 and onto. The distribution \(X_\ast P\) of this \(X\) is then a probability measure on \((\mathbb{R},\mathcal{B}(\mathbb{R}))\) whose distribution function is \(F\).
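
For a step function the supremum in this construction is simply the smallest atom \(v\) with \(F(v)\geq\omega\). The sketch below assumes a three-point law with made-up values and masses, implements \(X\) with a binary search, and checks on a grid that \(X(\omega)\leq x\) holds exactly when \(\omega\leq F(x)\).

```python
from bisect import bisect_left
from fractions import Fraction

# Assumed example: a three-point law with atoms 0, 1, 5.
values = [0, 1, 5]
masses = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

# Cumulative values F(v) at the atoms: [1/2, 5/6, 1].
cumulative = []
running = Fraction(0)
for p in masses:
    running += p
    cumulative.append(running)

def F(x):
    """Step distribution function with jumps at `values`."""
    return sum((p for v, p in zip(values, masses) if v <= x), Fraction(0))

def X(w):
    """Generalised inverse sup{y : F(y) < w} for w in (0, 1).

    For a step function this is the smallest atom v with F(v) >= w.
    """
    return values[bisect_left(cumulative, w)]

samples = [X(Fraction(1, 4)), X(Fraction(1, 2)),
           X(Fraction(2, 3)), X(Fraction(9, 10))]

# Check {X <= x} = {w <= F(x)} on a grid of outcomes and thresholds.
checks = all((X(w) <= x) == (w <= F(x))
             for w in [Fraction(k, 60) for k in range(1, 60)]
             for x in [-1, 0, Fraction(1, 2), 1, 3, 5, 7])

print(samples, checks)
```

Feeding uniform random numbers from \((0,1)\) into `X` would then produce samples with distribution function `F`, which is the usual inverse transform method.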

Two random variables \(X,Y\) that induce the same distribution on the Borel space are said to be equal in distribution, which is denoted by \(X=_d Y\).
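
Equality in distribution is much weaker than equality as functions. In the sketch below (an assumed example with one roll of a fair die), \(X(\omega)=\omega\) and \(Y(\omega)=7-\omega\) have the same law although they never take the same value.

```python
from fractions import Fraction

omega = range(1, 7)                      # assumed: one roll of a fair die
P = {w: Fraction(1, 6) for w in omega}

def X(w):
    return w

def Y(w):
    return 7 - w

def law(Z):
    """Distribution of Z as a dict mapping value -> probability."""
    out = {}
    for w, p in P.items():
        out[Z(w)] = out.get(Z(w), Fraction(0)) + p
    return out

same_law = law(X) == law(Y)                          # X =_d Y
equal_somewhere = any(X(w) == Y(w) for w in omega)   # X(w) = Y(w) for some w?

print(same_law, equal_somewhere)
```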

If the distribution function \(F(x) = P(X\leq x)\) has the form \(F(x) = \int_{-\infty}^{x}f(y) dy\) we say that \(X\) has density function \(f\). We have that

\[P(X=x) = \lim_{\epsilon \to 0}\int_{x-\epsilon}^{x+\epsilon}f(y)\,dy = 0\]

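but it is often useful to think of \(f(x)\) as playing the role of \(P(X=x)\): for small \(dx\), \(f(x)\,dx\) is approximately the probability that \(X\) falls in an interval of length \(dx\) around \(x\). At every point where \(f\) is continuous we have \(f(x) = \frac{d}{dx} F(x)\). More generally, \(f\) is the Radon-Nikodym derivative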

\[f = \frac{d X_\ast P}{d\mu}\]

where \(\mu\) is the reference measure (here the Lebesgue measure) on \(\mathbb{R}\). This means that for every Borel set \(A\)

\[X_\ast P(A) = P(X\in A) = \int_{X^{-1}(A)} dP = \int_A f d\mu.\]
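
A quick numerical sanity check of these identities, assuming \(X\) is exponential with rate \(1\), so that \(F(x)=1-e^{-x}\) and \(f(x)=e^{-x}\) for \(x>0\). The midpoint rule in `integrate` and the central difference are illustrative choices, not part of the theory.

```python
import math

# Assumed example: X exponential with rate 1.
def F(x):
    """Distribution function of the exponential law."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def f(x):
    """Density of the exponential law."""
    return math.exp(-x) if x > 0 else 0.0

def integrate(g, a, b, n=10_000):
    """Midpoint Riemann sum of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

a, b = 0.5, 2.0
lhs = F(b) - F(a)            # P(a < X <= b) from the distribution function
rhs = integrate(f, a, b)     # integral of the density over (a, b]

x, h = 1.0, 1e-5
derivative = (F(x + h) - F(x - h)) / (2 * h)   # central difference for F'(x)

print(abs(lhs - rhs) < 1e-6, abs(derivative - f(x)) < 1e-6)
```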

Terminology

Distribution function = cumulative distribution function = CDF
Density function = probability density function = PDF