# Probability theory 2

To keep these posts from getting too long, I’ll split them up whenever there are more theorems, examples, etc. that I want to add. The first new topic is the “law of the unconscious statistician”:

Theorem 1 If ${X}$ is a random variable with ${\mathbb E[|X|]<\infty}$ and ${g:\mathbb R\rightarrow\mathbb R}$ is a Borel-measurable function such that ${\mathbb E[|g(X)|]<\infty}$, then

$\displaystyle \mathbb E[g(X)] = \int_{\mathbb R} g(x)\mathsf dF(x),$

where ${F}$ is the distribution function of ${X}$.

Proof: Define ${Y=g(X)}$, that is, ${Y(\omega)=g(X(\omega))}$ for each ${\omega\in\Omega}$. Then the distribution function of ${Y}$ is

$\displaystyle F_Y(y)=\mathbb P(\{\omega\in\Omega:Y(\omega)\leqslant y\}) = \mathbb P(\{\omega\in\Omega:g(X(\omega))\leqslant y\}) = \mathbb P(X\in A),$

where

$\displaystyle A=\{x\in\mathbb R: g(x)\leqslant y\}.$

Note that ${A=g^{-1}((-\infty,y])}$ is a Borel set because ${g}$ is a Borel-measurable function. For every Borel set ${B\subset\mathbb R}$, define

$\displaystyle \widetilde{\mathbb P}(B) = \int_B \mathsf dF(x).$

Then ${\widetilde{\mathbb P}}$ is a probability measure on ${\mathcal B(\mathbb R)}$, namely the distribution of ${X}$: ${\widetilde{\mathbb P}(B)=\mathbb P(X\in B)}$ for every Borel set ${B}$. In particular,

$\displaystyle \widetilde{\mathbb P}(\{x\in\mathbb R:g(x)\leqslant y\})=F_Y(y).$

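By the change-of-variables formula for the distribution ${\widetilde{\mathbb P}}$ of ${X}$ (verified first for indicator functions of Borel sets, then for simple functions by linearity, for nonnegative functions by monotone convergence, and for integrable ${g}$ by splitting into positive and negative parts), it follows that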

$\displaystyle \mathbb E[g(X)] = \int_{\Omega}g(X)\mathsf d\mathbb P = \int_{\mathbb R}g\;\mathsf d\widetilde{\mathbb P} = \int_{\mathbb R}g(x)\mathsf dF(x).$

$\Box$
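As a quick sanity check of Theorem 1, here is a minimal Python sketch for a discrete example that is not from the post: a fair six-sided die ${X}$ and ${g(x)=(x-3)^2}$, both chosen purely for illustration. For a discrete ${X}$ the integral against ${\mathsf dF}$ is a sum over the atoms of ${X}$. The sketch computes ${\mathbb E[g(X)]}$ twice, once by first building the distribution of ${Y=g(X)}$ and once by summing ${g(x)\mathbb P(X=x)}$ directly. All function names are hypothetical helpers.

```python
from collections import defaultdict
from fractions import Fraction

# Illustrative assumption: X is a fair six-sided die.
pmf_x = {x: Fraction(1, 6) for x in range(1, 7)}


def g(x):
    """Illustrative choice of g; any function on the values of X would do."""
    return (x - 3) ** 2


def pushforward_pmf(pmf, func):
    """Distribution of Y = func(X): P(Y = y) is the total mass of {x : func(x) = y}."""
    pmf_y = defaultdict(Fraction)  # Fraction() is exactly 0
    for x, p in pmf.items():
        pmf_y[func(x)] += p
    return dict(pmf_y)


def expectation_via_y(pmf, func):
    """E[Y], computed from the distribution of Y = func(X)."""
    return sum(y * p for y, p in pushforward_pmf(pmf, func).items())


def expectation_via_x(pmf, func):
    """E[func(X)], computed directly against the distribution of X."""
    return sum(func(x) * p for x, p in pmf.items())


lhs = expectation_via_y(pmf_x, g)
rhs = expectation_via_x(pmf_x, g)
print(lhs, rhs)
```

Both routes return ${19/6}$ here, but the second one never needs the distribution of ${Y}$, which is exactly what the theorem licenses.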

When ${g(x)=x}$, we have the familiar formula

$\displaystyle \mathbb E[X] = \int_{\mathbb R}x\mathsf dF(x).$

When ${X}$ takes countably many values (is discrete), this reduces to

$\displaystyle \mathbb E[X]=\sum_{x\in E}x\,\mathbb P(X=x),$

where ${E}$ is the set of values that ${X}$ takes. When ${F}$ is absolutely continuous (i.e. ${X}$ is a continuous random variable that admits a density), this reduces to

$\displaystyle \mathbb E[X] = \int_{\mathbb R}xf(x)\mathsf dx,$

where ${f}$ is the density of ${X}$, which satisfies ${f(x)=\frac{\mathsf d}{\mathsf dx}F(x)}$ for almost every ${x}$.
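To see the reduction from ${\mathsf dF(x)}$ to ${f(x)\mathsf dx}$ numerically, the following rough sketch assumes ${X}$ is exponential with rate 1, so that ${F(x)=1-e^{-x}}$ and ${f(x)=e^{-x}}$ for ${x>0}$ (an example chosen here, not taken from the post). It approximates ${\int x\,\mathsf dF(x)}$ by a Riemann–Stieltjes sum and ${\int xf(x)\mathsf dx}$ by the midpoint rule; the truncation point 40 and the number of subintervals are arbitrary choices, and the helper names are hypothetical.

```python
import math


def F(x):
    """Distribution function of the assumed example, X ~ Exponential(1)."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0


def f(x):
    """Density of the assumed example, X ~ Exponential(1)."""
    return math.exp(-x) if x > 0 else 0.0


def stieltjes_mean(cdf, lo, hi, n):
    """Riemann-Stieltjes sum for the integral of x dF(x) on [lo, hi], midpoint tags."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        a = lo + i * h
        b = a + h
        total += 0.5 * (a + b) * (cdf(b) - cdf(a))
    return total


def density_mean(pdf, lo, hi, n):
    """Midpoint-rule approximation of the integral of x f(x) dx on [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += x * pdf(x)
    return total * h


mean_dF = stieltjes_mean(F, 0.0, 40.0, 100_000)
mean_fdx = density_mean(f, 0.0, 40.0, 100_000)
print(mean_dF, mean_fdx)
```

Both approximations should land very close to the exact mean ${\mathbb E[X]=1}$ of this distribution, up to discretization and truncation error.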