# Transient and limiting distribution of the M/M/∞ queue

Suppose customers arrive to a queue according to a Poisson process with intensity ${\lambda}$, have independent service times with distribution ${G(t)}$, and there is an infinite number of servers available, so that service begins immediately as a customer enters the system. Assume further that the system is empty at time ${0}$. Let ${X(t)}$ be the number of customers in the system at time ${t}$; what is the distribution of ${X(t)}$?

A customer who enters the system at time ${s\leqslant t}$ will still be in the system at time ${t}$ with probability ${1-G(t-s)}$. Let ${\{N(t):t\in\mathbb R_+\}}$ be the Poisson arrival process with intensity ${\lambda}$, and let ${N_1(t)}$ count the arrivals in ${[0,t]}$ that are still in service at time ${t}$; that is, each arrival at time ${s}$ is retained independently with probability ${1-G(t-s)}$. By the thinning property of the Poisson process, ${N_1(t)}$ is Poisson distributed with mean ${\lambda\int_0^t(1-G(t-s))\ \mathsf ds}$. Since ${N_1(t)=X(t)}$, for each nonnegative integer ${k}$

\displaystyle \begin{aligned} \mathbb P_0(X(t) = k) &= \mathbb P_0(N_1(t) = k)\\ &= \exp\left( -\lambda \int_0^t (1-G(t-s))\ \mathsf ds\right) \frac{\left(\lambda\int_0^t (1-G(t-s))\ \mathsf ds \right)^k}{k!}\\ &= \exp\left( -\lambda \int_0^t (1-G(s))\ \mathsf ds\right) \frac{\left(\lambda\int_0^t (1-G(s))\ \mathsf ds \right)^k}{k!}. \end{aligned}

In the case where ${G(t) = 1-e^{-\mu t}}$, that is, the service times are exponentially distributed with rate ${\mu}$, we have

$\displaystyle \int_0^t (1-G(s))\ \mathsf ds = \int_0^t e^{-\mu s}\ \mathsf ds = \frac1\mu \left(1 - e^{-\mu t}\right),$

and hence

$\displaystyle \mathbb P_0(X(t) = k) = \exp\left(-\frac\lambda\mu\left(1-e^{-\mu t}\right)\right) \frac{\left(\frac\lambda\mu\left(1-e^{-\mu t}\right)\right)^k}{k!}.$

The expected number of customers at time ${t}$ is then

\displaystyle \begin{aligned} \mathbb E_0[X(t)] &= \sum_{k=1}^\infty k\cdot \mathbb P_0(X(t)=k)\\ &= \sum_{k=1}^\infty k\cdot \exp\left(-\frac\lambda\mu\left(1-e^{-\mu t}\right)\right) \frac{\left(\frac\lambda\mu\left(1-e^{-\mu t}\right)\right)^k}{k!}\\ &= \frac\lambda\mu\left(1-e^{-\mu t}\right)\exp\left(-\frac\lambda\mu\left(1-e^{-\mu t}\right)\right)\sum_{k=0}^\infty \frac{\left(\frac\lambda\mu\left(1-e^{-\mu t}\right)\right)^k}{k!}\\ &= \frac\lambda\mu\left(1-e^{-\mu t}\right)\exp\left(-\frac\lambda\mu\left(1-e^{-\mu t}\right)\right)\exp\left(\frac\lambda\mu\left(1-e^{-\mu t}\right)\right)\\ &= \frac\lambda\mu\left(1-e^{-\mu t}\right). \end{aligned}

Moreover, we observe that

$\displaystyle \lim_{t\rightarrow\infty} \mathbb P_0(X(t)=k) = e^{-\frac\lambda\mu}\frac{\left(\frac\lambda\mu\right)^k}{k!},$

which is the same as the stationary distribution ${\pi}$ of ${\{X(t)\}}$ derived from the generalized global balance equations

$\displaystyle \lambda \pi_k = (k+1)\mu \pi_{k+1},\quad k=0,1,2,\ldots$

(normalized by the condition ${\sum_{k=0}^\infty \pi_k=1}$).
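As a sanity check on the transient law, here is a minimal simulation sketch in plain Python (the values ${\lambda=2}$, ${\mu=1}$, ${t=1.5}$ and the sample size are my illustrative choices): arrivals are generated as a Poisson process on ${[0,t]}$, each arrival at time ${s}$ is kept with probability ${e^{-\mu(t-s)}}$, and the empirical mean of ${X(t)}$ is compared with ${\frac\lambda\mu\left(1-e^{-\mu t}\right)}$.

```python
import math
import random

def sample_X_t(lam, mu, t, rng):
    """One draw of X(t) for the M/M/inf queue started empty.

    Arrival times are generated as a rate-lam Poisson process on [0, t];
    a customer arriving at time s is still in service at time t with
    probability 1 - G(t - s) = exp(-mu * (t - s)).
    """
    s, in_system = 0.0, 0
    while True:
        s += rng.expovariate(lam)  # next arrival time
        if s > t:
            return in_system
        if rng.random() < math.exp(-mu * (t - s)):
            in_system += 1

rng = random.Random(1)
lam, mu, t, n_runs = 2.0, 1.0, 1.5, 20000
empirical = sum(sample_X_t(lam, mu, t, rng) for _ in range(n_runs)) / n_runs
theory = (lam / mu) * (1.0 - math.exp(-mu * t))
print(empirical, theory)  # the two means should closely agree
```

With 20000 runs the standard error of the empirical mean is about ${0.01}$, so the agreement with the formula is well within Monte Carlo noise.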

# Generating function of a stochastic matrix

This is an example demonstrating my answer to this math.stackexchange post: Let ${\{X_n:n\in\mathbb N_0\}}$ be a Markov chain on ${\{0,1\}}$ with transition matrix

$\displaystyle P = \begin{pmatrix}1-\alpha&\alpha\\\beta&1-\beta\end{pmatrix}$

where ${\alpha,\beta\in(0,1)}$. Then ${P}$ is irreducible and aperiodic, so there exists a unique stationary distribution ${\pi}$ satisfying

$\displaystyle \lim_{n\rightarrow\infty}(P^n)_{i\cdot} = \pi,\quad i=0,1$

where ${(P^n)_{i\cdot} }$ denotes the ${i^{\mathrm{th}}}$ row of ${P^n}$. From the global balance equation ${\alpha\pi_0 = \beta\pi_1}$ we see that ${\pi_1 = \frac\alpha\beta\pi_0}$, and normalization yields

$\displaystyle \pi = \left(\frac\beta{\alpha+\beta},\frac\alpha{\alpha+\beta} \right).$

To actually compute ${P^n}$, we could write ${P^n=AD^nA^{-1}}$, where the columns of ${A}$ are linearly independent eigenvectors of ${P}$ and ${D}$ is a diagonal matrix whose entries are the corresponding eigenvalues. This is possible because the eigenvalues of ${P}$ are ${1}$ and ${1-\alpha-\beta\in(-1,1)}$, which are distinct, so ${P}$ is diagonalizable. Instead, we will compute the generating function of ${P}$. Let

$\displaystyle P(s) = \sum_{n=0}^\infty P^n s^n,$

then

\displaystyle \begin{aligned} P(s) &= (I-sP)^{-1}\\ &= \frac{1}{\det (I-sP)}\,\mathrm{adj}(I-sP)\\ &= \frac1{(1-s)(1-(1-\alpha-\beta)s)}\begin{pmatrix}1-(1-\beta)s&\alpha s\\\beta s& 1-(1-\alpha)s\end{pmatrix}\\ &= \begin{pmatrix}P_{00}(s) & P_{01}(s)\\ P_{10}(s) & P_{11}(s) \end{pmatrix}, \end{aligned}

where

$\displaystyle P_{ij}(s):=\sum_{n=0}^\infty \mathbb P(X_n=j\mid X_0=i)s^n.$

Partial fraction decomposition yields

\displaystyle \begin{aligned} &\quad\frac1{(1-s)(1-(1-\alpha-\beta)s)}\\ &= (\alpha+\beta)^{-1}\frac1{1-s} + \left(1-(\alpha+\beta)^{-1}\right)\frac1{1-(1-\alpha-\beta)s}, \end{aligned}

and so we have

\displaystyle \begin{aligned} P_{00}(s) &= \left(\frac{1-(1-\beta)s}{\alpha+\beta}\right)\left(\frac1{1-s} - \frac{1-\alpha-\beta}{1-(1-\alpha-\beta)s}\right)\\ &= \left(\frac{1-(1-\beta)s}{\alpha+\beta}\right) \left( \sum_{n=0}^\infty s^n - \sum_{n=0}^\infty (1-\alpha-\beta)^{n+1}s^n\right)\\ &= \sum_{n=0}^\infty \left(\frac\beta{\alpha+\beta}+\frac{\alpha(1-\alpha-\beta)^n}{\alpha+\beta}\right) s^n. \end{aligned}

Similar computations yield

\displaystyle \begin{aligned} P_{01}(s) &= \sum_{n=0}^\infty \left(\frac\alpha{\alpha+\beta}-\frac{\alpha(1-\alpha-\beta)^n}{\alpha+\beta}\right) s^n,\\\\ P_{10}(s) &= \sum_{n=0}^\infty \left(\frac\beta{\alpha+\beta}-\frac{\beta(1-\alpha-\beta)^n}{\alpha+\beta}\right) s^n,\\\\ P_{11}(s) &= \sum_{n=0}^\infty \left(\frac\alpha{\alpha+\beta}+\frac{\beta(1-\alpha-\beta)^n}{\alpha+\beta}\right) s^n, \end{aligned}

and it follows that, reading off the coefficient of ${s^n}$,

$\displaystyle P^n = \begin{pmatrix} \pi_0+\pi_1(1-\alpha-\beta)^n & \pi_1 (1-(1-\alpha-\beta)^n) \\ \pi_0(1-(1-\alpha-\beta)^n) & \pi_1+\pi_0(1-\alpha-\beta)^n\end{pmatrix},\quad n=0,1,2,\ldots$
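The coefficient formulas read off from the ${P_{ij}(s)}$ can be checked against brute-force matrix powers; here is a quick sketch in plain Python (the values ${\alpha=0.3}$, ${\beta=0.6}$ are illustrative choices of mine).

```python
# Check the entries of P^n against
#   P^n = [[pi0 + pi1*a^n, pi1*(1 - a^n)],
#          [pi0*(1 - a^n), pi1 + pi0*a^n]],  a = 1 - alpha - beta,
# which collects the coefficients of s^n in the P_{ij}(s).
alpha, beta = 0.3, 0.6
P = [[1 - alpha, alpha], [beta, 1 - beta]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

pi0, pi1 = beta / (alpha + beta), alpha / (alpha + beta)
a = 1 - alpha - beta

Pn = [[1.0, 0.0], [0.0, 1.0]]  # P^0 = I
for n in range(1, 8):
    Pn = matmul(Pn, P)
    closed = [[pi0 + pi1 * a**n, pi1 * (1 - a**n)],
              [pi0 * (1 - a**n), pi1 + pi0 * a**n]]
    for i in range(2):
        for j in range(2):
            assert abs(Pn[i][j] - closed[i][j]) < 1e-12
print("closed form matches matrix powers")
```

Note also that as ${n\rightarrow\infty}$ the ${a^n}$ terms vanish and each row of the closed form tends to ${\pi}$, as expected.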

# Discrete-time queue (Geo/Geo/1)

Queueing theory is generally introduced in the continuous-time setting, with the fundamental ${M/M/1}$ queue for which many analytical results may be obtained with relatively straightforward computations, due largely to the memoryless property of the exponential distribution. In the next few posts I will analyze the discrete-time analog of this model; namely, a single-server queue with *geometric* interarrival and service times.

Suppose customers arrive to a queue according to a Bernoulli process; that is, at each time ${n=1,2,\ldots}$, a customer arrives with probability ${p_\lambda}$, independent of past or future arrivals. The probability that a customer finishes service at time ${n+1}$, conditioned on that customer having been in service at time ${n}$, is ${p_\mu}$. Let ${\{A_m : m=1,2,\ldots\}}$ be the arrival process and ${\{D_m : m=1,2,\ldots\}}$ the departure process. The interarrival times are ${T_m:=A_m-A_{m-1}}$, with ${A_0\equiv 0}$. Conditioned on ${A_{m-1}=i}$, we have ${\mathbb P(A_m=i+j)=\mathbb P(T_m=j)=(1-p_\lambda)^{j-1}p_{\lambda}}$, and so by independence of arrivals, the ${T_m}$ are i.i.d. ${\mathsf{Geo}(p_\lambda)}$ random variables. A similar argument shows that the service times are i.i.d. ${\mathsf{Geo}(p_\mu)}$ random variables.

Let ${\{Z_n:n=0,1,2,\ldots\}}$ be the number of customers in the system at time ${n}$, after arrivals and before departures. It follows from the memoryless property of the geometric distribution that ${\{Z_n\}}$ is a Markov chain; that is, for any nonnegative integers ${n}$, ${i}$, and ${j}$ and any ${E\in\sigma\left(Z_0,\ldots,Z_{n-1}\right)}$, we have

$\displaystyle \mathbb P(Z_{n+1} = j\mid Z_n=i, E) = \mathbb P(Z_{n+1}=j\mid Z_n=i).$

At each time ${n}$, at most one customer can arrive and at most one customer can depart. It follows that ${Z_n}$ is a birth-death chain, with transition probabilities given by

$\displaystyle p(i,j) = \begin{cases} 1-p_\lambda,& i=j=0\\ p_\lambda,& i=0,\ j=1\\ p_\mu(1-p_\lambda),& i\geqslant1,\ j=i-1\\ p_\lambda(1-p_\mu),& i\geqslant 1,\ j=i+1\\ p_\lambda p_\mu + (1-p_\lambda)(1-p_\mu),& i=j>0. \end{cases}$

Suppose ${\nu}$ is an invariant measure for the transition matrix ${P}$, that is ${\nu = \nu P}$. Assume WLOG that ${\nu_0=1}$, then ${\nu_1 = \frac{p_\lambda}{p_\mu(1-p_\lambda)}}$ and

$\displaystyle \nu_n = \left(\frac{p_\lambda(1-p_\mu)}{p_\mu(1-p_\lambda)}\right)\nu_{n-1},\ n\geqslant2.$

By inspection, this recurrence has solution

$\displaystyle \nu_n=\left(\frac{p_\lambda(1-p_\mu)}{p_\mu(1-p_\lambda)}\right)^{n-1}\frac{p_\lambda}{p_\mu(1-p_\lambda)},\ n\geqslant1.$

Now, ${\{Z_n\}}$ is positive recurrent iff ${\sum_{n=0}^\infty \nu_n<\infty}$, which is true precisely when

$\displaystyle p_\lambda(1-p_\mu) < p_\mu(1-p_\lambda).$

The offered load to the system is then

$\displaystyle \rho := \frac{p_\lambda(1-p_\mu)}{p_\mu(1-p_\lambda)}.$

Assuming ${\rho<1}$, let ${C=\sum_{n=0}^\infty \nu_n}$ and define ${\pi=C^{-1}\nu}$. By construction, ${\pi = \pi P}$ and ${\sum_{n=0}^\infty \pi_n=1}$, so that ${\pi}$ is the (unique) stationary distribution of ${\{Z_n\}}$. A straightforward computation yields

\displaystyle \begin{aligned} \pi_0 &= \frac{p_\mu-p_\lambda}{p_\mu}\\ \pi_n &= \rho^{n-1}\left(\frac{p_\lambda(1-\rho)}{p_\mu}\right),\ n\geqslant1. \end{aligned}

The limiting mean number of customers in the system ${L}$ is given by

\displaystyle \begin{aligned} L &= \sum_{n=1}^\infty n\pi_n\\ &= \left(\frac{p_\lambda(1-\rho)}{p_\mu}\right)\sum_{n=1}^\infty n\rho^{n-1}\\ &= \frac{p_\lambda}{p_\mu(1-\rho)}. \end{aligned}

Invoking Little’s Law, we compute the mean sojourn time of customers by

$\displaystyle W = \frac L{p_\lambda} = \frac1{p_\mu(1-\rho)}.$
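The stationary distribution and the formulas for ${L}$ and ${W}$ are easy to spot-check numerically. Here is a sketch in plain Python (the values ${p_\lambda=0.3}$, ${p_\mu=0.5}$ and the truncation level are illustrative choices of mine): it verifies ${\pi=\pi P}$ componentwise on a truncated state space and recomputes ${L}$ and ${W}$.

```python
p_lam, p_mu = 0.3, 0.5
rho = p_lam * (1 - p_mu) / (p_mu * (1 - p_lam))
assert rho < 1  # stability condition

N = 400  # truncation level; pi_n decays geometrically, so the tail is negligible

# Stationary distribution from the closed form.
pi = [(p_mu - p_lam) / p_mu]
pi += [rho ** (n - 1) * p_lam * (1 - rho) / p_mu for n in range(1, N)]
assert abs(sum(pi) - 1.0) < 1e-12

def p(i, j):
    """Transition probabilities of the birth-death chain Z_n."""
    if i == 0:
        return {0: 1 - p_lam, 1: p_lam}.get(j, 0.0)
    if j == i - 1:
        return p_mu * (1 - p_lam)
    if j == i + 1:
        return p_lam * (1 - p_mu)
    if j == i:
        return p_lam * p_mu + (1 - p_lam) * (1 - p_mu)
    return 0.0

# Check pi = pi P componentwise (away from the truncation boundary).
for j in range(N - 1):
    flow = sum(pi[i] * p(i, j) for i in range(max(0, j - 1), j + 2))
    assert abs(flow - pi[j]) < 1e-12

L = sum(n * pi[n] for n in range(N))
W = L / p_lam
assert abs(L - p_lam / (p_mu * (1 - rho))) < 1e-9
print(L, W)
```

For these parameters ${\rho=3/7}$, so the formulas give ${L=1.05}$ and ${W=3.5}$, which the truncated computation reproduces.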

# Brownian motion is differentiable nowhere

The elementary continuous functions we work with in calculus tend to be relatively well-behaved, in the sense that they are differentiable everywhere except at some isolated set of points. The prototypical example is the map ${x\rightarrow|x|}$, which fails to be differentiable at zero: the difference quotient satisfies ${\frac{|x|-|0|}{x-0} = \frac{|x|}x}$ for ${x\ne 0}$, so ${\lim_{x\rightarrow 0}\frac{|x|}x}$ does not exist. More generally, let ${x_0<x_1<\cdots<x_n}$ and ${y_0,\ldots,y_n\in\mathbb R}$ with

$\displaystyle \frac{y_{i+2}-y_{i+1}}{x_{i+2}-x_{i+1}} \ne \frac{y_{i+1}-y_i}{x_{i+1}-x_i},\quad 0\leqslant i\leqslant n-2$

– that is, the slope of the line segment joining ${(x_i,y_i)}$ and ${(x_{i+1}, y_{i+1})}$ differs from the slope of the line segment joining ${(x_{i+1}, y_{i+1})}$ and ${(x_{i+2},y_{i+2})}$. Define ${f:[x_0,x_n]\rightarrow\mathbb R}$ such that ${f(x_i)=y_i}$ and the graph of ${f}$ is

$\displaystyle \bigcup_{i=0}^{n-1} \left\{\lambda(x_i,f(x_i))+(1-\lambda)(x_{i+1},f(x_{i+1})):\lambda\in[0,1] \right\}.$

That is, ${f}$ is a piecewise linear function connecting the points ${(x_i,y_i)}$ – like the functions ${B^{(n)}}$ used in the previous post. Then ${f}$ is differentiable on ${(x_0,x_n)\setminus\{x_1,\ldots,x_{n-1}\}}$. Moreover, if ${\{x_n\}}$ is a strictly increasing sequence with ${L:=\sum_{n=0}^{\infty}(x_{n+1}-x_n)<\infty}$ satisfying the aforementioned slope assumption, then defining ${f_n}$ as above, we see that ${\lim_{n\rightarrow\infty}f_n}$ is continuous on ${(x_0,x_0+L)}$ and differentiable at all but countably many points.

The paths of Brownian motion are even more pathological – they are with probability ${1}$ differentiable nowhere! Let ${W=\{W_t:t\in\mathbb R_+\}}$ be a standard Brownian motion and

$\displaystyle D = \{\omega:\exists t\in(0,1) \textrm{ s.t. } W \textrm{ differentiable at } t\}.$

We will show that ${\mathbb P(D)=0}$. For each integer ${n\geqslant2}$ and ${1\leqslant k\leqslant n-2}$, set

$\displaystyle M(k,n) = \max\left\{|W_{kn^{-1}}-W_{(k-1)n^{-1}}|,|W_{(k+1)n^{-1}}-W_{kn^{-1}}|,|W_{(k+2)n^{-1}}-W_{(k+1)n^{-1}}| \right\}$

and

$\displaystyle M_n = \min\{M(k,n): 1\leqslant k\leqslant n-2\}.$

Suppose ${\omega_0\in D}$, then since

$\displaystyle \lim_{s\rightarrow t}\frac{W(s,\omega_0)-W(t,\omega_0)}{s-t}$

exists, we may choose ${\delta>0}$ and ${C>0}$ such that

$\displaystyle |W(t,\omega_0)-W(s,\omega_0)|< C|t-s|\quad\text{whenever }|s-t|<\delta.$

Let ${N}$ be a positive integer with ${N > \frac3\delta}$, and put

$\displaystyle k = \max\left\{j\in\mathbb N: jN^{-1}\leqslant t\right\}.$

Then

$\displaystyle \max\left\{\left|jN^{-1}-t\right| : j\in\{k-1,k,k+1,k+2\} \right\}<\frac3N<\delta,$

so that

\displaystyle \begin{aligned} \left|W_{kN^{-1}}-W_{(k-1)N^{-1}} \right| &\leqslant \left| W_{kN^{-1}}-W_t\right|+ \left| W_{(k-1)N^{-1}}-W_t\right|\\ &\leqslant C\left|kN^{-1}-t\right| + C\left|(k-1)N^{-1}-t\right|\\ &\leqslant C(N^{-1} + 2N^{-1})\\ &= 3CN^{-1}, \end{aligned}

and similarly

$\displaystyle \left|W_{(k+1)N^{-1}}-W_{kN^{-1}}\right|<3CN^{-1},\quad \left|W_{(k+2)N^{-1}}-W_{(k+1)N^{-1}}\right|<3CN^{-1}.$

It follows that

$\displaystyle M_N(\omega_0)\leqslant M(k,N)(\omega_0)<3CN^{-1}.$

Now, the increments ${W_{(j+1)N^{-1}}-W_{jN^{-1}}}$ are i.i.d. with ${\mathcal N\left(0,N^{-1}\right)}$ distribution, so

$\displaystyle W_{(j+1)N^{-1}}-W_{jN^{-1}}\stackrel d= N^{-\frac12}W_1,$

from which we have

$\displaystyle \mathbb P(M(k,N) \leqslant 3CN^{-1}) = \mathbb P\left(|W_1|\leqslant 3CN^{-\frac12}\right)^3.$

It follows that

\displaystyle \begin{aligned} \mathbb P(M_N\leqslant 3CN^{-1}) &= \mathbb P\left(\bigcup_{j=1}^{N-2} \left\{M(j,N)\leqslant 3CN^{-1}\right\}\right)\\ &\leqslant \sum_{j=1}^{N-2} \mathbb P\left(M(j,N)\leqslant 3CN^{-1}\right)\\ &\leqslant N\mathbb P\left(M(k,N)\leqslant 3CN^{-1} \right)\\ &= N\left(\mathbb P\left(|W_1|\leqslant 3CN^{-\frac12} \right)\right)^3\\ &\leqslant N\left(6CN^{-\frac12}\right)^3\\ &= (6C)^3N^{-\frac12}\stackrel{N\rightarrow\infty}\longrightarrow0. \end{aligned}

Let ${E_n = \{\omega: M_n(\omega) \leqslant 3Cn^{-1}\}}$, then ${\omega_0\in\liminf_{n\rightarrow\infty} E_n}$ so ${D\subset\liminf_{n\rightarrow\infty} E_n}$. Fatou’s lemma then yields

$\displaystyle \mathbb P(D)\leqslant \mathbb P\left(\liminf_{n\rightarrow\infty} E_n\right) \leqslant\liminf_{n\rightarrow\infty} \mathbb P(E_n)=\lim_{n\rightarrow\infty}\mathbb P(E_n)=0,$

from which we conclude. (Strictly speaking, the constant ${C}$ depends on ${\omega_0}$; writing ${D_C}$ for the set of ${\omega\in D}$ admitting the bound with constant ${C}$, the argument shows ${\mathbb P(D_C)=0}$ for each positive integer ${C}$, and ${\mathbb P(D)\leqslant\sum_{C=1}^\infty \mathbb P(D_C)=0}$.)
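The mechanism behind the proof can be illustrated numerically: along simulated paths, difference quotients over a window of width ${h}$ grow like ${h^{-1/2}}$ instead of settling toward a derivative. Here is a hedged sketch in plain Python (grid size, window widths, and sample size are arbitrary choices of mine).

```python
import math
import random

rng = random.Random(0)

def brownian_grid(n_steps):
    """W at times j/n_steps, via cumulative sums of N(0, 1/n_steps) increments."""
    w, path = 0.0, [0.0]
    sd = math.sqrt(1.0 / n_steps)
    for _ in range(n_steps):
        w += rng.gauss(0.0, sd)
        path.append(w)
    return path

n_steps, n_paths = 4096, 400
coarse, fine = 0.0, 0.0
for _ in range(n_paths):
    path = brownian_grid(n_steps)
    # |W(h) - W(0)| / h for a coarse and a fine window h
    coarse += abs(path[n_steps // 16]) / (1.0 / 16)      # h = 1/16
    fine += abs(path[n_steps // 1024]) / (1.0 / 1024)    # h = 1/1024
coarse /= n_paths
fine /= n_paths
# Since E|W(h)|/h = sqrt(2/pi) * h^{-1/2}, shrinking h by a factor of 64
# should grow the average quotient by about a factor of 8.
print(coarse, fine)
```

A derivative at ${0}$ would force these quotients to stabilize as ${h\rightarrow0}$; instead the averages blow up at the ${h^{-1/2}}$ rate.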

# Construction of Brownian Motion

A stochastic process ${X=\{X_t:t\in\mathbb R_+\}}$ is said to be standard Brownian motion (or a Wiener process) if

1. ${X_0=0}$ almost surely.
2. ${X}$ has independent increments, i.e. for ${0\leqslant t_0<t_1<\cdots<t_k}$, the random variables

$\displaystyle \left\{X_{t_{i+1}}-X_{t_i} : 0\leqslant i\leqslant k-1 \right\}$

are independent.

3. For ${0\leqslant s<t}$,

$\displaystyle X_t-X_s\sim\mathcal N(0,t-s).$

4. The map ${t\rightarrow X_t(\omega)}$ is almost surely continuous.

In this post I follow the method in Sidney Resnick’s Adventures in Stochastic Processes to construct Brownian motion on ${[0,1]}$ using i.i.d. standard normal random variables. The key to this construction is the following lemma:

Lemma 1 Suppose ${X(s)}$ and ${X(t)}$ are random variables such that

$\displaystyle X(t)-X(s)\sim\mathcal N(0,t-s).$

Then there exists a random variable ${X\left(\frac{t+s}2\right)}$ such that

$\displaystyle X\left(\frac{t+s}2\right) - X(s),\ X(t) - X\left(\frac{t+s}2\right)\stackrel{\mathrm{i.i.d.}}\sim\mathcal N\left(0,\frac{t-s}2\right).$

Proof: Define ${U:=X(t)-X(s)}$ and suppose ${V\sim\mathcal N(0,t-s)}$ is independent of ${U}$. Define ${X\left(\frac{t+s}2\right)}$ by

\displaystyle \begin{aligned} X(t) - X\left(\frac{t+s}2\right) &= \frac{U+V}2\\ X\left(\frac{t+s}2\right) - X(s) &= \frac{U-V}2, \end{aligned}

so that

\displaystyle \begin{aligned} X(t) - X(s) &= U\\ X(t) + X(s) - 2X\left(\frac{t+s}2\right) &= V. \end{aligned}

Then as ${U,V\stackrel{\mathrm{i.i.d.}}\sim\mathcal N(0,t-s)}$, we have

$\displaystyle \mathbb E[(U+V)(U-V)] = \mathbb E[U^2] - \mathbb E[V^2] = 0$

so that ${X(t)-X\left(\frac{t+s}2\right)}$ and ${X\left(\frac{t+s}2\right)-X(s)}$ are uncorrelated and hence independent (as they are jointly normally distributed).

$\Box$

Now let

$\displaystyle \left\{V\left((2k+1)2^{-(n+1)}\right): 0\leqslant k\leqslant 2^n-1,\ n\geqslant 0\right\}\cup\{V(1)\}$

be independent random variables with ${V(1)\sim\mathcal N(0,1)}$ and

$\displaystyle V\left((2k+1)2^{-(n+1)}\right)\sim\mathcal N(0,2^{-n}).$

Let ${X(0):=0}$, ${X(1):=V(1)}$, and define ${X(2^{-1})}$ using ${V(2^{-1})}$ so that ${X(2^{-1})-X(0)}$ and ${X(1)-X(2^{-1})}$ are i.i.d. ${\mathcal N(0,2^{-1})}$ random variables. Suppose

$\displaystyle \{ X(k2^{-n}): 0\leqslant k\leqslant 2^n\}$

are defined such that

$\displaystyle X(k2^{-n})-X((k-1)2^{-n})\stackrel{\mathrm{i.i.d.}}\sim\mathcal N(0,2^{-n}),\ 1\leqslant k\leqslant 2^n.$

For each ${0\leqslant k\leqslant 2^n-1}$, define ${X\left((2k+1)2^{-(n+1)}\right)}$ using ${V\left((2k+1)2^{-(n+1)}\right)}$ as in Lemma 1, such that

\displaystyle \begin{aligned} X\left((2k+1)2^{-(n+1)}\right)-X(k2^{-n}) &\stackrel d= X((k+1)2^{-n}) - X\left((2k+1)2^{-(n+1)} \right)\\ &\sim\mathcal N\left(0, 2^{-(n+1)}\right) \end{aligned}

and the sequence ${\left\{X\left(k2^{-(n+1)}\right):0\leqslant k\leqslant 2^{n+1}\right\}}$ has independent increments. For each ${n=1,2,\ldots}$ and ${0\leqslant k\leqslant 2^n-1}$ define the processes

\displaystyle \begin{aligned} B^{(n,k)}(t,\omega) &= \Big(2^n\left(X((k+1)2^{-n},\omega)-X(k2^{-n},\omega)\right)t\\ &\qquad +(k+1)\,X(k2^{-n},\omega) - k\,X((k+1)2^{-n},\omega)\Big)\mathbf 1_{\left[k2^{-n},(k+1)2^{-n}\right)}(t)

and

$\displaystyle B^{(n)}(t) = \sum_{k=0}^{2^n-1} B^{(n,k)}(t),\qquad B^{(n)}(1):=X(1).$

That is,

$\displaystyle B^{(n)}(t,\omega) = X(t,\omega),\quad t\in\left\{k2^{-n}: 0\leqslant k\leqslant 2^{n}\right\}$

and ${B^{(n)}}$ is linear on each interval ${\left[k2^{-n},(k+1)2^{-n}\right]}$. Let ${\Delta^{(n)}}$ be the maximum deviation between ${B^{(n)}}$ and ${B^{(n+1)}}$ on ${[0,1]}$, computed over the partition into intervals ${\left[k2^{-n},(k+1)2^{-n}\right]}$, that is,

$\displaystyle \Delta^{(n)}(\omega) = \max_{0\leqslant k\leqslant 2^n-1}\;\max_{k2^{-n}\leqslant t\leqslant (k+1)2^{-n}}\left|B^{(n+1)}(t,\omega)-B^{(n)}(t,\omega)\right|.$

For each ${n}$ we have

\displaystyle \begin{aligned} &\quad\max_{k2^{-n}\leqslant t\leqslant (k+1)2^{-n}}\left|B^{(n+1)}(t,\omega)-B^{(n)}(t,\omega)\right|\\ &= \left|\frac{B^{(n)}((k+1)2^{-n}) + B^{(n)}(k2^{-n}) }2 - B^{(n+1)}\left((2k+1)2^{-(n+1)}\right) \right|\\ &=\left|\frac{X((k+1)2^{-n}) + X(k2^{-n}) }2 - X\left((2k+1)2^{-(n+1)}\right) \right|\\ &=\frac12|V\left((2k+1)2^{-(n+1)}\right)|, \end{aligned}

where ${V\left((2k+1)2^{-(n+1)}\right)\sim\mathcal N(0,2^{-n})}$, and so

$\displaystyle \Delta^{(n)}(\omega) = \frac12\max_{0\leqslant k\leqslant 2^n-1}\left|V\left((2k+1)2^{-(n+1)}\right)\right|.$

For ${x\geqslant1}$ we compute

\displaystyle \begin{aligned} \mathbb P\left(\Delta^{(n)}> \frac{x/2}{2^{n/2}}\right) &= \mathbb P\left(\frac12\max_{0\leqslant k\leqslant 2^n-1}\left|V\left((2k+1)2^{-(n+1)}\right)\right| > \frac12\frac x{2^{n/2}}\right)\\ &= \mathbb P\left(\bigcup_{k=0}^{2^n-1}\left\{\left|V\left((2k+1)2^{-(n+1)}\right)\right|>2^{-\frac n2}x\right\}\right) \\ &\leqslant 2^n \mathbb P(|V(2^{-n})|>2^{-\frac n2}x)\\ &= 2^{n+1}\mathbb P(V(2^{-n})/2^{-\frac n2}>x)\\ &= 2^{n+1}\mathbb P(Z > x)\\ &\leqslant 2^{n+1}\left(\frac{\phi(x)}x\right)\\ &=\frac{2^{n+1}e^{-\frac12 x^2}}{\sqrt{2\pi}x}, \end{aligned}

where ${Z\sim\mathcal N(0,1)}$ and ${\phi}$ is the probability density of ${Z}$. Let ${x=2n^{\frac12}}$, then we have

\displaystyle \begin{aligned} \mathbb P\left(\Delta^{(n)}> n^{\frac12}2^{-\frac n2}\right) &\leqslant \frac{2^{n+1}e^{-\frac12\left(2n^{\frac12}\right)^2}}{\sqrt{2\pi}\left(2n^{\frac12}\right)}\\ &= \left(\frac1{\sqrt{2\pi n}\,e^n} \right)\left(\frac 2e \right)^n\\ &\leqslant \left(\frac 2e \right)^n. \end{aligned}

Since

$\displaystyle \sum_{n=0}^\infty \left(\frac 2e \right)^n<\infty,$

from the first Borel-Cantelli lemma we have

$\displaystyle \mathbb P\left(\limsup_{n\rightarrow\infty} \left\{\Delta^{(n)}> n^{\frac12}2^{-\frac n2}\right\}\right) = 0,$

from which

$\displaystyle \mathbb P\left(\sum_{n=1}^\infty \Delta^{(n)}<\infty \right)=1.$

This implies that, with probability ${1}$, the sequence of functions ${\{B^{(n)}\}}$ in ${C[0,1]}$ is Cauchy, as for ${n<m}$,

\displaystyle \begin{aligned} \|B^{(n)}-B^{(m)}\|_\infty &= \sup_{t\in[0,1]} \left|B^{(n)}(t)-B^{(m)}(t)\right|\\ &\leqslant \sum_{j=n}^{m-1}\Delta^{(j)}\\&\stackrel{n,m\rightarrow\infty}\longrightarrow0. \end{aligned}

Since ${C[0,1]}$ is complete, we conclude that ${B:=\lim_{n\rightarrow\infty}B^{(n)}}$ exists with probability ${1}$ and, by construction, is Brownian motion on ${[0,1]}$.
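The construction translates directly into code. Here is a sketch in plain Python (the number of refinement levels and the Monte Carlo check at ${t=\frac12}$ are choices of mine): by Lemma 1, each midpoint value can be taken as ${X\left(\frac{a+b}2\right)=\frac{X(a)+X(b)}2-\frac V2}$ with ${V\sim\mathcal N(0,b-a)}$, which makes the two half-increments i.i.d. ${\mathcal N\left(0,\frac{b-a}2\right)}$.

```python
import math
import random

def brownian_dyadic(n_levels, rng):
    """Values of X at the dyadic points k/2^n, k = 0..2^n, n = n_levels.

    Each refinement level fills in midpoints via Lemma 1:
      X((a+b)/2) = (X(a) + X(b))/2 - V/2,  V ~ N(0, b - a).
    """
    X = [0.0, rng.gauss(0.0, 1.0)]   # X(0) = 0, X(1) ~ N(0,1)
    for n in range(n_levels):
        length = 2.0 ** (-n)         # current interval length b - a
        refined = []
        for k in range(len(X) - 1):
            v = rng.gauss(0.0, math.sqrt(length))
            refined.append(X[k])
            refined.append((X[k] + X[k + 1]) / 2.0 - v / 2.0)
        refined.append(X[-1])
        X = refined
    return X

rng = random.Random(7)
# Monte Carlo check: X(1/2) should be N(0, 1/2).
samples = [brownian_dyadic(5, rng)[2 ** 4] for _ in range(4000)]
mean = sum(samples) / len(samples)
var = sum(x * x for x in samples) / len(samples) - mean * mean
print(mean, var)  # approximately 0 and 0.5
```

Linear interpolation between the dyadic values then gives the ${B^{(n)}}$ of the construction; the refinement never moves the already-constructed points, mirroring the fact that ${B^{(n)}}$ and ${B^{(n+1)}}$ agree at level-${n}$ dyadics.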

# Discrete “M/M/1” queue

Consider a queueing system with one server, interarrival distribution ${\mathsf{Geo}\left(\lambda\right)}$, and service distribution ${\mathsf{Geo}\left(\mu\right)}$, where ${0<\lambda<\mu}$. This is the discrete analog of the ${M/M/1}$ queue. What can we say about its limiting behavior?

Let ${X_n}$ be the number of customers in the system at time ${n}$, then ${X=\{X_n:n=0,1,2,\ldots\}}$ is a Markov chain with transition probabilities

\displaystyle \begin{aligned} P_{0,1} &= 1\\ P_{n,n-1} &= \frac{\mu(1-\lambda)}{\lambda+\mu-\lambda\mu},\ n\geqslant1\\ P_{n,n} &= \frac{\lambda\mu}{\lambda+\mu-\lambda\mu},\ n\geqslant1\\ P_{n,n+1} &= \frac{\lambda(1-\mu)}{\lambda+\mu-\lambda\mu},\ n\geqslant1. \end{aligned}

Suppose ${\nu}$ is an invariant measure for ${P}$, that is, ${\nu P=\nu}$. A stationary distribution ${\pi}$ exists for ${X}$ iff ${V:=\sum_{n=0}^\infty \nu_n<\infty}$, in which case ${\pi=\nu/V}$. Assume WLOG that ${\nu_0=1}$, then the global balance equations are

\displaystyle \begin{aligned} \nu_0 &= \frac{\mu(1-\lambda)}{\lambda+\mu-\lambda\mu}\nu_1\\ \left(\frac{\lambda+\mu-2\lambda\mu}{\lambda+\mu-\lambda\mu}\right)\nu_1 &= \nu_0 + \left(\frac{\mu(1-\lambda)}{\lambda+\mu-\lambda\mu} \right)\nu_2\\ \left(\frac{\lambda+\mu-2\lambda\mu}{\lambda+\mu-\lambda\mu}\right)\nu_n &= \left(\frac{\lambda(1-\mu)}{\lambda+\mu-\lambda\mu} \right)\nu_{n-1}+\left(\frac{\mu(1-\lambda)}{\lambda+\mu-\lambda\mu} \right)\nu_{n+1},\ n\geqslant2.\\ \end{aligned}

It follows that

\displaystyle \begin{aligned} \nu_0 &= 1,\\ \nu_1 &= \frac{\lambda+\mu-\lambda\mu}{\mu(1-\lambda)},\\ \nu_{n+1} &= \left(\frac{\lambda+\mu-2\lambda\mu}{\mu(1-\lambda)}\right)\nu_n - \frac{\lambda(1-\mu)}{\mu(1-\lambda)}\nu_{n-1},\ n\geqslant2. \end{aligned}

Let ${G(z) = \sum_{n=0}^\infty \nu_n z^n}$ be the generating function of ${\{\nu_n\}}$, then multiplying the recurrence by ${z^n}$ and summing over ${n}$ yields

$\displaystyle \sum_{n=2}^\infty \nu_{n+1}z^n = \left(\frac{\lambda+\mu-2\lambda\mu}{\mu(1-\lambda)}\right) \sum_{n=2}^\infty \nu_n z^n - \frac{\lambda(1-\mu)}{\mu(1-\lambda)} \sum_{n=2}^\infty \nu_{n-1}z^n,$

so that

$\displaystyle \frac1z\left( G(z) - \nu_0 - \nu_1z - \nu_2z^2 \right) = \left(\frac{\lambda+\mu-2\lambda\mu}{\mu(1-\lambda)}\right)\left(G(z)-\nu_0-\nu_1z \right) -\frac{\lambda(1-\mu)}{\mu(1-\lambda)}\, z\left(G(z)-\nu_0\right).$

Solving for ${G(z)}$, we have

$\displaystyle G(z) = \frac{\mu (1-\lambda+z)}{\mu(1-\lambda) -\lambda(1-\mu)z},$

from which it follows that

$\displaystyle \nu_n = \left(\frac{\lambda(1-\mu)}{\mu(1-\lambda)}\right)^n\left(\frac{\lambda+\mu-\lambda\mu}{\lambda(1-\mu)}\right),n\geqslant 2.$

We compute

\displaystyle \begin{aligned} \sum_{n=0}^\infty \nu_n &= \nu_0 + \sum_{n=1}^\infty \nu_n\\ &= 1 + \frac{\lambda+\mu-\lambda\mu}{\mu(1-\lambda)}\sum_{n=1}^\infty \left(\frac{\lambda(1-\mu)}{\mu(1-\lambda)}\right)^{n-1}\\ &= 1 + \left(\frac{\lambda+\mu-\lambda\mu}{\mu(1-\lambda)}\right)\left(\frac{\mu(1-\lambda)}{\mu-\lambda}\right)\\ &= 1 + \frac{\lambda+\mu-\lambda\mu}{\mu-\lambda}\\ &= \frac{\mu(2-\lambda)}{\mu-\lambda}, \end{aligned}

and hence

\displaystyle \begin{aligned} \pi_0 &= \frac{\mu-\lambda}{\mu(2-\lambda)}\\ \pi_n &= \left(\frac{(\mu-\lambda)(\lambda+\mu-\lambda\mu)}{\mu^2(1-\lambda)(2-\lambda)}\right)\left(\frac{\lambda(1-\mu)}{\mu(1-\lambda)}\right)^{n-1},\ n\geqslant 1. \end{aligned}
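Since the constants above are easy to mangle, a numerical check is worthwhile. The sketch below (plain Python; ${\lambda=0.3}$, ${\mu=0.5}$ and the truncation level are illustrative choices of mine) builds the transition probabilities, verifies ${\nu=\nu P}$ componentwise on a truncated state space, and computes the normalization constant ${\sum_n\nu_n}$ directly.

```python
lam, mu = 0.3, 0.5
denom = lam + mu - lam * mu
rho = lam * (1 - mu) / (mu * (1 - lam))

def p(i, j):
    """Transition probabilities of the chain X_n."""
    if i == 0:
        return 1.0 if j == 1 else 0.0
    if j == i - 1:
        return mu * (1 - lam) / denom
    if j == i:
        return lam * mu / denom
    if j == i + 1:
        return lam * (1 - mu) / denom
    return 0.0

# Invariant measure: nu_0 = 1 and nu_n = nu_1 * rho^(n-1), where
# nu_1 = (lam + mu - lam*mu) / (mu * (1 - lam)).
nu1 = denom / (mu * (1 - lam))
N = 400
nu = [1.0] + [nu1 * rho ** (n - 1) for n in range(1, N)]

# Check nu = nu P componentwise (away from the truncation boundary).
for j in range(N - 1):
    flow = sum(nu[i] * p(i, j) for i in range(max(0, j - 1), j + 2))
    assert abs(flow - nu[j]) < 1e-12

V = sum(nu)
assert abs(V - mu * (2 - lam) / (mu - lam)) < 1e-9
pi0 = 1.0 / V
print(pi0)  # equals (mu - lam) / (mu * (2 - lam))
```

For these parameters ${\rho=3/7}$, ${\nu_1=13/7}$, and ${\sum_n\nu_n=17/4}$, so ${\pi_0=4/17}$.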

# The Volterra Operator

Here’s a proof that the Volterra operator is not normal which I thought was clever: Let ${T:L^2([0,1])\rightarrow L^2([0,1])}$ be defined by

$\displaystyle Tf(x) = \int_0^x f(t)\ \mathsf dt.$

If ${\|f\|_2=1}$ then

\displaystyle \begin{aligned} \|Tf\|_2^2 &= \int_0^1 \left|\int_0^x f(t)\ \mathsf dt\right|^2 \mathsf dx\\ &\leqslant \int_0^1 \int_0^x |f(t)|^2\ \mathsf dt\ \mathsf dx\\ &\leqslant \int_0^1 \int_0^1 |f(t)|^2\ \mathsf dt\ \mathsf dx\\ &= \int_0^1 \|f\|_2^2 \ \mathsf dx\\ &= \|f\|_2^2, \end{aligned}

so ${T}$ is a bounded operator. It follows that there exists a unique adjoint operator ${T^*:L^2([0,1])\rightarrow L^2([0,1])}$ satisfying ${\langle Tf, g\rangle = \langle f, T^*g\rangle}$ for all ${f,g\in L^2([0,1])}$. Since ${[0,1]}$ has finite Lebesgue measure, ${L^2([0,1])\subset L^1([0,1])}$, and thus

$\displaystyle \int\limits_{[0,1]^2}|f(t)g(x)|\ \mathsf d(x\times t) = \|f\|_1\|g\|_1<\infty.$

It follows that the function ${(x,t)\mapsto f(t)g(x)}$ belongs to ${L^1([0,1]^2)}$, and hence by Fubini’s theorem,

\displaystyle \begin{aligned} \langle Tf, g\rangle &= \int_0^1 \int_0^x f(t)\ \mathsf dt\ g(x)\ \mathsf dx\\ &=\int\limits_{\{0\leqslant t\leqslant x\leqslant 1\}}f(t)g(x)\ \mathsf d(x\times t)\\ &= \int_0^1\int_t^1 f(t)g(x)\ \mathsf dx\ \mathsf dt\\ &= \int_0^1 f(t)\left(\int_t^1 g(x)\ \mathsf dx\right) \mathsf dt\\ &= \left\langle f, \int_t^1 g(x)\ \mathsf dx\right\rangle, \end{aligned}

from which we see that ${T^*f(x) = \int_x^1 f(t)\ \mathsf dt}$.

If ${f\in L^2([0,1])}$ with ${\|f\|_2=1}$, then for any ${x,y\in[0,1]}$, Hölder’s inequality yields

\displaystyle \begin{aligned} |Tf(x)-Tf(y)| &= \left|\int_y^x f(t)\ \mathsf dt \right|\\ &\leqslant \int_y^x 1\cdot|f(t)|\ \mathsf dt\\ &\leqslant \left(\int_y^x 1\ \mathsf dt\right)^{\frac12}\left(\int_y^x |f(t)|^2\ \mathsf dt\right)^{\frac12}\\ &\leqslant |x-y|^{\frac12}\left(\int_0^1 |f(t)|^2\ \mathsf dt\right)^{\frac12}\\ &= \|f\|_2|x-y|^{\frac12}, \end{aligned}

so that ${Tf}$ is Hölder continuous. Moreover, the kernel of ${T}$ is ${K(x,t)=\chi_{(0,x)}(t)}$ and

$\displaystyle \int\limits_{[0,1]^2} |K(x,t)|^2\ \mathsf d(x\times t) = \int_0^1\int_0^x \ \mathsf dt\ \mathsf dx=\frac12<\infty,$

so ${K\in L^2([0,1]^2)}$. It follows that ${T}$ is a Hilbert-Schmidt operator and therefore compact. The Fredholm alternative (see \texttt{http://bit.ly/1Uh8BgP}) then implies that if ${\lambda\in\sigma(T)}$ and ${\lambda\ne 0}$, then ${\lambda}$ is an eigenvalue of ${T}$; that is, there is a nonzero ${f\in L^2([0,1])}$ with ${Tf=\lambda f}$. If ${Tf=\lambda f}$ with ${\lambda\ne0}$, then as ${Tf\in C^0([0,1])}$ we must have ${f=\frac1\lambda Tf\in C^0([0,1])}$. By the fundamental theorem of calculus, ${Tf\in C^1([0,1])}$, and so ${f\in C^1([0,1])}$. Differentiating both sides of the eigenvalue equation yields ${f(x)=\lambda f'(x)}$, and hence ${f(x) = Ce^{\frac1\lambda x}}$ for some ${C\in\mathbb R}$. Since ${Tf(0)=0}$, we have ${f(0)=0}$, and therefore ${C=0}$, contradicting ${f\ne0}$. It follows that ${\sigma(T)=\{0\}}$, so the spectral radius ${r(T)}$ of ${T}$ is zero.

If ${A}$ is a bounded operator on a Hilbert space and ${\|f\|=1}$ then

$\displaystyle \|A^*f\|^2 = \langle A^*f, A^*f\rangle = \langle AA^*f, f\rangle\leqslant \|AA^*f\|$

so

$\displaystyle \|A^*\|^2 \leqslant \|AA^*\|\leqslant \|A\|\|A^*\|$

and hence ${\|A^*\|\leqslant\|A\|}$. By symmetry, ${\|A\|\leqslant\|A^*\|}$, whence ${\|A^*\|=\|A\|}$. It follows that ${\|AA^*\|=\|A\|^2=\|A^*A\|}$. Recall Gelfand’s formula for the spectral radius of a bounded operator ${A}$:

$\displaystyle r(A) = \lim_{n\rightarrow\infty} \|A^n\|^{\frac1n}.$

If ${A}$ is a self-adjoint operator and ${\|f\|=1}$ then

$\displaystyle \|Af\|^2 = \langle Af, Af\rangle = \langle A^2f, f\rangle\leqslant\|A^2f\|\|f\|=\|A^2f\|$

which implies ${\|A^2\|=\|A\|^2}$. By induction it follows that ${\|A^{2^n}\|=\|A\|^{2^n}}$ for all ${n}$ and hence

$\displaystyle r(A) = \lim_{n\rightarrow\infty} \|A^{2^n}\|^{\frac1{2^n}} = \lim_{n\rightarrow\infty}\|A\|=\|A\|.$

If ${A}$ is normal, then ${(A^*A)^n=(A^n)^*A^n}$, so ${\|(A^*A)^n\|=\|A^n\|^2}$ for all ${n}$; taking ${n^{\mathrm{th}}}$ roots and letting ${n\rightarrow\infty}$ gives ${r(A^*A)=r(A)^2}$. As ${A^*A}$ is self-adjoint, ${r(A^*A)=\|A^*A\|=\|A\|^2}$, from which we see that the spectral radius of a normal operator is equal to its operator norm.

Since the spectral radius of the Volterra operator ${T}$ is zero and its operator norm is positive, we conclude that ${T}$ is not a normal operator.
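Both phenomena are visible in a finite-dimensional discretization. In the sketch below (plain Python; the grid size ${N=60}$ is an arbitrary choice of mine) ${T}$ is replaced by the lower-triangular matrix ${A}$ with entries ${1/N}$ for ${j\leqslant i}$: ${A}$ is triangular with diagonal ${1/N}$, so its spectral radius is ${1/N\rightarrow0}$, while power iteration on ${A^*A}$ estimates ${\|A\|}$, which for large ${N}$ approaches ${\|T\|=\frac2\pi\approx0.6366}$.

```python
import math
import random

N = 60
h = 1.0 / N
# A approximates Tf(x) = int_0^x f:  (A f)_i = h * sum_{j <= i} f_j.
A = [[h if j <= i else 0.0 for j in range(N)] for i in range(N)]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(N)) for i in range(N)]

def transpose_matvec(M, v):
    return [sum(M[i][j] * v[i] for i in range(N)) for j in range(N)]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

# Power iteration on A^T A estimates the largest singular value, i.e. ||A||.
rng = random.Random(3)
v = [rng.random() for _ in range(N)]
for _ in range(200):
    w = transpose_matvec(A, matvec(A, v))
    v = [x / norm(w) for x in w]
sigma = math.sqrt(norm(transpose_matvec(A, matvec(A, v))))
print(sigma)  # near 2/pi ~ 0.6366 (with O(1/N) discretization error)

# A is lower triangular with diagonal h, so r(A) = h exactly; accordingly,
# ||A^n||^{1/n} is far smaller than ||A||, reflecting r(T) = 0.
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

An = A
for _ in range(19):
    An = matmul(An, A)  # An = A^20
fro = math.sqrt(sum(x * x for row in An for x in row))
print(fro ** (1.0 / 20))
```

The gap between the two printed numbers is exactly the gap the post exploits: for a normal operator they would have to agree in the limit.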