Transient and limiting distribution of the M/M/∞ queue

Suppose customers arrive at a queue according to a Poisson process with intensity {\lambda}, have service times that are i.i.d. with distribution {G(t)} and independent of the arrival process, and that there are infinitely many servers available, so that service begins immediately upon arrival. Assume further that the system is empty at time {0}. Let {X(t)} be the number of customers in the system at time {t}; what is the distribution of {X(t)}?

A customer who enters the system at time {s\leqslant t} will still be in the system at time {t} with probability {1-G(t-s)}, independently of all other customers. Thus the customers present at time {t} form a thinning of the Poisson arrival process over {[0,t]}, in which an arrival at time {s} is retained with probability {1-G(t-s)}. Let {N_1(t)} denote the number of retained arrivals; by the thinning (marking) theorem for Poisson processes, {N_1(t)} has the Poisson distribution with mean {\lambda\int_0^t(1-G(t-s))\ \mathsf ds}. Since {N_1(t)=X(t)}, for each nonnegative integer {k}

\displaystyle  \begin{aligned} \mathbb P_0(X(t) = k) &= \mathbb P_0(N_1(t) = k)\\ &= \exp\left( -\lambda \int_0^t (1-G(t-s))\ \mathsf ds\right) \frac{\left(\lambda\int_0^t (1-G(t-s))\ \mathsf ds \right)^k}{k!}\\ &= \exp\left( -\lambda \int_0^t (1-G(s))\ \mathsf ds\right) \frac{\left(\lambda\int_0^t (1-G(s))\ \mathsf ds \right)^k}{k!}, \end{aligned}

where the last equality follows from the substitution {s\mapsto t-s}.

In the case where {G(t) = 1-e^{-\mu t}}, that is, when the service times are exponentially distributed with rate {\mu}, we have

\displaystyle  \int_0^t (1-G(s))\ \mathsf ds = \int_0^t e^{-\mu s}\ \mathsf ds = \frac1\mu \left(1 - e^{-\mu t}\right),

and hence

\displaystyle  \mathbb P_0(X(t) = k) = \exp\left(-\frac\lambda\mu\left(1-e^{-\mu t}\right)\right) \frac{\left(\frac\lambda\mu\left(1-e^{-\mu t}\right)\right)^k}{k!}.

The expected number of customers at time {t} is then

\displaystyle  \begin{aligned} \mathbb E_0[X(t)] &= \sum_{k=1}^\infty k\cdot \mathbb P_0(X(t)=k)\\ &= \sum_{k=1}^\infty k\cdot \exp\left(-\frac\lambda\mu\left(1-e^{-\mu t}\right)\right) \frac{\left(\frac\lambda\mu\left(1-e^{-\mu t}\right)\right)^k}{k!}\\ &= \frac\lambda\mu\left(1-e^{-\mu t}\right)\exp\left(-\frac\lambda\mu\left(1-e^{-\mu t}\right)\right)\sum_{k=0}^\infty \frac{\left(\frac\lambda\mu\left(1-e^{-\mu t}\right)\right)^k}{k!}\\ &= \frac\lambda\mu\left(1-e^{-\mu t}\right)\exp\left(-\frac\lambda\mu\left(1-e^{-\mu t}\right)\right)\exp\left(\frac\lambda\mu\left(1-e^{-\mu t}\right)\right)\\ &= \frac\lambda\mu\left(1-e^{-\mu t}\right). \end{aligned}

Moreover, we observe that

\displaystyle  \lim_{t\rightarrow\infty} \mathbb P_0(X(t)=k) = e^{-\frac\lambda\mu}\frac{\left(\frac\lambda\mu\right)^k}{k!},

which is the Poisson distribution with mean {\frac\lambda\mu}. This is precisely the stationary distribution {\pi} of {\{X(t)\}} derived from the detailed balance equations

\displaystyle  \lambda \pi_k = (k+1)\mu \pi_{k+1},\quad k=0,1,2,\ldots

(normalized by the condition {\sum_{k=0}^\infty \pi_k=1}).
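As a quick numerical sanity check, one can simulate the queue directly and compare the empirical mean of {X(t)} with {\frac\lambda\mu\left(1-e^{-\mu t}\right)}. The Python sketch below uses the fact that, conditioned on the number of Poisson arrivals in {[0,t]}, the arrival times are i.i.d. uniform on {[0,t]}; the parameter values are illustrative.

```python
# Empirical check of the transient mean of the M/M/infinity queue (a sketch).
import numpy as np

rng = np.random.default_rng(0)
lam, mu, t, n_runs = 3.0, 1.0, 2.0, 100_000   # illustrative values

counts = np.empty(n_runs, dtype=int)
for i in range(n_runs):
    n = rng.poisson(lam * t)                  # number of arrivals in [0, t]
    s = rng.uniform(0.0, t, size=n)           # arrival times, uniform given n
    in_system = rng.random(n) < np.exp(-mu * (t - s))  # service still running
    counts[i] = in_system.sum()

print(counts.mean(), lam / mu * (1 - np.exp(-mu * t)))  # ~2.59 vs 2.594...
```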

Generating function of a stochastic matrix

This is an example demonstrating my answer to this math.stackexchange post: Let {\{X_n:n\in\mathbb N_0\}} be a Markov chain on {\{0,1\}} with transition matrix

\displaystyle P = \begin{pmatrix}1-\alpha&\alpha\\\beta&1-\beta\end{pmatrix}

where {\alpha,\beta\in(0,1)}. Then {P} is irreducible and aperiodic, so there exists a unique stationary distribution {\pi} satisfying

\displaystyle \lim_{n\rightarrow\infty}(P^n)_{i\cdot} = \pi,\quad i=0,1

where {(P^n)_{i\cdot} } denotes the {i^{\mathrm{th}}} row of {P^n}. From the global balance equation {\alpha\pi_0 = \beta\pi_1} we see that {\pi_1 = \frac\alpha\beta\pi_0}, and normalization yields

\displaystyle \pi = \left(\frac\beta{\alpha+\beta},\frac\alpha{\alpha+\beta} \right).

To actually compute {P^n}, we could write {P^n=AD^nA^{-1}}, where the columns of {A} are linearly independent eigenvectors of {P} and {D} is a diagonal matrix whose entries are the corresponding eigenvalues. This is possible because the eigenvalues of {P} are {1} and {1-\alpha-\beta}, which are distinct since {\alpha+\beta>0}, so {P} has a basis of eigenvectors. Instead, we will compute the generating function of {P}. Let

\displaystyle P(s) = \sum_{n=0}^\infty P^n s^n,

then

\displaystyle  \begin{aligned} P(s) &= (I-sP)^{-1}\\ &= \frac{1}{\det (I-sP)}\mathrm{adj}(I-sP)\\ &= \frac1{(1-s)(1-(1-\alpha-\beta)s)}\begin{pmatrix}1-(1-\beta)s&\alpha s\\\beta s& 1-(1-\alpha)s\end{pmatrix}\\ &= \begin{pmatrix}P_{00}(s) & P_{01}(s)\\ P_{10}(s) & P_{11}(s) \end{pmatrix}, \end{aligned}

where

\displaystyle P_{ij}(s):=\sum_{n=0}^\infty \mathbb P(X_n=j\mid X_0=i)s^n.

Partial fraction decomposition yields

\displaystyle  \begin{aligned} &\quad\frac1{(1-s)(1-(1-\alpha-\beta)s)}\\ &= (\alpha+\beta)^{-1}\frac1{1-s} + \left(1-(\alpha+\beta)^{-1}\right)\frac1{1-(1-\alpha-\beta)s}, \end{aligned}

and so we have

\displaystyle  \begin{aligned} P_{00}(s) &= \left(\frac{1-(1-\beta)s}{\alpha+\beta}\right)\left(\frac1{1-s} - \frac{1-\alpha-\beta}{1-(1-\alpha-\beta)s}\right)\\ &= \left(\frac{1-(1-\beta)s}{\alpha+\beta}\right) \left( \sum_{n=0}^\infty s^n - (1-\alpha-\beta)\sum_{n=0}^\infty (1-\alpha-\beta)^ns^n\right)\\ &= \sum_{n=0}^\infty \left(\frac\beta{\alpha+\beta}+\frac{\alpha(1-\alpha-\beta)^n}{\alpha+\beta}\right) s^n. \end{aligned}

Similar computations yield

\displaystyle  \begin{aligned} P_{01}(s) &= \sum_{n=0}^\infty \left(\frac\alpha{\alpha+\beta}-\frac{\alpha(1-\alpha-\beta)^n}{\alpha+\beta}\right) s^n,\\\\ P_{10}(s) &= \sum_{n=0}^\infty \left(\frac\beta{\alpha+\beta}-\frac{\beta(1-\alpha-\beta)^n}{\alpha+\beta}\right) s^n,\\\\ P_{11}(s) &= \sum_{n=0}^\infty \left(\frac\alpha{\alpha+\beta}+\frac{\beta(1-\alpha-\beta)^n}{\alpha+\beta}\right) s^n, \end{aligned}

and it follows that

\displaystyle  P(s) = \sum_{n=0}^\infty \begin{pmatrix} \pi_0+\frac{\alpha(1-\alpha-\beta)^n}{\alpha+\beta} & \pi_1 \left(1-(1-\alpha-\beta)^n\right) \\ \pi_0\left(1-(1-\alpha-\beta)^n\right) & \pi_1+\frac{\beta(1-\alpha-\beta)^n}{\alpha+\beta}\end{pmatrix} s^n.

Reading off the coefficient of {s^n} gives {P^n} in closed form; since {|1-\alpha-\beta|<1}, both rows of {P^n} converge to {\pi} as {n\rightarrow\infty}, as claimed.
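The closed form is easy to check against a direct matrix power; a minimal Python sketch (the values of {\alpha}, {\beta}, {n} are illustrative):

```python
# Verify the closed form for P^n against numpy's matrix power (a sketch).
import numpy as np

alpha, beta, n = 0.3, 0.5, 7
theta = 1 - alpha - beta
P = np.array([[1 - alpha, alpha],
              [beta, 1 - beta]])

closed_form = np.array([[beta + alpha * theta**n, alpha * (1 - theta**n)],
                        [beta * (1 - theta**n),   alpha + beta * theta**n]]) / (alpha + beta)

print(np.allclose(np.linalg.matrix_power(P, n), closed_form))  # True
```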

Discrete-time queue (Geo/Geo/1)

Queueing theory is generally introduced in the continuous-time setting, with the fundamental {M/M/1} queue for which many analytical results may be obtained with relatively straightforward computations, due largely to the memoryless property of the exponential distribution. In the next few posts I will analyze the discrete-time analog of this model; namely, a single-server queue with *geometric* interarrival and service times.

Suppose customers arrive at a queue according to a Bernoulli process; that is, at each time {n=1,2,\ldots}, a customer arrives with probability {p_\lambda}, independently of past or future arrivals. The probability that a customer finishes service at time {n+1}, conditioned on that customer having been in service at time {n}, is {p_\mu}. Let {\{A_m : m=1,2,\ldots\}} be the arrival times and {\{D_m : m=1,2,\ldots\}} the departure times. The interarrival times are {T_m:=A_m-A_{m-1}}, with {A_0\equiv 0}. Conditioned on {A_{m-1}=i}, we have {\mathbb P(A_m=i+j)=\mathbb P(T_m=j)=(1-p_\lambda)^{j-1}p_{\lambda}}, and so by independence of arrivals, the {T_m} are i.i.d. {\mathsf{Geo}(p_\lambda)} random variables. A similar argument shows that the service times are i.i.d. {\mathsf{Geo}(p_\mu)} random variables.

Let {Z_n}, {n=0,1,2,\ldots}, be the number of customers in the system at time {n}, after arrivals and before departures. It follows from the memoryless property of the geometric distribution that {\{Z_n\}} is a Markov chain; that is, for any nonnegative integers {n}, {i}, and {j} and any {E\in\sigma(Z_0,\ldots,Z_n)}, we have

\displaystyle  \mathbb P(Z_{n+1} = j\mid Z_n=i, E) = \mathbb P(Z_{n+1}=j\mid Z_n=i).

At each time {n}, at most one customer can arrive and at most one customer can depart. It follows that {Z_n} is a birth-death chain, with transition probabilities given by

\displaystyle  p(i,j) = \begin{cases} 1-p_\lambda,& i=j=0\\ p_\lambda,& i=0,\ j=1\\ p_\mu(1-p_\lambda),& j=i-1,\ i\geqslant1\\ p_\lambda(1-p_\mu),& j=i+1,\ i\geqslant1\\ p_\lambda p_\mu + (1-p_\lambda)(1-p_\mu),& i=j>0. \end{cases}

Suppose {\nu} is an invariant measure for the transition matrix {P}, that is, {\nu = \nu P}. Assume WLOG that {\nu_0=1}; then {\nu_1 = \frac{p_\lambda}{p_\mu(1-p_\lambda)}} and

\displaystyle \nu_n = \left(\frac{p_\lambda(1-p_\mu)}{p_\mu(1-p_\lambda)}\right)\nu_{n-1},\ n\geqslant2.

By inspection, this recurrence has solution

\displaystyle \nu_n=\left(\frac{p_\lambda(1-p_\mu)}{p_\mu(1-p_\lambda)}\right)^{n-1}\frac{p_\lambda}{p_\mu(1-p_\lambda)},\ n\geqslant1.

Now, {\{Z_n\}} is positive recurrent iff {\sum_{n=0}^\infty \nu_n<\infty}, which is true precisely when

\displaystyle p_\lambda(1-p_\mu)<p_\mu(1-p_\lambda).

The offered load to the system is then

\displaystyle \rho := \frac{p_\lambda(1-p_\mu)}{p_\mu(1-p_\lambda)}.

Assuming {\rho<1}, let {C=\sum_{n=0}^\infty \nu_n} and define {\pi=C^{-1}\nu}. By construction, {\pi = \pi P} and {\sum_{n=0}^\infty \pi_n=1}, so that {\pi} is the (unique) stationary distribution of {\{Z_n\}}. A straightforward computation yields

\displaystyle  \begin{aligned} \pi_0 &= \frac{p_\mu-p_\lambda}{p_\mu}\\ \pi_n &= \rho^{n-1}\left(\frac{p_\lambda(1-\rho)}{p_\mu}\right),\ n\geqslant1. \end{aligned}

The limiting mean number of customers in the system {L} is given by

\displaystyle  \begin{aligned} L &= \sum_{n=1}^\infty n\pi_n\\ &= \left(\frac{p_\lambda(1-\rho)}{p_\mu}\right)\sum_{n=1}^\infty n\rho^{n-1}\\ &= \frac{p_\lambda}{p_\mu(1-\rho)}. \end{aligned}

Invoking Little’s Law, we compute the mean sojourn time of customers by

\displaystyle  W = \frac L{p_\lambda} = \frac1{p_\mu(1-\rho)}.
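These formulas are straightforward to check by simulation; the sketch below runs the chain {Z_n} (recording the state after arrivals and before departures) and compares the long-run average against {L}. The parameter values are illustrative.

```python
# Simulate the Geo/Geo/1 chain and compare the time-average to L (a sketch).
import numpy as np

rng = np.random.default_rng(1)
p_lam, p_mu, n_steps = 0.3, 0.5, 1_000_000    # illustrative values
rho = p_lam * (1 - p_mu) / (p_mu * (1 - p_lam))

z, total = 0, 0
for _ in range(n_steps):
    z += rng.random() < p_lam       # possible arrival: state is now Z_n
    total += z
    if z > 0 and rng.random() < p_mu:
        z -= 1                      # possible departure of customer in service

print(total / n_steps, p_lam / (p_mu * (1 - rho)))  # both ~1.05 here
```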

Brownian motion is differentiable nowhere

The elementary continuous functions we work with in calculus tend to be relatively well-behaved, in the sense that they are differentiable everywhere except possibly on some isolated set of points. The prototypical example is the map {x\mapsto|x|}, which fails to be differentiable at zero: for {x\ne 0} the difference quotient is {\frac{|x|-|0|}{x-0} = \frac{|x|}x}, which equals {1} for {x>0} and {-1} for {x<0}, so {\lim_{x\rightarrow 0}\frac{|x|}x} does not exist. More generally, let {x_0<x_1<\ldots<x_n} and {y_0,\ldots,y_n\in\mathbb R} with

\displaystyle \frac{y_{i+2}-y_{i+1}}{x_{i+2}-x_{i+1}} \ne \frac{y_{i+1}-y_i}{x_{i+1}-x_i},\quad i=0,1,\ldots,n-2

– that is, the slope of the line segment joining {(x_i,y_i)} and {(x_{i+1}, y_{i+1})} differs from the slope of the line segment joining {(x_{i+1}, y_{i+1})} and {(x_{i+2},y_{i+2})}. Define {f:[x_0,x_n]\rightarrow\mathbb R} such that {f(x_i)=y_i} and the graph of {f} is

\displaystyle  \bigcup_{i=0}^{n-1} \left\{\lambda(x_i,f(x_i))+(1-\lambda)(x_{i+1},f(x_{i+1})):\lambda\in[0,1] \right\}.

That is, {f} is a piecewise linear function connecting the points {(x_i,y_i)} – like the functions {B^{(n)}} used in the previous post. Then {f} is differentiable on {(x_0,x_n)\setminus\{x_1,\ldots,x_{n-1}\}}. Moreover, if {\{x_n\}} is a strictly increasing sequence with {L:=\sum_{n=0}^{\infty}(x_{n+1}-x_n)<\infty} satisfying the aforementioned slope assumption, then defining {f_n} as above, we see that {\lim_{n\rightarrow\infty}f_n} is continuous on {(x_0,x_0+L)} and differentiable at all but countably many points.

The paths of Brownian motion are even more pathological – they are with probability {1} differentiable nowhere! Let {W=\{W_t:t\in\mathbb R_+\}} be a standard Brownian motion and

\displaystyle  D = \{\omega:\exists t\in(0,1) \textrm{ s.t. } s\mapsto W_s(\omega) \textrm{ is differentiable at } t\}.

We will show that {\mathbb P(D)=0}. For each integer {n\geqslant2} and {1\leqslant k\leqslant n-2}, set

\displaystyle M(k,n) = \max\left\{|W_{kn^{-1}}-W_{(k-1)n^{-1}}|,|W_{(k+1)n^{-1}}-W_{kn^{-1}}|,|W_{(k+2)n^{-1}}-W_{(k+1)n^{-1}}| \right\}

and

\displaystyle M_n = \min\{M(k,n): 1\leqslant k\leqslant n-2\}.

Suppose {\omega_0\in D}, and let {t\in(0,1)} be a point at which {s\mapsto W(s,\omega_0)} is differentiable. Then since

\displaystyle \lim_{s\rightarrow t}\frac{W(s,\omega_0)-W(t,\omega_0)}{s-t}

exists and is finite, we may choose {\delta>0} and {C>0} such that

\displaystyle |W(t,\omega_0)-W(s,\omega_0)|< C|t-s|\quad\textrm{whenever }|s-t|<\delta.

Let {N} be a positive integer with {N > \frac3\delta}, and put

\displaystyle k = \max\left\{j\in\mathbb N: jN^{-1}\leqslant t\right\}.

Then

\displaystyle \max\left\{\left|jN^{-1}-t\right| : j\in\{k-1,k,k+1,k+2\} \right\}<\frac3N<\delta,

so that

\displaystyle  \begin{aligned} \left|W_{kN^{-1}}-W_{(k-1)N^{-1}} \right| &\leqslant \left| W_{kN^{-1}}-W_t\right|+ \left| W_{(k-1)N^{-1}}-W_t\right|\\ &\leqslant C\left|kN^{-1}-t\right| + C\left|(k-1)N^{-1}-t\right|\\ &\leqslant C(N^{-1} + 2N^{-1})\\ &= 3CN^{-1}, \end{aligned}

and similarly

\displaystyle  \left|W_{(k+1)N^{-1}}-W_{kN^{-1}}\right|\leqslant 3CN^{-1},\quad \left|W_{(k+2)N^{-1}}-W_{(k+1)N^{-1}}\right|\leqslant 3CN^{-1}.

It follows that

\displaystyle M_N(\omega_0)\leqslant M(k,N)(\omega_0)\leqslant 3CN^{-1}.

Now, the increments {W_{(j+1)N^{-1}}-W_{jN^{-1}}} are i.i.d. with {\mathcal N\left(0,N^{-1}\right)} distribution, so

\displaystyle W_{(j+1)N^{-1}}-W_{jN^{-1}}\stackrel d= N^{-\frac12}W_1,

from which we have

\displaystyle \mathbb P(M(k,N) \leqslant 3CN^{-1}) = \mathbb P\left(|W_1|\leqslant 3CN^{-\frac12}\right)^3.

It follows that

\displaystyle  \begin{aligned} \mathbb P(M_N\leqslant 3CN^{-1}) &= \mathbb P\left(\bigcup_{j=1}^{N-2} \left\{M(j,N)\leqslant 3CN^{-1}\right\}\right)\\ &\leqslant \sum_{j=1}^{N-2} \mathbb P\left(M(j,N)\leqslant 3CN^{-1}\right)\\ &\leqslant N\,\mathbb P\left(M(1,N)\leqslant 3CN^{-1} \right)\\ &= N\left(\mathbb P\left(|W_1|\leqslant 3CN^{-\frac12} \right)\right)^3\\ &\leqslant N\left(6CN^{-\frac12}\right)^3\\ &= (6C)^3N^{-\frac12}\stackrel{N\rightarrow\infty}\longrightarrow0, \end{aligned}

where the penultimate inequality uses the fact that the standard normal density is bounded by {(2\pi)^{-\frac12}<1}, so that {\mathbb P(|W_1|\leqslant\varepsilon)\leqslant 2\varepsilon}.

For {n\geqslant2}, let {E_n = \{\omega: M_n(\omega) \leqslant 3Cn^{-1}\}}. The argument above shows that {\omega_0\in E_n} for every {n>\frac3\delta}, so {\omega_0\in\liminf_{n\rightarrow\infty} E_n}. Since we may always enlarge {C}, it suffices to consider positive integer values of {C}, and then {D\subset\bigcup_{C\in\mathbb N}\liminf_{n\rightarrow\infty} E_n}. For each fixed {C}, Fatou's lemma yields

\displaystyle \mathbb P\left(\liminf_{n\rightarrow\infty} E_n\right) \leqslant\liminf_{n\rightarrow\infty} \mathbb P(E_n)=0,

and summing over {C\in\mathbb N}, we conclude that {\mathbb P(D)=0}.
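The proof hinges on the event {\{M_N\leqslant 3CN^{-1}\}} being unlikely, and a quick simulation makes this visible: sampling the increments of {W} on a mesh of size {N^{-1}}, one finds that {N\cdot M_N} grows with {N}, so for any fixed {C} the event {\{M_N\leqslant 3CN^{-1}\}} eventually fails. A minimal Python sketch (the mesh sizes are illustrative):

```python
# M_N for Brownian motion on a mesh of size 1/N: N * M_N grows with N,
# so M_N >> C/N eventually for any fixed C (a sketch).
import numpy as np

rng = np.random.default_rng(2)
for N in (100, 1_000, 10_000, 100_000):
    incs = np.abs(rng.standard_normal(N)) / np.sqrt(N)  # |W_{(k+1)/N} - W_{k/N}|
    m3 = np.maximum(incs[:-2], np.maximum(incs[1:-1], incs[2:]))  # M(k, N)
    print(N, N * m3.min())   # N * M_N: grows slowly with N
```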

Construction of Brownian Motion

A stochastic process {X=\{X_t:t\in\mathbb R_+\}} is said to be standard Brownian motion (or a Wiener process) if

  1. {X_0=0} almost surely.
  2. {X} has independent increments, i.e. for {0\leqslant t_0<t_1<\cdots<t_k}, the random variables

    \displaystyle  \left\{X_{t_{i+1}}-X_{t_i} : 0\leqslant i\leqslant k-1 \right\}

    are independent.

  3. For {0\leqslant s<t},

    \displaystyle X_t-X_s\sim\mathcal N(0,t-s).

  4. The map {t\mapsto X_t(\omega)} is almost surely continuous.

In this post I follow the method in Sidney Resnick’s Adventures in Stochastic Processes to construct Brownian motion on {[0,1]} using i.i.d. standard normal random variables. The key to this construction is the following lemma:

Lemma 1 Suppose {X(s)} and {X(t)} are random variables such that

\displaystyle  X(t)-X(s)\sim\mathcal N(0,t-s).

Then there exists a random variable {X\left(\frac{t+s}2\right)} such that

\displaystyle X\left(\frac{t+s}2\right) - X(s),\ X(t) - X\left(\frac{t+s}2\right)\stackrel{\mathrm{i.i.d.}}\sim\mathcal N\left(0,\frac{t-s}2\right).

Proof: Define {U:=X(t)-X(s)} and suppose {V\sim\mathcal N(0,t-s)} is independent of {U}. Define {X\left(\frac{t+s}2\right)} by

\displaystyle  \begin{aligned} X(t) - X\left(\frac{t+s}2\right) &= \frac{U+V}2\\ X\left(\frac{t+s}2\right) - X(s) &= \frac{U-V}2, \end{aligned}

so that

\displaystyle  \begin{aligned} X(t) - X(s) &= U\\ X(t) + X(s) - 2X\left(\frac{t+s}2\right) &= V. \end{aligned}

Then as {U,V\stackrel{\mathrm{i.i.d.}}\sim\mathcal N(0,t-s)}, we have

\displaystyle  \mathbb E[(U+V)(U-V)] = \mathbb E[U^2] - \mathbb E[V^2] = 0

so that {X(t)-X\left(\frac{t+s}2\right)} and {X\left(\frac{t+s}2\right)-X(s)} are uncorrelated, and hence independent, since they are jointly normally distributed. Each has mean {0} and variance {\frac14\left(\mathrm{Var}(U)+\mathrm{Var}(V)\right)=\frac{t-s}2}, as required.

\Box
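The lemma itself is easy to test numerically: with {U,V} i.i.d. {\mathcal N(0,\tau)}, the variables {\frac{U+V}2} and {\frac{U-V}2} should each have variance {\frac\tau2} and be uncorrelated. A Python sketch ({\tau} and the sample size are illustrative):

```python
# Check Lemma 1: (U+V)/2 and (U-V)/2 are i.i.d. N(0, tau/2) (a sketch).
import numpy as np

rng = np.random.default_rng(3)
tau, n = 2.0, 1_000_000
U = rng.normal(0.0, np.sqrt(tau), n)
V = rng.normal(0.0, np.sqrt(tau), n)
A, B = (U + V) / 2, (U - V) / 2

print(A.var(), B.var(), tau / 2)   # both sample variances ~ tau/2
print(np.corrcoef(A, B)[0, 1])     # ~ 0: uncorrelated, hence independent
```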

Now let

\displaystyle \left\{V(1)\right\}\cup\left\{V\left((2k+1)2^{-(n+1)}\right):0\leqslant k\leqslant 2^n-1,\ n\geqslant 0\right\}

be independent random variables with

\displaystyle V(1)\sim\mathcal N(0,1),\qquad V\left((2k+1)2^{-(n+1)}\right)\sim\mathcal N\left(0,2^{-n}\right).

Let {X(0):=0}, {X(1):=V(1)}, and define {X(2^{-1})} using {V(2^{-1})} so that {X(2^{-1})-X(0)} and {X(1)-X(2^{-1})} are i.i.d. {\mathcal N(0,2^{-1})} random variables. Suppose

\displaystyle \{ X(k2^{-n}): 0\leqslant k\leqslant 2^n\}

are defined such that

\displaystyle X(k2^{-n})-X((k-1)2^{-n})\stackrel{\mathrm{i.i.d.}}\sim\mathcal N(0,2^{-n}),\ 1\leqslant k\leqslant 2^n.

For each {0\leqslant k\leqslant 2^n-1}, apply Lemma 1 on the interval {\left[k2^{-n},(k+1)2^{-n}\right]} (with {V\left((2k+1)2^{-(n+1)}\right)} in the role of {V}) to define {X\left((2k+1)2^{-(n+1)}\right)} such that

\displaystyle  \begin{aligned} X\left((2k+1)2^{-(n+1)}\right)-X(k2^{-n}) &\stackrel d= X((k+1)2^{-n}) - X\left((2k+1)2^{-(n+1)} \right)\\ &\sim\mathcal N\left(0, 2^{-(n+1)}\right) \end{aligned}

and the sequence {\left\{X\left(k2^{-(n+1)}\right):0\leqslant k\leqslant 2^{n+1}\right\}} has independent increments. For each {n=1,2,\ldots} and {0\leqslant k\leqslant 2^n-1} define the processes

\displaystyle  \begin{aligned} B^{(n,k)}(t,\omega) &= \Big(2^n\left(X((k+1)2^{-n},\omega)-X(k2^{-n},\omega)\right)t\\ &\qquad -k\cdot X((k+1)2^{-n},\omega) + (k+1)X(k2^{-n},\omega)\Big)\mathbf 1_{I_{n,k}}(t), \end{aligned}

where {I_{n,k}=\left[k2^{-n},(k+1)2^{-n}\right)} for {k<2^n-1} and {I_{n,2^n-1}=\left[1-2^{-n},1\right]}, and

\displaystyle  B^{(n)}(t) = \sum_{k=0}^{2^n-1} B^{(n,k)}(t)

That is,

\displaystyle B^{(n)}(t,\omega) = X(t,\omega),\quad t\in\left\{k2^{-n}: 0\leqslant k\leqslant 2^n\right\},

and {B^{(n)}} is linear on each interval {\left[k2^{-n},(k+1)2^{-n}\right]}. Let {\Delta^{(n)}} be the maximum deviation between {B^{(n)}} and {B^{(n+1)}} on {[0,1]}, that is,

\displaystyle \Delta^{(n)}(\omega) = \max_{0\leqslant k\leqslant 2^n-1}\;\max_{k2^{-n}\leqslant t\leqslant (k+1)2^{-n}}\left|B^{(n+1)}(t,\omega)-B^{(n)}(t,\omega)\right|.

Since {B^{(n+1)}-B^{(n)}} vanishes at the endpoints of each interval {\left[k2^{-n},(k+1)2^{-n}\right]} and is linear on each of its two halves, its maximum over the interval is attained at the midpoint, so for each {n} we have

\displaystyle  \begin{aligned} &\quad\max_{k2^{-n}\leqslant t\leqslant (k+1)2^{-n}}\left|B^{(n+1)}(t,\omega)-B^{(n)}(t,\omega)\right|\\ &= \left|\frac{B^{(n)}((k+1)2^{-n}) + B^{(n)}(k2^{-n}) }2 - B^{(n+1)}\left((2k+1)2^{-(n+1)}\right) \right|\\ &=\left|\frac{X((k+1)2^{-n}) + X(k2^{-n}) }2 - X\left((2k+1)2^{-(n+1)}\right) \right|\\ &=\frac12|V\left((2k+1)2^{-(n+1)}\right)|, \end{aligned}

where {V\left((2k+1)2^{-(n+1)}\right)\sim\mathcal N(0,2^{-n})}, and so

\displaystyle \Delta^{(n)}(\omega) = \frac12\max_{0\leqslant k\leqslant 2^n-1}\left|V\left((2k+1)2^{-(n+1)}\right)\right|.

For {x\geqslant1} we compute

\displaystyle  \begin{aligned} \mathbb P\left(\Delta^{(n)}> \frac{x/2}{2^{n/2}}\right) &= \mathbb P\left(\frac12\max_{0\leqslant k\leqslant 2^n-1}\left|V\left((2k+1)2^{-(n+1)}\right)\right| > \frac12\frac x{2^{n/2}}\right)\\ &= \mathbb P\left(\bigcup_{k=0}^{2^n-1}\left\{\left|V\left((2k+1)2^{-(n+1)}\right)\right|>2^{-\frac n2}x\right\}\right) \\ &\leqslant 2^n \mathbb P(|V(2^{-n})|>2^{-\frac n2}x)\\ &= 2^{n+1}\mathbb P(V(2^{-n})/2^{-\frac n2}>x)\\ &= 2^{n+1}\mathbb P(Z > x)\\ &\leqslant 2^{n+1}\left(\frac{\phi(x)}x\right)\\ &=\frac{2^{n+1}e^{-\frac12 x^2}}{\sqrt{2\pi}x}, \end{aligned}

where {Z\sim\mathcal N(0,1)} and {\phi} is the probability density of {Z}, and we used the standard Gaussian tail bound {\mathbb P(Z>x)\leqslant\frac{\phi(x)}x}. Taking {x=2n^{\frac12}}, so that {\frac{x/2}{2^{n/2}}=n^{\frac12}2^{-\frac n2}}, we have

\displaystyle  \begin{aligned} \mathbb P\left(\Delta^{(n)}> n^{\frac12}2^{-\frac n2}\right) &\leqslant \frac{2^{n+1}e^{-\frac12\left(2n^{\frac12}\right)^2}}{\sqrt{2\pi}\left(2n^{\frac12}\right)}\\ &= \frac{e^{-n}}{\sqrt{2\pi n}}\left(\frac 2e \right)^n\\ &\leqslant \left(\frac 2e \right)^n. \end{aligned}

Since

\displaystyle \sum_{n=0}^\infty \left(\frac 2e \right)^n<\infty,

from the first Borel-Cantelli lemma we have

\displaystyle  \mathbb P\left(\limsup_{n\rightarrow\infty} \left\{\Delta^{(n)}> n^{\frac12}2^{-\frac n2}\right\}\right) = 0,

so that, almost surely, {\Delta^{(n)}\leqslant n^{\frac12}2^{-\frac n2}} for all but finitely many {n}. Since {\sum_{n=1}^\infty n^{\frac12}2^{-\frac n2}<\infty}, it follows that

\displaystyle \mathbb P\left(\sum_{n=1}^\infty \Delta^{(n)}<\infty \right)=1.

This implies that, with probability {1}, the sequence {\{B^{(n)}\}} is Cauchy in {C[0,1]}, as for {n<m}

\displaystyle  \begin{aligned} \|B^{(n)}-B^{(m)}\|_\infty &= \sup_{t\in[0,1]} \left|B^{(n)}(t)-B^{(m)}(t)\right|\\ &\leqslant \sum_{j=n}^{m-1}\Delta^{(j)}\\&\stackrel{n,m\rightarrow\infty}\longrightarrow0. \end{aligned}

Since {C[0,1]} is complete, we conclude that {B:=\lim_{n\rightarrow\infty}B^{(n)}} exists in {C[0,1]} with probability {1}. As a uniform limit of continuous functions, {B} has continuous paths, and by construction it agrees with {X} on the dyadic rationals, where it has the required joint distributions; hence {B} is standard Brownian motion on {[0,1]}.
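The construction translates directly into code: start with {X(0)=0} and {X(1)=V(1)}, and at each level fill in the dyadic midpoints using the lemma. A minimal sketch (the number of levels is illustrative):

```python
# Dyadic midpoint construction of Brownian motion on [0, 1] (a sketch).
import numpy as np

rng = np.random.default_rng(4)
n_levels = 12
X = np.array([0.0, rng.standard_normal()])   # X(0) = 0, X(1) = V(1)

for n in range(n_levels):
    # V((2k+1) 2^{-(n+1)}) ~ N(0, 2^{-n}), one per current subinterval
    V = rng.normal(0.0, np.sqrt(2.0**-n), size=len(X) - 1)
    mid = (X[:-1] + X[1:]) / 2 + V / 2       # midpoint values from Lemma 1 (V symmetric)
    refined = np.empty(2 * len(X) - 1)
    refined[::2], refined[1::2] = X, mid
    X = refined

# X now holds B^(n_levels) on the grid k 2^(-n_levels); its increments
# should look like i.i.d. N(0, 2^(-n_levels)) variables.
incs = np.diff(X)
print(len(incs) * incs.var())   # ~ 1
```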

Discrete “M/M/1” queue

Consider a queueing system with one server, interarrival distribution {\mathsf{Geo}\left(\lambda\right)}, and service distribution {\mathsf{Geo}\left(\mu\right)}, where {0<\lambda<\mu<1}. This is the discrete analog of the {M/M/1} queue. What can we say about its limiting behavior?

Let {X_n} be the number of customers in the system at the {n^{\mathrm{th}}} time slot at which an arrival or a departure (or both) occurs; then {X=\{X_n:n=0,1,2,\ldots\}} is a Markov chain with transition probabilities

\displaystyle  \begin{aligned} P_{0,1} &= 1\\ P_{n,n-1} &= \frac{\mu(1-\lambda)}{\lambda+\mu-\lambda\mu},\ n\geqslant1\\ P_{n,n} &= \frac{\lambda\mu}{\lambda+\mu-\lambda\mu},\ n\geqslant1\\ P_{n,n+1} &= \frac{\lambda(1-\mu)}{\lambda+\mu-\lambda\mu},\ n\geqslant1. \end{aligned}

Suppose {\nu} is an invariant measure for {P}, that is, {\nu P=\nu}. A stationary distribution {\pi} exists for {X} iff {V:=\sum_{n=0}^\infty \nu_n<\infty}, in which case {\pi=\nu/V}. Assume WLOG that {\nu_0=1}; then the global balance equations are

\displaystyle  \begin{aligned} \nu_0 &= \frac{\mu(1-\lambda)}{\lambda+\mu-\lambda\mu}\nu_1\\ \left(\frac{\lambda+\mu-2\lambda\mu}{\lambda+\mu-\lambda\mu}\right)\nu_1 &= \nu_0 + \left(\frac{\mu(1-\lambda)}{\lambda+\mu-\lambda\mu} \right)\nu_2\\ \left(\frac{\lambda+\mu-2\lambda\mu}{\lambda+\mu-\lambda\mu}\right)\nu_n &= \left(\frac{\lambda(1-\mu)}{\lambda+\mu-\lambda\mu} \right)\nu_{n-1}+\left(\frac{\mu(1-\lambda)}{\lambda+\mu-\lambda\mu} \right)\nu_{n+1},\ n\geqslant2.\\ \end{aligned}

It follows that

\displaystyle \begin{aligned} \nu_0 &= 1,\\ \nu_1 &= \frac{\lambda+\mu-\lambda\mu}{\mu(1-\lambda)},\\ \nu_{n+1} &= \left(\frac{\lambda+\mu-2\lambda\mu}{\mu(1-\lambda)}\right)\nu_n - \frac{\lambda(1-\mu)}{\mu(1-\lambda)}\nu_{n-1},\ n\geqslant2. \end{aligned}

Let {G(z) = \sum_{n=0}^\infty \nu_n z^n} be the generating function of {\{\nu_n\}}; multiplying the recurrence by {z^n} and summing over {n\geqslant2} yields

\displaystyle \sum_{n=2}^\infty \nu_{n+1}z^n = \left(\frac{\lambda+\mu-2\lambda\mu}{\mu(1-\lambda)}\right) \sum_{n=2}^\infty \nu_n z^n - \frac{\lambda(1-\mu)}{\mu(1-\lambda)} \sum_{n=2}^\infty \nu_{n-1}z^n,

so that

\displaystyle \frac1z\left( G(z) - \nu_0 - \nu_1z - \nu_2z^2 \right) = \left(\frac{\lambda+\mu-2\lambda\mu}{\mu(1-\lambda)}\right)\left(G(z)-\nu_0-\nu_1z \right) -\frac{\lambda(1-\mu)}{\mu(1-\lambda)}\,z\left(G(z)-\nu_0\right).

Solving for {G(z)}, using {\nu_2 = \frac{\lambda(1-\mu)}{\mu(1-\lambda)}\nu_1} (from the balance equation at state {1}), we have

\displaystyle G(z) = \frac{\mu (1-\lambda+z)}{\mu(1-\lambda) -\lambda(1-\mu)z},

from which it follows that

\displaystyle \nu_n = \left(\frac{\lambda(1-\mu)}{\mu(1-\lambda)}\right)^{n-1}\left(\frac{\lambda+\mu-\lambda\mu}{\mu(1-\lambda)}\right),\quad n\geqslant 1,

which agrees with the expression for {\nu_1} above when {n=1}.

We compute

\displaystyle  \begin{aligned} \sum_{n=0}^\infty \nu_n &= 1 + \left(\frac{\lambda+\mu-\lambda\mu}{\mu(1-\lambda)}\right)\sum_{n=1}^\infty \left(\frac{\lambda(1-\mu)}{\mu(1-\lambda)}\right)^{n-1}\\ &= 1 + \left(\frac{\lambda+\mu-\lambda\mu}{\mu(1-\lambda)}\right)\left(\frac{\mu(1-\lambda)}{\mu-\lambda}\right)\\ &= 1 + \frac{\lambda+\mu-\lambda\mu}{\mu-\lambda}\\ &= \frac{\mu(2-\lambda)}{\mu-\lambda}, \end{aligned}

and hence

\displaystyle  \begin{aligned} \pi_0 &= \frac{\mu-\lambda}{\mu(2-\lambda)}\\ \pi_n &= \left(\frac{(\mu-\lambda)(\lambda+\mu-\lambda\mu)}{\mu^2(1-\lambda)(2-\lambda)}\right)\left(\frac{\lambda(1-\mu)}{\mu(1-\lambda)}\right)^{n-1},\ n\geqslant 1. \end{aligned}
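These expressions can be checked by truncating the transition matrix at a high level and iterating {\pi\mapsto\pi P}; a Python sketch (the parameters and truncation level are illustrative):

```python
# Numerical check of the stationary distribution via power iteration (a sketch).
import numpy as np

lam, mu, K = 0.2, 0.6, 200       # illustrative; K is the truncation level
D = lam + mu - lam * mu
P = np.zeros((K, K))
P[0, 1] = 1.0
for n in range(1, K - 1):
    P[n, n - 1] = mu * (1 - lam) / D
    P[n, n] = lam * mu / D
    P[n, n + 1] = lam * (1 - mu) / D
P[K - 1, K - 2] = mu * (1 - lam) / D    # reflecting boundary at the truncation
P[K - 1, K - 1] = 1 - P[K - 1, K - 2]

pi = np.full(K, 1.0 / K)
for _ in range(10_000):
    pi = pi @ P

print(pi[0], (mu - lam) / (mu * (2 - lam)))              # pi_0
print(pi[2] / pi[1], lam * (1 - mu) / (mu * (1 - lam)))  # geometric ratio
```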

The Volterra Operator

Here’s a proof that the Volterra operator is not normal which I thought was clever: Let {T:L^2([0,1])\rightarrow L^2([0,1])} be defined by

\displaystyle Tf(x) = \int_0^x f(t)\ \mathsf dt.

If {\|f\|_2=1}, then by the Cauchy-Schwarz inequality (together with {x\leqslant1}),

\displaystyle  \begin{aligned} \|Tf\|_2^2 &= \int_0^1 \left|\int_0^x f(t)\ \mathsf dt\right|^2 \mathsf dx\\ &\leqslant \int_0^1 \int_0^x |f(t)|^2\ \mathsf dt\ \mathsf dx\\ &\leqslant \int_0^1 \int_0^1 |f(t)|^2\ \mathsf dt\ \mathsf dx\\ &= \int_0^1 \|f\|_2^2 \ \mathsf dx\\ &= \|f\|_2^2, \end{aligned}

so {T} is a bounded operator. It follows that there exists a unique adjoint operator {T^*:L^2([0,1])\rightarrow L^2([0,1])} satisfying {\langle Tf, g\rangle = \langle f, T^*g\rangle} for all {f,g\in L^2([0,1])}. Since {[0,1]} has finite Lebesgue measure, {L^2([0,1])\subset L^1([0,1])}, and thus

\displaystyle \int\limits_{[0,1]^2}|f(t)g(x)|\ \mathsf d(x\times t) = \|f\|_1\|g\|_1<\infty.

It follows that {(x,t)\mapsto f(t)g(x)} is integrable on {[0,1]^2}, and hence by Fubini's theorem,

\displaystyle  \begin{aligned} \langle Tf, g\rangle &= \int_0^1 \left(\int_0^x f(t)\ \mathsf dt\right) g(x)\ \mathsf dx\\ &=\int\limits_{\{(x,t):\,0\leqslant t\leqslant x\leqslant 1\}}f(t)g(x)\ \mathsf d(x\times t)\\ &= \int_0^1\int_t^1 f(t)g(x)\ \mathsf dx\ \mathsf dt\\ &= \int_0^1 f(t)\left(\int_t^1 g(x)\ \mathsf dx\right) \mathsf dt\\ &= \left\langle f, \int_t^1 g(x)\ \mathsf dx\right\rangle, \end{aligned}

from which we see that {T^*f(x) = \int_x^1 f(t)\ \mathsf dt}.
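The adjoint formula can be checked numerically by discretizing {T} and {T^*} as cumulative sums on a grid; a sketch (the grid size is illustrative):

```python
# Discrete check of <Tf, g> = <f, T*g> with T*g(x) = integral of g over [x, 1].
import numpy as np

rng = np.random.default_rng(5)
m = 1000
f, g = rng.standard_normal(m), rng.standard_normal(m)

Tf = np.cumsum(f) / m                     # (Tf)(x) ~ integral of f over [0, x]
Tstar_g = np.cumsum(g[::-1])[::-1] / m    # (T*g)(x) ~ integral of g over [x, 1]

print(np.dot(Tf, g) / m, np.dot(f, Tstar_g) / m)  # equal up to rounding
```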

If {f\in L^2([0,1])} with {\|f\|_2=1}, then for any {x,y\in[0,1]}, Hölder’s inequality yields

\displaystyle  \begin{aligned} |Tf(x)-Tf(y)| &= \left|\int_y^x f(t)\ \mathsf dt \right|\\ &\leqslant \int_y^x 1\cdot|f(t)|\ \mathsf dt\\ &\leqslant \left(\int_y^x 1\ \mathsf dt\right)^{\frac12}\left(\int_y^x |f(t)|^2\ \mathsf dt\right)^{\frac12}\\ &\leqslant |x-y|^{\frac12}\left(\int_0^1 |f(t)|^2\ \mathsf dt\right)^{\frac12}\\ &= \|f\|_2|x-y|^{\frac12}, \end{aligned}

so that {Tf} is Hölder continuous with exponent {\frac12}. Moreover, the integral kernel of {T} is {K(x,t)=\chi_{(0,x)}(t)}, and

\displaystyle \int\limits_{[0,1]^2} |K(x,t)|^2\ \mathsf d(x\times t) = \int_0^1\int_0^x \ \mathsf dt\ \mathsf dx=\frac12<\infty,

so {K\in L^2([0,1]^2)}. It follows that {T} is a Hilbert-Schmidt operator and therefore compact. The Fredholm alternative (see \texttt{http://bit.ly/1Uh8BgP}) then implies that if {\lambda\in\sigma(T)} and {\lambda\ne 0}, then {\lambda} is an eigenvalue of {T}; that is, {Tf=\lambda f} for some nonzero {f\in L^2([0,1])}. Since {Tf\in C^0([0,1])} and {f=\frac1\lambda Tf}, we must have {f\in C^0([0,1])}. By the fundamental theorem of calculus, {Tf\in C^1([0,1])}, and so {f\in C^1([0,1])}. Differentiating both sides of the eigenvalue equation yields {f(x)=\lambda f'(x)}, and hence {f(x) = Ce^{\frac x\lambda}} for some {C\in\mathbb R}. Since {Tf(0)=0}, we have {\lambda f(0)=0}, so {f(0)=0} and therefore {C=0}, contradicting {f\ne 0}. It follows that {\sigma(T)=\{0\}}, so the spectral radius of {T}, {r(T)}, is zero.

If {A} is a bounded operator on a Hilbert space and {\|f\|=1} then

\displaystyle \|A^*f\|^2 = \langle A^*f, A^*f\rangle = \langle AA^*f, f\rangle\leqslant \|AA^*f\|

so

\displaystyle  \|A^*\|^2 \leqslant \|AA^*\|\leqslant \|A\|\|A^*\|

and hence {\|A^*\|\leqslant\|A\|}. By symmetry, {\|A\|\leqslant\|A^*\|}, whence {\|A^*\|=\|A\|}. It follows that {\|AA^*\|=\|A\|^2=\|A^*A\|}. Recall Gelfand’s formula for the spectral radius of a bounded operator {A}:

\displaystyle r(A) = \lim_{n\rightarrow\infty} \|A^n\|^{\frac1n}.

If {A} is a self-adjoint operator and {\|f\|=1} then

\displaystyle \|Af\|^2 = \langle Af, Af\rangle = \langle A^2f, f\rangle\leqslant\|A^2f\|\|f\|=\|A^2f\|

which implies {\|A^2\|=\|A\|^2}. By induction it follows that {\|A^{2^n}\|=\|A\|^{2^n}} for all {n} and hence

\displaystyle r(A) = \lim_{n\rightarrow\infty} \|A^{2^n}\|^{\frac1{2^n}} = \lim_{n\rightarrow\infty}\|A\|=\|A\|.

If {A} is normal, then {(A^*A)^n = (A^n)^*A^n}, so that {\|(A^*A)^n\|=\|A^n\|^2} for every {n}. Since {A^*A} is self-adjoint, {r(A^*A)=\|A^*A\|=\|A\|^2}; on the other hand, Gelfand's formula gives {r(A^*A)=\lim_{n\rightarrow\infty}\|A^n\|^{\frac2n}=r(A)^2}. Hence {r(A)=\|A\|}; that is, the spectral radius of a normal operator is equal to its operator norm.

Since the spectral radius of the Volterra operator {T} is zero and its operator norm is positive, we conclude that {T} is not a normal operator.
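All three ingredients are visible in a discretization: the strictly lower-triangular matrix below has spectral radius exactly zero, operator norm tending to {\frac2\pi} (the norm of the Volterra operator) as the grid is refined, and does not commute with its adjoint. A sketch; the grid size is illustrative.

```python
# Discretized Volterra operator: norm > 0, spectral radius 0, not normal (a sketch).
import numpy as np

m = 400
T = np.tril(np.ones((m, m)), -1) / m    # (Tf)(x_i) ~ (1/m) * sum_{j < i} f(x_j)

print(np.linalg.svd(T, compute_uv=False)[0], 2 / np.pi)  # operator norm -> 2/pi
print(np.max(np.abs(np.linalg.eigvals(T))))  # 0: strictly triangular matrix
print(np.linalg.norm(T.T @ T - T @ T.T))     # > 0, so T is not normal
```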