3 Markov Processes
3.1 Discrete-time Markov chains
3.1.1 Class structure
3.1.2 Hitting times and absorption probabilities
3.1.3 Recurrence and transience
3.1.4 Recurrence and transience of a random walk
3.1.5 Invariant distributions
3.1.6 Limiting behaviour
3.1.7 Ergodic theorem
3.2 Poisson process
3.2.1 Exponential distribution
3.2.2 Poisson process
3.3 Continuous-time Markov chains
The map
t −→ X_t(ω),
defined on the parameter set T, is called a realization, trajectory, sample path or sample function of the process.
Let {X_t; t ∈ T} be a real-valued stochastic process and {t_1 < · · · < t_n} ⊂ T; then the probability distribution P_{t_1,...,t_n} = P ◦ (X_{t_1}, ..., X_{t_n})^{-1} of the random vector (X_{t_1}, ..., X_{t_n}) is called a finite-dimensional marginal distribution of the process. A set of the form
{ω : X_{t_1}(ω) ∈ B_1, ..., X_{t_n}(ω) ∈ B_n},
for t_1 < · · · < t_n ∈ T and B_1, ..., B_n ∈ B(R), is called a cylinder set. Consider the σ-field F generated by the cylinder sets.
Theorem 1.1.1 Consider a family of probability measures
{P_{t_1,...,t_n}, t_1 < · · · < t_n, n ≥ 1, t_i ∈ T}
such that:
1. P_{t_1,...,t_n} is a probability on R^n.
2. (Consistency condition) If {t_{k_1} < · · · < t_{k_m}} ⊂ {t_1 < · · · < t_n}, then P_{t_{k_1},...,t_{k_m}} is the marginal of P_{t_1,...,t_n} corresponding to the indices k_1, ..., k_m.
Then there exists a unique probability P on F which has the family {P_{t_1,...,t_n}} as finite-dimensional marginal distributions.
1.2 Examples
A real-valued process {X_t, t ≥ 0} is called a second order process provided E(X_t^2) < ∞ for all t ≥ 0. The mean and the covariance function of a second order process {X_t, t ≥ 0} are defined by
m_X(t) = E(X_t),
Γ_X(s, t) = Cov(X_s, X_t) = E((X_s − m_X(s))(X_t − m_X(t))).
The variance of the process {X_t, t ≥ 0} is defined by
σ_X^2(t) = Γ_X(t, t) = Var(X_t).
Example 1.2.1 Let X and Y be independent random variables. Consider the
stochastic process with parameter t ∈ [0, ∞)
Xt = tX + Y.
The sample paths of this process are lines with random coefficients. The finite-
dimensional marginal distributions are given by
P(X_{t_1} ≤ x_1, ..., X_{t_n} ≤ x_n) = ∫_R F_X( min_{1≤i≤n} (x_i − y)/t_i ) P_Y(dy).
Example 1.2.2 Consider the stochastic process
X_t = A cos(ϕ + λt),
where A and ϕ are independent random variables such that E(A) = 0, E(A^2) < ∞ and ϕ is uniformly distributed on [0, 2π]. This is a second order process with
m_X(t) = 0,
Γ_X(s, t) = (1/2) E(A^2) cos λ(t − s).
P{X_t = Y_t} = 1.
For instance, take X_t = 0 for all t and
Y_t = { 0 if ξ ≠ t,
        1 if ξ = t.
Definition 1.3.2 Two stochastic processes {X_t, t ∈ T} and {Y_t, t ∈ T} are said to be indistinguishable if X_·(ω) = Y_·(ω) for all ω ∉ N, with P(N) = 0.
If two stochastic processes have right-continuous sample paths and are equivalent, then they are indistinguishable.
Two discrete-time stochastic processes which are equivalent are also indistinguishable.
lim_{s→t} E(|X_t − X_s|^p) = 0
for all s, t ∈ T. Then there exists a version of the process {X_t, t ∈ T} with continuous sample paths.
Condition (1.1) also provides some information about the modulus of continuity of the sample paths of the process: m_T(ω, δ) := sup_{s,t∈T, |t−s|<δ} |X_t(ω) − X_s(ω)|. In particular, if T = [a, b], for each ε > 0 there exists a random variable G_ε such that, with probability one,
m_T(ω, δ) ≤ G_ε δ^{(α−1)/p − ε}.
Remark 1.5.1 If T = [0, ∞) , from the definition, it follows that all the
marginal distributions are determined by X0 and Xt − Xs , s < t ∈ T . The
most important processes with independent increments are the Poisson process
and the Wiener process (or Brownian motion).
2. Markov processes.
and to assume that Y and Z are, respectively, sequences of i.i.d. random variables and that Y and Z are independent.
we have that
|t − s| = E((ξ_t − ξ_s)^2) = E(ξ_t^2) + E(ξ_s^2) − 2 Cov(ξ_t, ξ_s),
and
Cov(ξ_t, ξ_s) = (1/2)(t + s − |t − s|) = min(s, t).
1.7 Problems
1. Let Ω = [0, 1], F = B([0, 1]) and P the uniform probability measure on Ω. Consider the process X_t(ω) = tω, t ∈ [0, 1]. Find the sample paths and the one- and two-dimensional distributions.
2. Let X_t = A cos αt + B sin αt and Y_t = R cos(αt + Θ), where α > 0, A and B are independent random variables with N(0, σ^2) distribution, and R and Θ are also independent random variables, with Θ ∼ Unif(0, 2π) and R having density
f_R(x) = (x/σ^2) exp(−x^2/(2σ^2)) 1_{(0,+∞)}(x).
Show that the processes {X_t}_{t≥0} and {Y_t}_{t≥0} have the same law.
3. Let X_1 and X_2 be independent random variables defined on the same probability space (Ω, F, P) and with the same standard normal distribution. Let {Y_t}_{t≥0} be the stochastic process defined by
Y_t = (X_1 + X_2) t.
Determine the finite-dimensional distributions of the process. If A is the set of non-negative sample paths of the process, compute P(A).
4. Let {Y_t}_{t≥0} be the stochastic process defined by
Y_t = X + αt, α > 1,
where X is a random variable with law N(0, 1). Let D ⊂ [0, +∞) be a finite or countably infinite set. Determine:
Y_t = α sin(βt + X),
Example 2.1.1 Consider the particular case where the σ-field B is generated by a finite partition {B_1, ..., B_m}. In this case, the conditional expectation E(X|B) is a discrete random variable that takes the constant value E(X|B_j) on each set B_j:
E(X|B) = Σ_{j=1}^{m} ( E(X 1_{B_j}) / P(B_j) ) 1_{B_j}.
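As a quick illustration, the partition formula above can be evaluated numerically. The sketch below (with made-up data, not from the text) computes E(X|B) on a finite sample space with uniform probability, where it reduces to averaging X over each block of the partition.

```python
import numpy as np

def cond_expectation(x, labels):
    """E(X | sigma(partition)) on a finite sample space with uniform P.

    x[k] is X(omega_k); labels[k] says which block B_j contains omega_k.
    Returns E(X|B)(omega_k) = E(X 1_{B_j}) / P(B_j) on omega_k's block.
    """
    x, labels = np.asarray(x, float), np.asarray(labels)
    out = np.empty_like(x)
    for j in np.unique(labels):
        block = labels == j
        # E(X 1_{B_j}) / P(B_j) reduces to the average of X over B_j
        out[block] = x[block].mean()
    return out

x = [1.0, 3.0, 2.0, 6.0]
labels = [0, 0, 1, 1]                # partition {B_0, B_1}
print(cond_expectation(x, labels))   # → [2. 2. 4. 4.]
```

Note that averaging the block means with weights P(B_j) recovers the overall mean, which is exactly property 2 below, E(E(X|B)) = E(X).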
Here are some properties of the conditional expectations in the general case:
2. A random variable and its conditional expectation have the same expecta-
tion: E(E(X|B)) = E(X). This follows from property (ii) taking A = Ω.
3. If X and B are independent, then E(X|B) = E(X). In fact, the constant
E(X) is clearly B-measurable, and for all A ∈ B we have E(X1A ) =
E(X)E(1A ) = E(E(X)1A ).
4. If X is B-measurable, then E(X|B) = X.
5. If Y is a bounded and B-measurable random variable, then E(Y X|B) =
Y E(X|B). In fact, the random variable Y E(X|B) is integrable and B-
measurable, and for all A ∈ B we have
E(E(X|B)Y 1A ) = E(XY 1A ),
where the equality follows from (2.1). This property means that B-measurable
random variables behave as constants and can be factorized out of the
conditional expectation with respect to B. This property holds if X, Y ∈
L2 (Ω).
6. Given two σ-fields C ⊂ B, then E(E(X|B)|C) = E(X|C).
7. Consider two random variables X and Z, such that Z is B-measurable and X is independent of B. Consider a measurable function h(x, z) such that the composition h(X, Z) is an integrable random variable. Then we have E(h(X, Z)|B) = E(h(X, z))|_{z=Z}. That is, we first compute the expectation E(h(X, z)) for any fixed value z of the random variable Z and, afterwards, we replace z by Z.
Notice that the conditional expectation E(X|Y_1, ..., Y_m) is a function g(Y_1, ..., Y_m) of the variables Y_1, ..., Y_m, where
g(y_1, ..., y_m) = ∫_R x p(dx|y_1, ..., y_m).
In particular, if the random variables X, Y_1, ..., Y_m have a joint density f(x, y_1, ..., y_m), then the conditional distribution has the density
f(x|y_1, ..., y_m) = f(x, y_1, ..., y_m) / ∫_{−∞}^{+∞} f(x, y_1, ..., y_m) dx,
and
E(X|Y_1, ..., Y_m) = ∫_{−∞}^{+∞} x f(x|Y_1, ..., Y_m) dx.
⟨Z, Y⟩ = E(ZY).
Then the set of square integrable and B-measurable random variables, denoted by L^2(Ω, B, P), is a closed subspace of L^2(Ω, F, P). Hence, given a random variable X such that E(X^2) < ∞, the conditional expectation E(X|B) is the projection of X on the subspace L^2(Ω, B, P). In fact, we have:
E(Mn+1 |Fn ) = Mn .
E(∆Mn |Fn−1 ) = 0,
Example 2.2.1 Suppose that {ξ_n, n ≥ 1} are i.i.d. centered random variables. Set M_0 = 0 and M_n = ξ_1 + ... + ξ_n for n ≥ 1. Then M_n is a martingale w.r.t. (F_n), with F_n = σ(ξ_1, ..., ξ_n) for n ≥ 1, and F_0 = {∅, Ω}. In fact,
Example 2.2.2 Suppose that {ξ_n, n ≥ 1} are i.i.d. random variables such that P(ξ_n = 1) = 1 − P(ξ_n = −1) = p, where 0 < p < 1. Then M_n = ((1 − p)/p)^{ξ_1+...+ξ_n} is a martingale.
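The martingale property in Example 2.2.2 reduces to a one-step identity: E(M_{n+1}|F_n) = M_n · E(((1−p)/p)^{ξ_{n+1}}), so it suffices that the one-step factor E(((1−p)/p)^{ξ}) equals 1. A minimal numerical check of that factor:

```python
# One-step factor E(((1-p)/p)^xi) with P(xi=1)=p, P(xi=-1)=1-p.
# It equals p*((1-p)/p) + (1-p)*(p/(1-p)) = (1-p) + p = 1 for every p in (0,1).
def step_factor(p):
    r = (1 - p) / p
    return p * r + (1 - p) * (1 / r)

for p in (0.1, 0.3, 0.5, 0.9):
    assert abs(step_factor(p) - 1.0) < 1e-12
print("martingale factor is 1 for all tested p")
```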
In the two previous examples, Fn = σ(M0 , ..., Mn ), for all n ≥ 0. That is,
(Fn ) is the filtration generated by the process (Mn ). Usually, when the filtration
is not mentioned, we will take Fn = σ(M0 , ..., Mn ), for all n ≥ 0. This is always
possible due to the following result:
E(Mm |Fn ) = Mn .
In fact, for m > n + 1
H_n = { 2^{n−1} if ξ_1 = · · · = ξ_{n−1} = −1,
        0 otherwise,
for n > 1,
{T = n} ∈ Fn , for all n ≥ 0.
Proof. {T ≤ n} = ∪_{j=0}^{n} {T = j} and
{T = n} = {T ≤ n} ∩ {T ≤ n − 1}^c.
Remark 2.3.1 We could consider random times with a certain delay, for instance with {T ≤ n − 1} ∈ F_n; or we can even consider predictable times: {T ≤ n} ∈ F_{n−1}.
Remark 2.3.2 The extension of the notion of stopping time to continuous time
is evident: the (generalized) random variable
T : Ω −→ [0, ∞]
Ta := inf{t > 0 : Xt = a}
is a stopping time because
{S ∨ T ≤ t} = {S ≤ t} ∩ {T ≤ t} ,
{S ∧ T ≤ t} = {S ≤ t} ∪ {T ≤ t} .
A ∩ {T ≤ t} = A ∩ {S ≤ t} ∩ {T ≤ t} ∈ Ft
for all t ≥ 0.
Remark 2.3.3 The σ-field F_T has the meaning of the information available up to the stopping time T. We observe what happens until time T, time T included.
{XT ∈ B} ∩ {T = n} = {Xn ∈ B} ∩ {T = n} ∈ Fn
for any subset B of the state space (Borel set if the state space is R).
As a consequence, if {M_n} is a (sub)martingale, the stopped process M_n^T := M_{T∧n} will be a (sub)martingale.
E(MT |FS ) ≥ MS
Proof. We give the proof only in the martingale case. Notice first that M_T is integrable because
|M_T| ≤ Σ_{n=0}^{m} |M_n|.
Moreover, the random variables Hn are nonnegative and bounded by one. There-
fore by Proposition 2.2.2, (H · M )n is a martingale. We have
(H · M )0 = M0
(H · M )m = M0 + 1A (MT − MS ).
T = inf{n ≥ 0, Mn ≥ λ} ∧ N.
Remark 2.4.1 Note that the last inequality is a generalization of the Chebyshev
inequality.
XN = YN
Xn = max(Yn , E(Xn+1 |Fn )), 0 ≤ n ≤ N − 1,
E(Xn+1 |Fn ) ≤ Xn .
ν = inf{n ≥ 0, Xn = Yn }
Proof. We have
X_n^ν = X_0 + Σ_{j=1}^{n} 1_{{j≤ν}} (X_j − X_{j−1}),
therefore
X_{n+1}^ν − X_n^ν = 1_{{n+1≤ν}} (X_{n+1} − X_n)
and
E(X_{n+1}^ν − X_n^ν | F_n) = 1_{{n+1≤ν}} E(X_{n+1} − X_n | F_n)
= { 0 if ν ≤ n, since the indicator vanishes,
    0 if ν > n, since in such a case X_n = E(X_{n+1}|F_n). }
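The backward recursion X_N = Y_N, X_n = max(Y_n, E(X_{n+1}|F_n)) is easy to compute when Y_n = g(S_n) for a simple symmetric random walk S_n, since then E(X_{n+1}|F_n) = (X_{n+1}(S_n + 1) + X_{n+1}(S_n − 1))/2. A sketch under that hypothetical model (the payoff g is illustrative, not from the text):

```python
def snell_envelope(g, N):
    """Snell envelope of Y_n = g(S_n) for a symmetric +-1 walk, S_0 = 0.

    X[n] maps each reachable state s at time n (s = -n..n, step 2) to X_n(s).
    """
    X = {N: {s: g(s) for s in range(-N, N + 1, 2)}}
    for n in range(N - 1, -1, -1):
        X[n] = {}
        for s in range(-n, n + 1, 2):
            cont = 0.5 * (X[n + 1][s + 1] + X[n + 1][s - 1])  # E(X_{n+1}|F_n)
            X[n][s] = max(g(s), cont)
    return X

# Payoff g(s) = max(s, 0); the envelope dominates the payoff everywhere.
X = snell_envelope(lambda s: max(s, 0), N=6)
assert all(X[n][s] >= max(s, 0) for n in X for s in X[n])
print(X[0][0])   # value of optimally stopping the payoff
```

Since this particular payoff is convex in S_n, Y_n is a submartingale and the envelope coincides with the continuation value everywhere, so ν = N is optimal here.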
Corollary 2.5.1
On the other hand, (X_n) is a supermartingale, and then so is (X_n^τ) for all τ ∈ τ_{0,N}, so
X_0 ≥ E(X_N^τ | F_0) = E(X_τ | F_0) ≥ E(Y_τ | F_0),
therefore
E(Y_ν | F_0) ≥ E(Y_τ | F_0), ∀τ ∈ τ_{0,N},
where
ν_n = inf{j ≥ n, X_j = Y_j}.
Definition 2.5.1 A stopping time ν is said to be optimal for the sequence (Y_n) if
E(Y_ν | F_0) = sup_{τ∈τ_{0,N}} E(Y_τ | F_0).
Conversely, we know, by the previous corollary, that X_0 = sup_{τ∈τ_{0,N}} E(Y_τ | F_0). Then, if τ is optimal,
where the last inequality is due to the fact that (Xnτ ) is a supermartingale. So,
we have
E(Xτ − Yτ |F0 ) = 0
and since Xτ − Yτ ≥ 0, we conclude that Xτ = Yτ .
Now we can also see that (X_n^τ) is a martingale. We know that it is a supermartingale, so
X_n^τ ≥ E(X_N^τ | F_n) = E(X_τ | F_n).
X_n = M_n − A_n
and to identify
M_n = Σ_{j=1}^{n} (X_j − E(X_j | F_{j−1})) + X_0,
A_n = Σ_{j=1}^{n} (X_{j−1} − E(X_j | F_{j−1})).
we have
M_n − M'_n = A_n − A'_n, 0 ≤ n ≤ N,
but then, since (M_n) and (M'_n) are martingales and (A_n) and (A'_n) are predictable, it turns out that
A_N − A'_N = A_{N−1} − A'_{N−1} = ... = A_0 − A'_0 = 0,
since by hypothesis A_0 = A'_0 = 0.
This decomposition is known as the Doob decomposition.
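The two formulas above can be checked on a concrete path. In the sketch below (a toy model, not from the text) X_n = −S_n² for a symmetric ±1 walk, so that E(X_j|F_{j−1}) = −(S_{j−1}² + 1); the code recovers M_n and A_n and verifies X_n = M_n − A_n, with A_n predictable and non-decreasing.

```python
# Doob decomposition of the supermartingale X_n = -S_n^2 along one path,
# where S_n is a symmetric +-1 random walk; E(X_j | F_{j-1}) = -(S_{j-1}^2 + 1).
steps = [1, 1, -1, 1, -1, -1, 1, 1]         # a fixed sample path of the xi_j
S = [0]
for xi in steps:
    S.append(S[-1] + xi)
X = [-s * s for s in S]

M = [X[0]]                                  # M_n = sum_j (X_j - E(X_j|F_{j-1})) + X_0
A = [0.0]                                   # A_n = sum_j (X_{j-1} - E(X_j|F_{j-1}))
for j in range(1, len(X)):
    pred = -(S[j - 1] ** 2 + 1)             # E(X_j | F_{j-1})
    M.append(M[-1] + X[j] - pred)
    A.append(A[-1] + X[j - 1] - pred)

assert all(abs(X[n] - (M[n] - A[n])) < 1e-12 for n in range(len(X)))
assert all(A[n] <= A[n + 1] for n in range(len(A) - 1))
print(A)   # A_n = n here, the increasing predictable part
```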
Proposition 2.6.2 The biggest optimal stopping time for (Y_n) is given by
ν_max = { N if A_N = 0,
          inf{n, A_{n+1} > 0} if A_N > 0. }
X_{ν_max} = Y_{ν_max}
X_{ν_max} = Σ_{j=1}^{N−1} 1_{{ν_max=j}} X_j + 1_{{ν_max=N}} X_N
= Σ_{j=1}^{N−1} 1_{{ν_max=j}} max(Y_j, E(X_{j+1}|F_j)) + 1_{{ν_max=N}} Y_N,
then we have:
Lemma 2.7.1 Let {M_n}_{n≥0} be a supermartingale and let U_N[a, b](ω) be the number of upcrossings of [a, b] by time N. Then
Corollary 2.7.1 Let {M_n}_{n≥0} be a supermartingale such that sup_n E(|M_n|) < ∞. Let a < b ∈ R and U_∞[a, b] := lim_{N→∞} U_N[a, b]; then
so that
P(U_∞[a, b] = ∞) = 0,
and taking the limit as N → ∞ and using the monotone convergence theorem we obtain the result.
Theorem 2.7.1 Let {Mn }n≥0 be a supermartingale such that supn E(|Mn |) <
∞. Then almost surely M∞ := limn→∞ Mn exists and is finite.
Proof. Let Λ = {ω : M_n(ω) does not converge to a limit in [−∞, +∞]}; then
Λ = ∪_{a<b∈Q} {ω : lim inf M_n(ω) < a < b < lim sup M_n(ω)} = ∪_{a<b∈Q} Λ_{ab} (say),
but
Λ_{ab} ⊂ {U_∞[a, b] = ∞},
and by the previous corollary P(Λ_{ab}) = 0, and then P(Λ) = 0. So M_∞ exists in [−∞, +∞] a.s., but by Fatou's lemma
E(|M_∞|) = E(lim inf |M_n|) ≤ lim inf E(|M_n|) ≤ sup_n E(|M_n|) < ∞,
so M_∞ is finite a.s.
Example 2.7.1 Suppose that {ξ_n, n ≥ 1} are i.i.d. centered random variables with distribution N(0, 2σ^2). Set M_0 = 1 and
M_n = exp( Σ_{j=1}^{n} ξ_j − nσ^2 ).
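This martingale has E(M_n) = 1 for every n (since E(e^{ξ−σ²}) = 1 when ξ ∼ N(0, 2σ²)), yet M_n → 0 a.s., because Σξ_j − nσ² → −∞ by the law of large numbers. A quick seeded simulation illustrating both facts:

```python
import random, math

rng = random.Random(0)
sigma = 1.0

# E(exp(xi - sigma^2)) = 1 for xi ~ N(0, 2 sigma^2): Monte Carlo check.
samples = [math.exp(rng.gauss(0, math.sqrt(2) * sigma) - sigma**2)
           for _ in range(100_000)]
mean = sum(samples) / len(samples)
assert abs(mean - 1.0) < 0.2

# One long path: M_n collapses to 0 even though its expectation stays 1.
log_M = 0.0
for _ in range(2000):
    log_M += rng.gauss(0, math.sqrt(2) * sigma) - sigma**2
print(mean, log_M)   # log M_2000 is hugely negative, so M_2000 ≈ 0
assert log_M < -100
```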
Theorem 2.7.2 Let {Mn }n≥0 be a martingale, such that supn E(Mn2 ) < ∞,
then Mn → M∞ a.s. and in L2 .
so
Σ_{k=1}^{∞} E((M_k − M_{k−1})^2) < ∞,
and when r → ∞, we have
E((M_∞ − M_n)^2) ≤ Σ_{k=n+1}^{∞} E((M_k − M_{k−1})^2),
so
lim_{n→∞} E((M_∞ − M_n)^2) = 0.
Chapter 3
Markov Processes
Definition 3.1.1 We say that (X_n)_{n≥0} is a Markov chain with initial distribution λ := (λ_i, i ∈ I) and transition matrix P := (p_{ij}, i, j ∈ I) if
(i) P{X_0 = i} = λ_i, i ∈ I;
(ii) for all n ≥ 0, P(X_{n+1} = i_{n+1} | X_0 = i_0, X_1 = i_1, ..., X_n = i_n) = p_{i_n i_{n+1}},
where I is a countable set. We say that (X_n)_{n≥0} is Markov(λ, P) for short.
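A Markov(λ, P) chain on a finite I can be simulated directly from the definition: draw X_0 from λ, then repeatedly draw X_{n+1} from row X_n of P. A small sketch (the particular matrices are illustrative, not from the text):

```python
import random

def simulate_chain(lam, P, n_steps, rng):
    """Simulate Markov(lambda, P) on states {0, ..., len(lam)-1}."""
    path = [rng.choices(range(len(lam)), weights=lam)[0]]
    for _ in range(n_steps):
        i = path[-1]
        path.append(rng.choices(range(len(P[i])), weights=P[i])[0])
    return path

# Deterministic sanity check: a two-state chain that always switches state,
# started in state 0, must alternate 0, 1, 0, 1, ...
P = [[0.0, 1.0], [1.0, 0.0]]
path = simulate_chain([1.0, 0.0], P, 6, random.Random(1))
print(path)   # → [0, 1, 0, 1, 0, 1, 0]
```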
Proof.
P(X_{n+m} = j | X_n = i) = P(X_{n+m} = j, X_n = i) / P(X_n = i)
= [ Σ_{i_0∈I} · · · Σ_{i_{n−1}∈I} Σ_{i_{n+1}∈I} · · · Σ_{i_{n+m−1}∈I} P(X_0 = i_0, ..., X_n = i) p_{i i_{n+1}} · · · p_{i_{n+m−1} j} ] / [ Σ_{i_0∈I} · · · Σ_{i_{n−1}∈I} P(X_0 = i_0, ..., X_n = i) ]
= Σ_{i_{n+1}∈I} · · · Σ_{i_{n+m−1}∈I} p_{i i_{n+1}} · · · p_{i_{n+m−1} j}
= [ Σ_{i_{n+1}∈I} · · · Σ_{i_{n+m−1}∈I} P(X_0 = i, X_1 = i_{n+1}, ..., X_{m−1} = i_{n+m−1}, X_m = j) ] / P(X_0 = i)
= P(X_0 = i, X_m = j) / P(X_0 = i) = P(X_m = j | X_0 = i).
Proof. (i):
P{X_n = j} = Σ_{i_0∈I} Σ_{i_1∈I} · · · Σ_{i_{n−1}∈I} P(X_0 = i_0, ..., X_n = j)
= Σ_{i_0∈I} Σ_{i_1∈I} · · · Σ_{i_{n−1}∈I} P(X_n = j | X_{n−1} = i_{n−1}, ..., X_0 = i_0) P(X_{n−1} = i_{n−1}, ..., X_0 = i_0)
= Σ_{i_0∈I} Σ_{i_1∈I} · · · Σ_{i_{n−1}∈I} p_{i_{n−1} j} P(X_{n−1} = i_{n−1}, ..., X_0 = i_0)
= Σ_{i_0∈I} Σ_{i_1∈I} · · · Σ_{i_{n−1}∈I} p_{i_{n−2} i_{n−1}} p_{i_{n−1} j} P(X_{n−2} = i_{n−2}, ..., X_0 = i_0)
= ... = Σ_{i_0∈I} Σ_{i_1∈I} · · · Σ_{i_{n−1}∈I} λ_{i_0} p_{i_0 i_1} · · · p_{i_{n−2} i_{n−1}} p_{i_{n−1} j} = (λP^n)_j.
(ii): P(X_n = j | X_0 = i) = Σ_{i_1∈I} · · · Σ_{i_{n−1}∈I} p_{i i_1} · · · p_{i_{n−1} j} = (P^n)_{ij} = p_{ij}^{(n)}.
Example 3.1.1 The most general two-state chain has transition matrix
P = ( 1−α   α
       β   1−β ).
If we want to calculate p_{11}^{(n)}, we can use the relation P^{n+1} = P^n P; then
p_{11}^{(n+1)} = p_{11}^{(n)} (1 − α) + p_{12}^{(n)} β.
We know that
p_{11}^{(n)} + p_{12}^{(n)} = P_1(X_n = 1 or 2) = 1,
so
p_{11}^{(n+1)} = p_{11}^{(n)} (1 − α − β) + β, n ≥ 0,
with p_{11}^{(0)} = 1. This has a unique solution:
p_{11}^{(n)} = { β/(α+β) + (α/(α+β)) (1 − α − β)^n for α + β > 0,
                 1 for α + β = 0. }
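The closed form can be checked against a direct matrix power; a sketch with numpy (the particular values of α and β are arbitrary):

```python
import numpy as np

alpha, beta = 0.3, 0.2
P = np.array([[1 - alpha, alpha],
              [beta, 1 - beta]])

for n in range(30):
    exact = beta / (alpha + beta) + (alpha / (alpha + beta)) * (1 - alpha - beta) ** n
    p11_n = np.linalg.matrix_power(P, n)[0, 0]   # p_11^(n), states indexed from 0
    assert abs(p11_n - exact) < 1e-12
print("closed form matches P^n for n = 0..29")
```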
(i) i → j;
(ii) p_{i_0 i_1} · · · p_{i_{n−1} i_n} > 0 for some states i_0, i_1, ..., i_n, with i_0 = i and i_n = j;
(iii) p_{ij}^{(n)} > 0 for some n ≥ 0.
Proof. (i)⇔(iii):
p_{ij}^{(n)} ≤ P_i(X_n = j for some n ≥ 0) ≤ Σ_{n=0}^{∞} p_{ij}^{(n)}.
(ii)⇔(iii):
p_{ij}^{(n)} = Σ_{i_1,i_2,...,i_{n−1}} p_{i i_1} · · · p_{i_{n−1} j}.
i ∈ C, i → j imply j ∈ C.
the communicating classes associated with it are: {1, 2, 3}, {4} and {5, 6}, with
only {5, 6} being closed.
τ^A = inf{n ≥ 0, X_n ∈ A},
h_i^A := P_i(τ^A < ∞).
Proposition 3.1.4
E_i(τ^A) = Σ_{1≤n≤∞} P_i(τ^A ≥ n).
Proof.
P_i(τ^A ≥ n) = Σ_{n≤m≤∞} P_i(τ^A = m),
then
Σ_{1≤n≤∞} P_i(τ^A ≥ n) = Σ_{1≤n≤∞} Σ_{n≤m≤∞} P_i(τ^A = m)
= Σ_{1≤m≤∞} Σ_{1≤n≤m} P_i(τ^A = m)
= Σ_{1≤m≤∞} m P_i(τ^A = m).
Then
h_i^A = Σ_{j∈I} P_i(τ^A < ∞, X_1 = j)
= Σ_{j∈I} P_i(τ^A < ∞ | X_1 = j) P_i(X_1 = j)
= Σ_{j∈I} P_j(τ^A < ∞) P_i(X_1 = j) = Σ_{j∈I} p_{ij} h_j^A.
= ... = P_i(τ^A ≥ 1) + ... + P_i(τ^A ≥ n) + Σ_{j_1∉A} · · · Σ_{j_n∉A} p_{i j_1} · · · p_{j_{n−1} j_n} y_{j_n},
and
y_i ≥ Σ_{1≤n≤∞} P_i(τ^A ≥ n).
p_{00} = 1,
p_{i,i−1} = q > 0, p_{i,i+1} = p = 1 − q > 0, for i = 1, 2, ....
Imagine you enter the casino with 1 euro; at each play you win 1 euro with probability p or lose it with probability q. The resources of the casino are regarded as infinite, so there is no upper bound to your fortune. But what is the probability that you finish broke?
Set h_i = P_i(τ^{{0}} < ∞); then h_i is the minimal solution of the system
h_0 = 1,
h_i = q h_{i−1} + p h_{i+1}, i = 1, 2, ...,
h_i = A + Bi
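The minimal solution is h_i = min(1, (q/p)^i). One way to see this numerically (a sketch, not from the text) is to truncate the state space at a large N with boundary condition h_N = 0, solve the resulting tridiagonal linear system, and observe convergence to (q/p)^i when p > q:

```python
import numpy as np

p, q, N = 0.6, 0.4, 200
# Truncated system: h_0 = 1, h_i = q h_{i-1} + p h_{i+1} (1 <= i < N), h_N = 0.
A = np.zeros((N + 1, N + 1))
b = np.zeros(N + 1)
A[0, 0] = 1.0; b[0] = 1.0
A[N, N] = 1.0; b[N] = 0.0
for i in range(1, N):
    A[i, i] = 1.0
    A[i, i - 1] = -q
    A[i, i + 1] = -p
h = np.linalg.solve(A, b)

for i in range(1, 6):
    assert abs(h[i] - (q / p) ** i) < 1e-10   # minimal solution (q/p)^i
print(h[1])   # ≈ 2/3: ruin is not certain when p > q
```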
Remark 3.1.2 Note also that if τ is a finite stopping time associated to (Xn )n≥0
and A ∈ Fτ then for all m ≥ 0
P (Xτ +1 = j1 , Xτ +2 = j2 , ..., Xτ +m = jm |A, Xτ = i)
= P (X1 = j1 , X2 = j2 , ..., Xm = jm |X0 = i).
In particular assume that
τ0 = inf{n ≥ 0, Xn ∈ J ⊂ I}
and for m = 0, 1, 2, ...
τm+1 = inf{n > τm , Xn ∈ J ⊂ I},
then
P(X_{τ_{m+1}} = j | X_{τ_0} = i_0, X_{τ_1} = i_1, ..., X_{τ_m} = i) = P(X_{τ_1} = j | X_{τ_0} = i),
where for i, j ∈ J
P(X_{τ_1} = j | X_{τ_0} = i) = P_i(X_n = j for some n ≥ 1)
P(δ_r^{(i)} = n | τ_{r−1} < ∞, A) = P(τ^{(i)} = n | X_0 = i).
Note that
E_i(V_i) = Σ_{n=0}^{∞} E_i(1_{{X_n=i}}) = Σ_{n=0}^{∞} P_i(X_n = i) = Σ_{n=0}^{∞} p_{ii}^{(n)}.
Lemma 3.1.2 For r ≥ 0 we have P_i(V_i > r) = f_i^r (we assume 0^0 = 1).
Proof.
P_i(V_i = ∞) = lim_{n→∞} P_i(V_i > n) = lim_{n→∞} f_i^n;
then, if f_i := P_i(τ^{(i)} < ∞) = 1, we get P_i(V_i = ∞) = 1, so i is recurrent and
Σ_{n=0}^{∞} p_{ii}^{(n)} = E_i(V_i) = ∞.
Theorem 3.1.2 Let C be a communicating class. Then either all states in C are transient or all are recurrent.
then
Σ_{r=0}^{∞} p_{jj}^{(r)} ≤ (1 / (p_{ij}^{(n)} p_{ji}^{(m)})) Σ_{r=0}^{∞} p_{ii}^{(n+m+r)} < ∞,
so by the previous theorem j is also transient.
Proof. Let C be a class that is not closed. Then there exist i ∈ C, j ∉ C and m ≥ 1 such that
P_i(X_m = j) > 0.
Since we have that
Proof. Suppose that C is closed and that (X_n)_{n≥0} starts in C. Then for some i ∈ C,
P(X_n = i for infinitely many n) > 0,
since the class is finite. Let τ = inf{n ≥ 0, X_n = i}; then
By Stirling's formula,
n! ∼ √(2πn) n^n e^{−n}, as n → ∞,
so
p_{00}^{(2n)} = ((2n)!/(n!)^2) (pq)^n ∼ (4pq)^n / (√π √n), as n → ∞.
If p = q = 1/2, then p_{00}^{(2n)} ∼ 1/(√π √n),
and since
Σ_{n=1}^{∞} 1/√n = ∞,
we have that
Σ_{n=0}^{∞} p_{00}^{(n)} = ∞,
so the random walk is recurrent. If, on the contrary, p ≠ q, then 4pq = r < 1 and p_{00}^{(2n)} ∼ (1/√π)(r^n/√n); now since
Σ_{n=1}^{∞} r^n/√n < ∞,
we have that
Σ_{n=0}^{∞} p_{00}^{(n)} < ∞,
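The exact return probabilities p_{00}^{(2n)} = C(2n, n)(pq)^n and the Stirling asymptotics can be compared directly; the sketch below checks the asymptotic ratio in the symmetric case and the geometric decay rate r = 4pq for a biased walk:

```python
import math

def p00(n, p):
    """Exact return probability p_00^(2n) = C(2n, n) (p q)^n."""
    q = 1 - p
    return math.comb(2 * n, n) * (p * q) ** n

# Symmetric case: p00^(2n) ~ 1/(sqrt(pi) sqrt(n)); the terms sum like 1/sqrt(n).
n = 500
ratio = p00(n, 0.5) / (1 / (math.sqrt(math.pi) * math.sqrt(n)))
assert abs(ratio - 1) < 0.01

# Biased case: p00^(2n) ~ r^n / (sqrt(pi) sqrt(n)) with r = 4pq < 1, summable.
r = 4 * 0.6 * 0.4
ratio_b = p00(n, 0.6) / (r ** n / (math.sqrt(math.pi) * math.sqrt(n)))
assert abs(ratio_b - 1) < 0.01
print(ratio, ratio_b)
```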
λP = λ.
Theorem 3.1.5 Let (X_n)_{n≥0} be Markov(λ, P) and suppose that λ is invariant; then (X_{m+n})_{n≥0} is also Markov(λ, P).
Proof. By the Markov property, (X_{m+n})_{n≥0} has the same transition matrix, with initial distribution P(X_m = i) = (λP^m)_i = λ_i.
Proof. We have
Σ_{j∈I} π_j = Σ_{j∈I} lim_{n→∞} p_{ij}^{(n)} = lim_{n→∞} Σ_{j∈I} p_{ij}^{(n)} = 1,
and
π_j = lim_{n→∞} p_{ij}^{(n)} = lim_{n→∞} Σ_{k∈I} p_{ik}^{(n−1)} p_{kj} = Σ_{k∈I} lim_{n→∞} p_{ik}^{(n−1)} p_{kj} = Σ_{k∈I} π_k p_{kj}.
Example 3.1.4 Consider the two-state Markov chain with transition matrix
P = ( 1−α   α
       β   1−β ),
and we obtain
(λ_1, λ_2) = (β/(α+β), α/(α+β)).
Note also that, for the case α = β = 1, we do not have convergence of P^n.
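An invariant distribution of a finite chain can be found by solving λP = λ together with Σ_i λ_i = 1; a sketch using a least-squares solve (the values of α and β are arbitrary):

```python
import numpy as np

def stationary(P):
    """Solve pi P = pi with sum(pi) = 1 for a finite transition matrix P."""
    n = P.shape[0]
    # Stack the balance equations pi (P - I) = 0 with the normalization row.
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1); b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

alpha, beta = 0.3, 0.2
P = np.array([[1 - alpha, alpha], [beta, 1 - beta]])
pi = stationary(P)
assert np.allclose(pi, [beta / (alpha + beta), alpha / (alpha + beta)])
print(pi)   # ≈ [0.4, 0.6]
```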
Proof. (i) is obvious. (ii) Since P is recurrent, under P_k, τ^{(k)} < ∞ and X_0 = X_{τ^{(k)}} = k, so
γ_j^k = E_k( Σ_{n=1}^{τ^{(k)}} 1_{{X_n=j}} ) = E_k( Σ_{n=1}^{∞} 1_{{X_n=j}} 1_{{n≤τ^{(k)}}} )
= Σ_{n=1}^{∞} P_k(X_n = j, n ≤ τ^{(k)}) = Σ_{i∈I} Σ_{n=1}^{∞} P_k(X_{n−1} = i, X_n = j, n ≤ τ^{(k)})
= Σ_{i∈I} Σ_{n=1}^{∞} P_k(X_n = j | X_{n−1} = i, n ≤ τ^{(k)}) P_k(X_{n−1} = i, n ≤ τ^{(k)})
= Σ_{i∈I} p_{ij} Σ_{n=1}^{∞} P_k(X_{n−1} = i, n ≤ τ^{(k)}) = Σ_{i∈I} p_{ij} E_k( Σ_{m=0}^{τ^{(k)}−1} 1_{{X_m=i}} )
= Σ_{i∈I} p_{ij} γ_i^k.
(iii) Fix i ∈ I; since P is irreducible, there exist m, n ≥ 0 such that p_{ki}^{(n)} > 0 and p_{ik}^{(m)} > 0; then, by (i) and (ii), γ_i^k ≥ γ_k^k p_{ki}^{(n)} > 0 and γ_k^k ≥ γ_i^k p_{ik}^{(m)}.
Proof.
λ_j = Σ_{i_0∈I} λ_{i_0} p_{i_0 j} = Σ_{i_0≠k} λ_{i_0} p_{i_0 j} + p_{kj}
= Σ_{i_0≠k} Σ_{i_1≠k} λ_{i_1} p_{i_1 i_0} p_{i_0 j} + p_{kj} + Σ_{i_0≠k} p_{k i_0} p_{i_0 j}
= ... = Σ_{i_0≠k} Σ_{i_1≠k} · · · Σ_{i_n≠k} λ_{i_n} p_{i_n i_{n−1}} · · · p_{i_0 j}
+ p_{kj} + Σ_{i_0≠k} p_{k i_0} p_{i_0 j} + ... + Σ_{i_0≠k} Σ_{i_1≠k} · · · Σ_{i_{n−1}≠k} p_{k i_{n−1}} · · · p_{i_0 j}
≥ p_{kj} + Σ_{i_0≠k} p_{k i_0} p_{i_0 j} + ... + Σ_{i_0≠k} Σ_{i_1≠k} · · · Σ_{i_{n−1}≠k} p_{k i_{n−1}} · · · p_{i_0 j}
so µi = 0.
m_i := E_i(τ^{(i)})
m_k = Σ_{i∈I} γ_i^k ≤ Σ_{i∈I} π_i/π_k = 1/π_k < ∞,
and k is positive recurrent. Note finally that, since P is recurrent, by the previous theorem the above inequality is in fact an equality.
Theorem 3.1.10 If the state space is finite there is at least one stationary
distribution.
Proof. We can assume the chain is irreducible; then, since it is finite, every state is positive recurrent. In fact, since
m_k = Σ_{i∈I} γ_i^k
and I is finite, it is sufficient to see that γ_i^k < ∞, but this follows from Theorem 3.1.7.
Theorem 3.1.11 Let P be irreducible and aperiodic, and suppose that P has an invariant distribution π. Then
p_{ij}^{(n)} → π_j as n → ∞, for all i, j.
V_i(n)/n ≤ V_i/n → 0 = 1/m_i
Suppose that P is recurrent. Let δ_r^{(i)} be the length of the r-th excursion to i. Then, by the strong Markov property, the δ_r^{(i)} are i.i.d. with E(δ_r^{(i)}) = m_i, and
(δ_1^{(i)} + ... + δ_{V_i(n)−1}^{(i)}) / V_i(n) ≤ n/V_i(n) ≤ (δ_1^{(i)} + ... + δ_{V_i(n)}^{(i)}) / V_i(n).
For the second part, we can assume w.l.o.g. that |f| ≤ 1; then for any J ⊆ I,
| (1/n) Σ_{k=0}^{n−1} f(X_k) − Σ_{i∈I} π_i f(i) | = | Σ_{i∈I} (V_i(n)/n − π_i) f(i) |
≤ Σ_{i∈J} |V_i(n)/n − π_i| + Σ_{i∉J} |V_i(n)/n − π_i|
≤ Σ_{i∈J} |V_i(n)/n − π_i| + Σ_{i∉J} (V_i(n)/n + π_i)
≤ 2 Σ_{i∈J} |V_i(n)/n − π_i| + 2 Σ_{i∉J} π_i,
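The ergodic theorem says the time average (1/n) Σ f(X_k) converges to Σ_i π_i f(i). A seeded simulation on the two-state chain of Example 3.1.4 (with illustrative α, β) shows the occupation fractions V_i(n)/n approaching π:

```python
import random

alpha, beta = 0.3, 0.2
P = [[1 - alpha, alpha], [beta, 1 - beta]]
pi = [beta / (alpha + beta), alpha / (alpha + beta)]   # invariant distribution

rng = random.Random(42)
state, visits, n = 0, [0, 0], 50_000
for _ in range(n):
    visits[state] += 1
    state = rng.choices([0, 1], weights=P[state])[0]

for i in (0, 1):
    assert abs(visits[i] / n - pi[i]) < 0.02   # V_i(n)/n ≈ pi_i
print(visits[0] / n, visits[1] / n)
```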
• T /µ ∼ E (µλ) .
(i) N_0 = 0;
(ii) N_{t+s} − N_s ∼ Poisson(λt);
(iii) (N_t)_{t≥0} has independent increments.
Conversely, if (i), (ii) and (iii) hold and the process is right continuous, then (N_t)_{t≥0} is a Poisson process.
Proof. (The converse)
P (T1 > t) = P (Nt = 0) = e−λt .
(i) N_0 = 0;
(ii) N_t is an increasing, right-continuous, integer-valued process with independent increments;
(iii) P(N_{t+h} − N_t = 0) = 1 − λh + o(h) and P(N_{t+h} − N_t = 1) = λh + o(h), as h ↓ 0, uniformly in t.
If (i), (ii) and (iii) hold, then for i = 2, 3, ... we have P(N_{t+h} − N_t = i) = o(h) as h ↓ 0, uniformly in t. Set p_j(t) = P(N_t = j). Then, for j = 1, 2, ...,
p_j(t + h) = P(N_{t+h} = j) = Σ_{i=0}^{j} P(N_{t+h} − N_t = i) P(N_t = j − i)
= (1 − λh + o(h)) p_j(t) + (λh + o(h)) p_{j−1}(t) + o(h).
So
(p_j(t + h) − p_j(t))/h = −λ p_j(t) + λ p_{j−1}(t) + O(h);
since this estimate is uniform in t, we can put t = s − h to obtain
(p_j(s) − p_j(s − h))/h = −λ p_j(s − h) + λ p_{j−1}(s − h) + O(h).
Now letting h ↓ 0, we obtain that the p_j(t) are differentiable and satisfy the differential equation
p'_j(t) = −λ p_j(t) + λ p_{j−1}(t).
Analogously we can see that p'_0(t) = −λ p_0(t), and since
p_0(0) = 1, p_j(0) = 0 for j = 1, 2, ...,
we have that
p_j(t) = e^{−λt} (λt)^j / j! for j = 0, 1, 2, ....
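The ODE system p'_0 = −λp_0, p'_j = −λp_j + λp_{j−1} can also be integrated numerically and compared with the Poisson formula; a small Euler-scheme sketch (step size and λ are arbitrary):

```python
import math

lam, T, h = 2.0, 1.0, 1e-4
J = 15                       # track p_0(t), ..., p_J(t)
p = [1.0] + [0.0] * J        # initial condition p_0(0) = 1, p_j(0) = 0

t = 0.0
while t < T - 1e-12:
    # Euler step for p'_0 = -lam p_0 and p'_j = -lam p_j + lam p_{j-1}
    dp = [-lam * p[0]] + [-lam * p[j] + lam * p[j - 1] for j in range(1, J + 1)]
    p = [p[j] + h * dp[j] for j in range(J + 1)]
    t += h

for j in range(J + 1):
    exact = math.exp(-lam * T) * (lam * T) ** j / math.factorial(j)
    assert abs(p[j] - exact) < 1e-3
print(p[0], math.exp(-lam))   # both ≈ e^{-2} ≈ 0.1353
```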
Example 3.3.1 Let (N_t)_{t≥0} be a Poisson process with rate λ and let (Y_n)_{n≥0} be a discrete time Markov chain with transition probabilities π_{ij}, independent of (N_t)_{t≥0}. Then X_t = Y_{N_t} is a continuous-time Markov chain. Intuitively, this follows from the memoryless property of the exponential distribution: if X_s = i then, independently of what happened in the past, the time to the next jump will be exponentially distributed with rate λ, and the chain will go to state j with probability π_{ij}. If we write
p_{ij}(t) = P(X_t = j | X_0 = i),
we have
p_{ij}(t) = P(X_t = j, X_0 = i)/P(X_0 = i) = P(Y_{N_t} = j, Y_0 = i)/P(Y_0 = i)
= Σ_{k=0}^{∞} P(N_t = k) P(Y_k = j, Y_0 = i)/P(Y_0 = i)
= Σ_{k=0}^{∞} π_{ij}^{(k)} e^{−λt} (λt)^k / k! = Σ_{k=0}^{∞} e^{−λt} ((tλΠ)^k)_{ij} / k!
= (e^{−λt} e^{tλΠ})_{ij} = (e^{t(λΠ−λI)})_{ij} = (e^{tQ})_{ij}.
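The identity e^{tQ} = Σ_k e^{−λt}(λt)^k Π^k / k!, with Q = λ(Π − I), can be verified numerically; the sketch below compares a truncated matrix-exponential series with the Poisson mixture for an arbitrary two-state Π:

```python
import math
import numpy as np

def expm_series(A, terms=60):
    """Matrix exponential by truncated power series (fine for small ||A||)."""
    out, term = np.eye(A.shape[0]), np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

lam, t = 1.5, 0.8
Pi = np.array([[0.7, 0.3], [0.4, 0.6]])     # jump-chain transition matrix
Q = lam * (Pi - np.eye(2))                  # generator Q = lambda (Pi - I)

mixture = sum(math.exp(-lam * t) * (lam * t) ** k / math.factorial(k)
              * np.linalg.matrix_power(Pi, k) for k in range(60))
assert np.allclose(expm_series(t * Q), mixture, atol=1e-10)
print(expm_series(t * Q))
```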
Chapman-Kolmogorov equation
Proposition 3.3.1
Σ_{k∈I} p_{ik}(s) p_{kj}(t) = p_{ij}(t + s). (3.3)
Proof.
p_{ij}(t + s) = P(X_{t+s} = j | X_0 = i) = Σ_{k∈I} P(X_{t+s} = j, X_s = k | X_0 = i)
= Σ_{k∈I} P(X_{t+s} = j | X_s = k) P(X_s = k | X_0 = i).
Equation (3.3) shows that if we know the transition probabilities for all t < t_0, for some t_0 > 0, then we know them for all t. This observation suggests that the transition probabilities p_{ij}(t) can be determined from their derivatives at 0:
q_{ij} = lim_{h→0} p_{ij}(h)/h, j ≠ i.
If this limit exists, we call q_{ij} the jump rate from i to j. For Example 3.3.1, q_{ij} = λπ_{ij}.
Example 3.3.1 is atypical: there we started with the Markov chain and then defined its rates. In most cases it is much simpler to describe the system by writing down its transition rates q_{ij} for j ≠ i, which describe the rates at which jumps are made from i to j.
Example 3.3.2 Poisson process
qn,n+1 = λ for all n ≥ 0.
Example 3.3.3 M/M/s queue. Imagine a bank with s tellers that serve customers who queue in a single line if all of the servers are busy. We imagine that customers arrive at the times of a Poisson process with rate λ, and that each service time is an independent exponential with rate µ. As in the previous example, q_{n,n+1} = λ. To model the departures we let
q_{n,n−1} = { nµ for 0 ≤ n ≤ s,
              sµ for n ≥ s. }
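For a finite truncation of the state space, the generator Q of the M/M/s queue is easy to assemble from these rates (the truncation level and parameter values below are illustrative):

```python
import numpy as np

def mms_generator(lam, mu, s, n_max):
    """Generator matrix of an M/M/s queue truncated at n_max customers."""
    Q = np.zeros((n_max + 1, n_max + 1))
    for n in range(n_max + 1):
        if n < n_max:
            Q[n, n + 1] = lam                # arrival: q_{n,n+1} = lambda
        if n > 0:
            Q[n, n - 1] = min(n, s) * mu     # departure: q_{n,n-1} = min(n,s) mu
        Q[n, n] = -Q[n].sum()                # q_ii = -sum of off-diagonal rates
    return Q

Q = mms_generator(lam=2.0, mu=1.0, s=3, n_max=10)
assert np.allclose(Q.sum(axis=1), 0.0)       # each row of a generator sums to 0
assert Q[5, 4] == 3.0                        # all s = 3 servers busy: rate s*mu
print(Q[:4, :4])
```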
Hence P(t) satisfies the forward and backward equations. By repeated term-by-term differentiation we obtain (iv). To show uniqueness, assume that M(t) is another solution of the forward equation; then
d/dt (M(t) e^{−tQ}) = (d/dt M(t)) e^{−tQ} + M(t) (d/dt e^{−tQ})
= M(t) Q e^{−tQ} + M(t)(−Q) e^{−tQ} = 0,
so M(t) e^{−tQ} is constant, and so M(t) = e^{tQ}. Similarly for the backward equation.
Since I is finite,
(p_{ij}(t + h) − p_{ij}(t))/h = Σ_{k∈I} p_{ik}(t) q_{kj} + O(h).
Remark 3.3.1 Note that in the previous theorem q_{ii} = −Σ_{j≠i} q_{ij}. By Theorem 3.3.1, {p_{ij}(t), i, j ∈ I, t ≥ 0} is also the solution of the backward equation
Remark 3.3.2 We have similar results for the case of infinite state-space. But
in this case we have to look for a minimal non-negative solution of the backward
and forward equations.
Example 3.3.5 Consider a two-state chain, where, for concreteness, I = {1, 2}. In this case we have to specify only two rates, q_{12} = λ and q_{21} = µ, so
Q = ( −λ   λ
       µ  −µ ).
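For this Q the transition matrix P(t) = e^{tQ} has the well-known closed form p_{11}(t) = µ/(λ+µ) + (λ/(λ+µ)) e^{−(λ+µ)t}. A numerical sketch comparing it with a truncated series for e^{tQ} (parameter values arbitrary):

```python
import math
import numpy as np

def expm_series(A, terms=60):
    """Matrix exponential via truncated power series (small matrices only)."""
    out, term = np.eye(A.shape[0]), np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

lam, mu = 0.7, 0.3
Q = np.array([[-lam, lam], [mu, -mu]])

for t in (0.0, 0.5, 1.0, 5.0):
    p11 = mu / (lam + mu) + lam / (lam + mu) * math.exp(-(lam + mu) * t)
    assert abs(expm_series(t * Q)[0, 0] - p11) < 1e-10
print("e^{tQ} matches the closed form for p_11(t)")
```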
Example 3.3.6 (The Yule process). In this model each particle splits into two at rate β, so q_{i,i+1} = βi. It can be shown that
p_{1j}(t) = e^{−βt} (1 − e^{−βt})^{j−1} for j ≥ 1. (3.4)
From this, since the chain starting at i is the sum of i copies of the chain starting at 1, we have
p_{ij}(t) = (j−1 choose i−1) e^{−iβt} (1 − e^{−βt})^{j−i}.
Example 3.3.8 (Simple birth process or Yule process) q_{i,i+1} = βi. Write X_t for the number of individuals at time t. Suppose that X_0 = 1 and let J_1 denote, as above, the time of the first birth; then J_1 ∼ exp(β).
If we put µ(t) = E(X_t), then E(X_t | J_1 = s) = 2µ(t − s): if we have a birth at s, thereafter we have two simple birth processes of the same type as X. Then
µ(t) = 2 ∫_0^t β e^{−βs} µ(t − s) ds + e^{−βt},
whose solution is µ(t) = e^{βt}.
Definition 3.3.3 If (J_n)_{n≥1} are the jump times and (S_n)_{n≥1} the holding times of our Markov chain,
ζ = sup_n J_n = Σ_{n=1}^{∞} S_n
Example 3.3.9 (Non-minimal chain) Consider a birth process (X_t)_{t≥0} starting from 0 with rates q_{i,i+1} = 2^i for i ≥ 0. By the previous remark the process explodes. We have insisted that X_t = ∞ if t ≥ ζ, where ζ is the explosion time. But another obvious possibility is to start the process off again from zero at time ζ, and to do the same for all subsequent explosions. Using the memoryless property of the exponential distribution, it can be shown that the new process is a continuous-time Markov chain in the sense of Definition 3.3.1.
Theorem 3.3.3 For distinct states i and j the following are equivalent:
(i) i → j;
(ii) i → j for the jump chain;
(iii) qi0 i1 qi1 i2 · · · qin−1 in > 0 for some states i0 , i1 , ..., in with i0 = i,
and in = j;
(iv) pij (t) > 0 for all t > 0;
(v) pij (t) > 0 for some t > 0.
with the usual convention that inf ∅ = ∞. Since (X_t)_{t≥0} is minimal, if H^A is the hitting time of A for the jump chain, then
{H^A < ∞} = {D^A < ∞},
h_i^A = P_i(D^A < ∞) = P_i(H^A < ∞).
The average time taken, starting from i, for (X_t)_{t≥0} to reach A is given by
k_i^A = E_i(D^A).
Theorem 3.3.5 Assume that q_i > 0 for all i ∉ A. The vector of expected hitting times k^A = (k_i^A, i ∈ I) is the minimal non-negative solution to the system of linear equations
k_i^A = 0 for i ∈ A,
−Σ_{j∈I} q_{ij} k_j^A = 1 for i ∉ A, (3.5)
and then
−q_{ii} k_i^A − Σ_{j≠i} q_{ij} k_j^A = 1.
So,
y_i ≥ Σ_{m=1}^{n} E_i(S_m 1_{{H^A ≥ m}}) = E_i( Σ_{m=1}^{n∧H^A} S_m ) → E_i(D^A) = k_i^A as n → ∞.
P_i({t ≥ 0, X_t = i} is unbounded) = 1.
We say i is transient if
P_i({t ≥ 0, X_t = i} is unbounded) = 0.
Remark 3.3.5 Note that if (X_t)_{t≥0} can explode starting from i, then i is certainly not recurrent.
(i) if i is recurrent for the jump chain (Y_n)_{n≥0}, then i is recurrent for (X_t)_{t≥0};
(ii) if i is transient for the jump chain (Y_n)_{n≥0}, then i is transient for (X_t)_{t≥0};
(iii) every state is either recurrent or transient;
(iv) recurrence and transience are class properties.
Proof. (i) Suppose i is recurrent for the jump chain (Y_n)_{n≥0}. If X_0 = i, then (X_t)_{t≥0} does not explode and J_n → ∞. Also X(J_n) = Y_n = i infinitely often, so {t ≥ 0, X_t = i} is unbounded, with probability 1.
(ii) Suppose i is transient for (Y_n)_{n≥0}. If X_0 = i, then
N = sup{n ≥ 0, Y_n = i} < ∞,
Proof. If q_i = 0, then (X_t)_{t≥0} cannot leave i and p_{ii}(t) = 1. Suppose q_i > 0. Let N_i denote the first passage time of the chain (Y_n)_{n≥0} to the state i. Then
so i is recurrent if and only if P(T_i < ∞) = 1, by the previous theorem and the result for the jump chain (Y_n)_{n≥0}.
Write π_{ij}^{(n)} for (Π^n)_{ij}. We shall prove that
∫_0^∞ p_{ii}(t) dt = (1/q_i) Σ_{n=0}^{∞} π_{ii}^{(n)}.
∫_0^∞ p_{ii}(t) dt = ∫_0^∞ E_i(1_{{X_t=i}}) dt = E_i( ∫_0^∞ 1_{{X_t=i}} dt )
= E_i( Σ_{n=0}^{∞} S_{n+1} 1_{{Y_n=i}} )
= Σ_{n=0}^{∞} E_i(S_{n+1} | Y_n = i) P_i(Y_n = i) = (1/q_i) Σ_{n=0}^{∞} π_{ii}^{(n)}.
(i) λQ = 0;
(ii) λP(s) = λ.
Theorem 3.3.13 Let Q be an irreducible rate matrix and let ν be any distribution. Suppose that (X_t)_{t≥0} is Markov(ν, Q). Then