
Stochastic Processes and Time Series

Module 3
Markov Chains - III
Dr. Alok Goswami, Professor, Indian Statistical Institute, Kolkata

1 Markov Chain: Classification of States


Consider a MC {Xn }n≥0 on state space S with transition matrix P = ((pxy )). We classify the
states into two types; they will play very different roles in the long-run behaviour of the chain.
Towards this end, we start with some definitions:

1.1 First Hitting Time and Number of Visits of a State:


For x ∈ S, define:
Tx = min{n ≥ 1 : Xn = x} and Nx = #{n ≥ 1 : Xn = x}
[In the definition of Tx , follow the convention that min ∅ = ∞]

• Tx = time of first visit (not counting time 0) to state x,


Nx = total number of visits (again not counting time 0) to x.

• Both Tx and Nx are random variables, with Tx taking values in {1, 2, . . . , ∞} and Nx taking
values in {0, 1, 2, . . . , ∞}.

• The events {Tx = ∞} and {Nx = 0} are the same; by taking complements, {Tx < ∞} and
{Nx ≥ 1} are also the same.
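These definitions are easy to explore by simulation. Below is a minimal Python sketch (the three-state chain and its transition matrix are hypothetical, chosen only for illustration): it draws a finite path of the chain and reads off Tx and Nx along that path. Of course, a finite path can only truncate Nx, which may be infinite.

```python
import random

# Hypothetical 3-state chain for illustration (not from the notes):
# row x of P gives the distribution of the next state from x.
P = [[0.5, 0.5, 0.0],
     [0.2, 0.3, 0.5],
     [0.0, 0.4, 0.6]]

def simulate(x0, n_steps, rng):
    """Path X_0, X_1, ..., X_{n_steps} of the chain started at x0."""
    path = [x0]
    for _ in range(n_steps):
        u, cum, x = rng.random(), 0.0, path[-1]
        for y, p in enumerate(P[x]):
            cum += p
            if u < cum:
                path.append(y)
                break
    return path

def T_and_N(path, x):
    """T_x = min{n >= 1 : X_n = x} (None standing in for 'infinity')
    and N_x = #{n >= 1 : X_n = x}, read off a finite path."""
    visit_times = [n for n in range(1, len(path)) if path[n] == x]
    return (visit_times[0] if visit_times else None), len(visit_times)

path = simulate(0, 50, random.Random(0))
Tx, Nx = T_and_N(path, 2)
```

Note that {Tx is None} and {Nx == 0} coincide on any path, mirroring the identity of the events {Tx = ∞} and {Nx = 0} above.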

For x, y ∈ S (not necessarily distinct), denote:


ρxy = Px (Ty < ∞)
ρxy represents the probability that the chain ever visits the state y (at some time ≥ 1), given that
it starts from state x.
• If we denote ρxy^(n) = Px (Ty = n), n ≥ 1, then ρxy = Σ_{n≥1} ρxy^(n). (The events {Ty = n}, for
different n, are disjoint!)

• The event {Ty = n} implies (but may not be implied by) the event {Xn = y}; so, one has
ρxy^(n) ≤ pxy^(n)
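The decomposition ρxy = Σ_n ρxy^(n) also gives a way to compute ρxy numerically. A sketch in plain Python (the 3-state transition matrix is hypothetical, for illustration only): ρxy^(1) = pxy, and for n ≥ 2, first hitting y at time n means stepping to some z ≠ y and then first-hitting y in n − 1 steps, so ρxy^(n) = Σ_{z≠y} pxz ρzy^(n−1).

```python
# Hypothetical 3-state chain (illustration only); this chain is finite and
# irreducible, so every state is recurrent and rho_xy should come out as 1.
P = [[0.5, 0.5, 0.0],
     [0.2, 0.3, 0.5],
     [0.0, 0.4, 0.6]]

def first_passage(P, y, n_max):
    """List of vectors rho^(1), ..., rho^(n_max), where rho^(n)[x] = P_x(T_y = n)."""
    S = len(P)
    rho = [[P[x][y] for x in range(S)]]                 # rho_xy^(1) = p_xy
    for _ in range(2, n_max + 1):
        prev = rho[-1]
        # rho_xy^(n) = sum over z != y of p_xz * rho_zy^(n-1)
        rho.append([sum(P[x][z] * prev[z] for z in range(S) if z != y)
                    for x in range(S)])
    return rho

rho_n = first_passage(P, y=2, n_max=300)
rho_x2 = [sum(step[x] for step in rho_n) for x in range(3)]  # rho_xy = sum_n rho_xy^(n)
```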

1.2 Recurrent and Transient States: Definition
A state x ∈ S is called “recurrent” if ρxx = 1 and is called “transient” if ρxx < 1.
Thus a state x is recurrent if the chain starting from x is sure to return to x, while x is transient
if, for the chain starting at x, there is a positive probability of never returning to x.

• If a state x is recurrent, then the definition says that a chain given to start at x is sure to
return to x at least once.

• However, it turns out (and we will prove this) that something stronger is actually true.

• If x is a recurrent state, then a chain starting at x makes an infinite number of returns
to x, with probability one!!

• Seems surprising? Think about it carefully and you will see that this emerges as a natural
consequence of the Markov property (carefully used!)

We prove a basic result which will not only give the above result, but lead to a clearer picture
of the difference between recurrent and transient states in the long-term evolution of the chain.

Proposition: Let x, y ∈ S. Then for any m ≥ 1, Px (Ny ≥ m) = ρxy · ρyy^(m−1)


Trivially for m = 1: Px (Ny ≥ 1) = Px (Ty < ∞) = ρxy .
Heuristic for m ≥ 2: Consider, for example, m = 2.
{Ny ≥ 2}: there is a 1st visit to y, followed by at least 1 more visit. The probability of there being
a first visit to y, starting from x, is ρxy , and, conditioning on the history up to the time of the first
visit to y, the probability of there being at least one more visit to y should be (Markov property?)
the same as the probability for a chain starting at y to make at least one visit to y, which is ρyy .
Why is this NOT a proof? The time of the first visit to y is a random time, while in our definition
of the Markov property, we can only condition on the history up to a non-random time n.
The actual proof takes care of this by decomposing according to the time when the first visit to y
occurs.
Proof: As seen already, there is nothing to prove for m = 1.
Consider m = 2: The event {Ny ≥ 2} means a 1st visit to y, followed by at least 1 more visit.
Decomposing according to the time when the first visit to y happens, we can write

{Ny ≥ 2} = ∪_{n=1}^∞ {Ty = n, Xn+k = y for some k ≥ 1}

Since {Ty = n}, n = 1, 2, . . . are disjoint events,

Px (Ny ≥ 2) = Σ_{n=1}^∞ Px (Ty = n, Xn+k = y for some k ≥ 1)
Recalling Px (A) = P(A | X0 = x) and applying property (CP3), we get, for each n ≥ 1,

Px (Ty = n, Xn+k = y for some k ≥ 1) = P(Ty = n | X0 = x)
× P(Xn+k = y for some k ≥ 1 | X0 = x, Ty = n)

The 1st factor on the rhs equals ρxy^(n), by definition.

Expanding the event {Ty = n}, the 2nd factor on the rhs equals
P(Xn+k = y for some k ≥ 1 | X0 = x, Xj ≠ y for 1 ≤ j ≤ n − 1, Xn = y), which, by the assumed Markov
property, equals
P(Xk = y for some k ≥ 1 | X0 = y) = Py (Ny ≥ 1) = ρyy

Thus, Px (Ny ≥ 2) = (Σ_{n=1}^∞ ρxy^(n)) · ρyy = ρxy ρyy , as was to be proved. Proof for general m can be
completed by induction (Exercise!).
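The proposition is easy to sanity-check by simulation on a minimal example (a hypothetical two-state chain, not from the notes): take states {0, 1} with 1 absorbing, and from 0 let the chain stay at 0 with probability p. Then ρ00 = p < 1, and the proposition with x = y = 0 predicts P0 (N0 ≥ m) = p · p^(m−1) = p^m.

```python
import random

# States {0, 1}, with 1 absorbing; from 0, stay w.p. p = 0.6, jump to 1 w.p. 0.4.
# Then rho_00 = 0.6, and the proposition predicts P_0(N_0 >= m) = 0.6 ** m.
p = 0.6

def visits_to_0(rng):
    """Number of times n >= 1 with X_n = 0, for the chain started at 0."""
    n_visits = 0
    while rng.random() < p:   # stay at 0 (one more visit); else absorbed at 1
        n_visits += 1
    return n_visits

rng = random.Random(0)
trials = 100_000
m = 3
freq = sum(visits_to_0(rng) >= m for _ in range(trials)) / trials
# freq should be close to 0.6 ** 3 = 0.216, within Monte-Carlo error
```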

1.3 Interesting Consequences:


• y recurrent: For any x, Px (Ny ≥ m) = ρxy ∀ m ≥ 1.
Since {Ny ≥ m} ↓ {Ny = ∞} as m → ∞, it follows that Px (Ny = ∞) = ρxy .
In particular, Py (Ny = ∞) = 1 (as claimed earlier!)

• y transient: Since ρyy < 1, Px (Ny ≥ m) = ρxy ρyy^(m−1) → 0 as m → ∞.
Thus, for any x, Px (Ny = ∞) = 0;
or, equivalently, for any state x, Px (Ny < ∞) = 1.

Main Takeaways:

• y recurrent: If the chain starts from y, it will return to y infinitely often with probability 1.
In general, if the chain starts from any state x, then the probability of y being visited infinitely
often is the same as the probability of at least one visit, namely, ρxy .

• y transient: No matter which state x the chain starts from, it will, with probability 1, visit y
only a finite number of times (the number of visits is random, of course!)

Next consider the Expected Number of Visits, Ex (Ny ), for x, y ∈ S.

y recurrent: Not much to do! As already seen, given that the chain starts at a state x, Ny takes
only two values, 0 and ∞, with probabilities 1 − ρxy and ρxy respectively. It follows that
Ex (Ny ) = ∞ or 0, according as ρxy > 0 or ρxy = 0;
In particular, Ey (Ny ) = ∞.
y transient: We know, for any x, Px (Ny < ∞) = 1; that is, Ny is a random variable taking only
non-negative integer values.

Recall: For non-negative integer-valued Z, E(Z) = Σ_{m≥1} P(Z ≥ m)
Using this and ρyy < 1, one gets that, for any x,

Ex (Ny ) = Σ_{m=1}^∞ Px (Ny ≥ m) = Σ_{m=1}^∞ ρxy ρyy^(m−1) = ρxy / (1 − ρyy ) < ∞
A different way of looking at Ex (Ny ): For y ∈ S, consider indicator random variables {ξn }n≥1 ,
defined as:

ξn = 1 or 0 according as Xn = y or Xn ≠ y

Clearly, Ny = Σ_{n≥1} ξn and so, for any x,

Ex (Ny ) = Σ_{n≥1} Ex (ξn ) = Σ_{n≥1} Px (ξn = 1) = Σ_{n≥1} pxy^(n)

(The ξn are non-negative, so the interchange of Ex and Σ_{n≥1} is justified!)
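The identity Ex (Ny ) = Σ_{n≥1} pxy^(n) can be checked against a closed form on a small example (a hypothetical chain with an absorbing state, not from the notes): restricting P to the transient states gives a substochastic matrix Q, and Σ_{n≥1} Q^n = (I − Q)^(−1) − I entrywise.

```python
import numpy as np

# Hypothetical chain (illustration only): state 2 is absorbing, so states 0, 1
# are transient while state 2 is recurrent.
P = np.array([[0.3, 0.4, 0.3],
              [0.5, 0.2, 0.3],
              [0.0, 0.0, 1.0]])

# Truncated series sum_{n=1}^{N} p_xy^(n), via successive matrix powers.
N = 500
Pn, series = np.eye(3), np.zeros((3, 3))
for _ in range(N):
    Pn = Pn @ P
    series += Pn

# Closed form on the transient block: with Q = P restricted to {0, 1},
# sum_{n>=1} Q^n = (I - Q)^{-1} - I, so E_x(N_y) is finite for transient y.
Q = P[:2, :2]
closed = np.linalg.inv(np.eye(2) - Q) - np.eye(2)

# For the recurrent state, p_22^(n) = 1 for every n, so series[2, 2] is just N,
# illustrating E_y(N_y) = infinity when y is recurrent.
```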

y transient: We already know Ex (Ny ) < ∞, for all x.
By what we just proved, this is equivalent to Σ_{n≥1} pxy^(n) < ∞, for all x.
In particular, pxy^(n) → 0 as n → ∞, for any x.

y recurrent: We know Ey (Ny ) = ∞, which means Σ_{n≥1} pyy^(n) = ∞.
n≥1

We get a new characterization of recurrence/transience:

1.4 Theorem
(a) A state y is recurrent or transient according as the infinite series Σ_{n≥1} pyy^(n) diverges or converges.

(b) If y is transient, then the infinite series Σ_{n≥1} pxy^(n) converges for all x and, in particular,
lim_{n→∞} pxy^(n) = 0 for all x.
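A classical application of part (a), not worked out in these notes but standard, is the random walk on the integers that steps +1 with probability q and −1 with probability 1 − q. Returns to 0 happen only at even times, with p00^(2k) = C(2k, k) (q(1 − q))^k. The sketch below sums this series using the term ratio t_k /t_{k−1} = (4 − 2/k) · q(1 − q), which avoids huge binomial coefficients: for q = 1/2 the terms behave like 1/√(πk), so the series diverges and the walk is recurrent, while for q ≠ 1/2 the terms decay geometrically and the series converges, so 0 is transient.

```python
def return_prob_series(q, k_max):
    """Partial sum of p_00^(2k) = C(2k, k) * (q * (1 - q)) ** k for k = 1..k_max,
    accumulated via the stable term ratio t_k / t_{k-1} = (4 - 2 / k) * q * (1 - q)."""
    r = q * (1 - q)
    term = total = 2 * r                      # t_1 = C(2, 1) * r
    for k in range(2, k_max + 1):
        term *= (4 - 2 / k) * r
        total += term
    return total

sym_10k = return_prob_series(0.5, 10_000)    # symmetric walk: partial sums keep growing
sym_40k = return_prob_series(0.5, 40_000)    # ~ doubles (partial sums grow like 2*sqrt(k/pi))
bias = return_prob_series(0.4, 40_000)       # biased walk: series converges (transient)
```

For q = 0.4 the full series sums to 1/√(1 − 4q(1 − q)) − 1 = 4 exactly, so the partial sum should agree to high accuracy.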
