There is a zoo of strong laws of large numbers, each of which varies in the exact
assumptions it makes on the underlying sequence of random variables.
Theorem 6.1 (Strong law of large numbers). Let $X_1, X_2, \ldots$ be i.i.d. integrable real random variables. Then
\[
\frac{X_1 + \cdots + X_n}{n} \longrightarrow E[X_1] \quad \text{almost surely.}
\]
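As a numerical sanity check (not part of the proof), the following sketch simulates the running mean of i.i.d. Exponential(1) variables; the distribution, sample size, and seed are arbitrary choices for illustration.

```python
import numpy as np

# Simulate i.i.d. Exponential(1) variables, so E[X_1] = 1.
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=100_000)

# Running means (X_1 + ... + X_n) / n for n = 1, ..., 100000.
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)

# The strong law predicts the running mean settles near E[X_1] = 1.
print(running_mean[-1])
```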
Proof. First, note that $X_1 - E[X_1], X_2 - E[X_2], \ldots$ are i.i.d. and centered, so without loss of generality we may assume that $E[X_1] = 0$. Second, let $Y_n := X_n \mathbf{1}_{(|X_n| \le n)}$ for every $n \in \mathbb{N}$ and let $h : \mathbb{R} \to \mathbb{R}$, $h(x) = |x|$. Since
\begin{align*}
\sum_{n=1}^{\infty} P(X_n \ne Y_n) &= \sum_{n=1}^{\infty} P(|X_n| > n) \\
&= \sum_{n=1}^{\infty} P_{X_1}(h > n) \\
&= \sum_{n=1}^{\infty} \sum_{m=n}^{\infty} P_{X_1}(m + 1 \ge h > m) \\
&= \sum_{m=1}^{\infty} \sum_{n=1}^{m} P_{X_1}(m + 1 \ge h > m) \\
&= \sum_{m=1}^{\infty} m \, P_{X_1}(m + 1 \ge h > m) \\
&\le \int_{\mathbb{R}} h \, dP_{X_1} < \infty,
\end{align*}
the Borel–Cantelli lemma yields that, almost surely, $X_n = Y_n$ for all but finitely many $n$; it therefore suffices to show that $(Y_1 + \cdots + Y_n)/n \to 0$ almost surely.
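The tail-sum comparison $\sum_{n \ge 1} P(|X_1| > n) \le E|X_1|$ used above can be checked in a case where both sides are explicit; the exponential distribution below is my choice of example, not part of the notes.

```python
import math

# For X ~ Exponential(1): P(X > n) = exp(-n), so the tail sum is geometric,
# sum_{n>=1} P(X > n) = 1 / (e - 1) ≈ 0.582, while E[X] = 1.
tail_sum = sum(math.exp(-n) for n in range(1, 50))
print(tail_sum)  # well below E[X] = 1, as the bound predicts
```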
(iv) $E[Y_n] \to 0$.
Then, by Lemma 6.2, Theorem 6.1 follows. Indeed, we shall prove every assertion above:
(ii) Since
\begin{align*}
\sum_{n=1}^{\infty} n^{-2} E[Y_n^2] &= \sum_{n=1}^{\infty} n^{-2} \int_{\mathbb{R}} x^2 \mathbf{1}_{(h \le n)}(x) \, P_{X_1}(dx) \\
&= \sum_{n=1}^{\infty} n^{-2} \sum_{m=1}^{n} \int_{(m-1 < h \le m)} x^2 \, P_{X_1}(dx) \\
&\le \sum_{n=1}^{\infty} \sum_{m=1}^{n} m n^{-2} \int_{(m-1 < h \le m)} |x| \, P_{X_1}(dx) \\
&= \sum_{m=1}^{\infty} \sum_{n=m}^{\infty} m n^{-2} \int_{(m-1 < h \le m)} |x| \, P_{X_1}(dx) \\
&= \sum_{m=1}^{\infty} m \Biggl( \sum_{n=m}^{\infty} n^{-2} \Biggr) \int_{(m-1 < h \le m)} |x| \, P_{X_1}(dx) \\
&\le \sum_{m=1}^{\infty} 2 \int_{(m-1 < h \le m)} |x| \, P_{X_1}(dx) \\
&\le 2 \int_{\mathbb{R}} |x| \, P_{X_1}(dx),
\end{align*}
where we used $x^2 \le m |x|$ on $(m-1 < h \le m)$ and the elementary bound $m \sum_{n=m}^{\infty} n^{-2} \le 2$, we obtain
\[
\sum_{n=1}^{\infty} n^{-2} E[Y_n^2] < \infty. \tag{6.2}
\]
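The constant $2$ above comes from $m \sum_{n=m}^{\infty} n^{-2} \le 1/m + 1 \le 2$; a quick numerical check (truncating the tail at $N = 10^6$, which only makes the left side smaller):

```python
import numpy as np

# Check  m * sum_{n=m}^{N} n^{-2} <= 2  for several values of m.
N = 1_000_000
inv_sq = 1.0 / np.arange(1, N + 1, dtype=float) ** 2
tail = np.cumsum(inv_sq[::-1])[::-1]   # tail[m-1] = sum_{n=m}^{N} n^{-2}
checks = {m: m * tail[m - 1] for m in (1, 2, 5, 10, 100, 1000)}
print(checks)  # every value stays below 2
```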
Lemma 6.2. Let $X_1, X_2, \ldots$ be square-integrable and independent real random variables with $\sum_{n=1}^{\infty} n^{-2} \operatorname{Var}[X_n] < \infty$ and
\[
\frac{E[X_1] + \cdots + E[X_n]}{n} \longrightarrow a \in \mathbb{R}.
\]
Then
\[
\frac{X_1 + \cdots + X_n}{n} \xrightarrow{\text{a.s.}} a.
\]
Pn
Proof. Let Sn := k=1 (Xk E[Xk ]). Fix > 0. For every k N, let Ak be the event
where
n1 |Sn | for some n with 2k1 n < 2k .
2
Then on Ak we have
|Sn | 2k1 for some n < 2k ,
so by Kolmogorovs inequality,
2 k
k1 2
X
P(Ak ) (2 ) Var[Xn ].
n=1
Therefore,
2 k
X 4 X X 2k
P(Ak ) 2 2 Var[Xn ]
k=1
k=1 n=1
4 X X 2k
= 2 2 Var[Xn ]
n=1 k=log n
2
8 X 2
2 n Var[Xn ],
n=1
so
P lim sup Ak = 0
k
by the Borel-Cantelli lemma. But lim sup Ak is precisely the set where
so
1
P lim sup n |Sn | < = 1.
n
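Lemma 6.2 allows the variances to grow, as long as $\sum_n n^{-2}\operatorname{Var}[X_n]$ converges. The sketch below illustrates this with independent normal variables whose standard deviations grow like $k^{1/4}$; the distribution, growth rate, and seed are my choices, not from the notes.

```python
import numpy as np

# Independent X_k ~ Normal(1, sd = k^{1/4}), so Var[X_k] = sqrt(k) grows,
# yet sum_k k^{-2} Var[X_k] = sum_k k^{-3/2} < infinity, and the averaged
# expectations converge to a = 1.
rng = np.random.default_rng(1)
n = 100_000
k = np.arange(1, n + 1)
x = rng.normal(loc=1.0, scale=k ** 0.25)

mean_n = x.sum() / n
print(mean_n)  # should be close to a = 1
```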
Proof. First, fix $\varepsilon > 0$. We decompose the probability space according to the first time at which the partial sums exceed the value $\varepsilon$. Hence, let
\[
\tau := \min\{k = 1, \ldots, n \; ; \; |S_k| \ge \varepsilon\},
\]
let $A_k := \{\tau = k\}$, and let $A := \bigcup_{k=1}^{n} A_k$ be the event that $|S_k| \ge \varepsilon$ for some $k \le n$. The random variables $S_n - S_k$ and $S_k \mathbf{1}_{A_k}$ are $\sigma(X_{k+1}, \ldots, X_n)$- and $\sigma(X_1, \ldots, X_k)$-measurable, respectively, and thus independent, so $E[(S_n - S_k) S_k \mathbf{1}_{A_k}] = 0$. Then
\begin{align*}
\operatorname{Var}[S_n] = E[S_n^2] &\ge E\Biggl[\sum_{k=1}^{n} S_n^2 \mathbf{1}_{A_k}\Biggr] \\
&= \sum_{k=1}^{n} E[S_n^2 \mathbf{1}_{A_k}] \\
&= \sum_{k=1}^{n} E\bigl[\bigl((S_n - S_k)^2 + 2(S_n - S_k) S_k + S_k^2\bigr) \mathbf{1}_{A_k}\bigr] \\
&= \sum_{k=1}^{n} E[(S_n - S_k)^2 \mathbf{1}_{A_k}] + \sum_{k=1}^{n} E[S_k^2 \mathbf{1}_{A_k}] \\
&\ge \sum_{k=1}^{n} E[S_k^2 \mathbf{1}_{A_k}] \\
&\ge \sum_{k=1}^{n} E[\varepsilon^2 \mathbf{1}_{A_k}] \\
&= \varepsilon^2 P(A),
\end{align*}
so $P(\max_{1 \le k \le n} |S_k| \ge \varepsilon) \le \varepsilon^{-2} \operatorname{Var}[S_n]$, which is Kolmogorov's inequality.
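A Monte Carlo sanity check of the inequality $P(\max_{k \le n} |S_k| \ge \varepsilon) \le \varepsilon^{-2}\operatorname{Var}[S_n]$; the uniform increments, $n$, $\varepsilon$, and seed below are arbitrary illustration choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n_steps, n_trials, eps = 100, 20_000, 10.0

# X_k ~ Uniform(-1, 1): centered, Var[X_k] = 1/3, so Var[S_n] = n/3.
x = rng.uniform(-1.0, 1.0, size=(n_trials, n_steps))
s = np.cumsum(x, axis=1)          # partial sums S_1, ..., S_n per trial
max_abs = np.abs(s).max(axis=1)   # max_{k<=n} |S_k| per trial

empirical = (max_abs >= eps).mean()
bound = (n_steps / 3.0) / eps ** 2   # eps^{-2} Var[S_n]
print(empirical, bound)  # the empirical frequency stays below the bound
```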
For $n \in \mathbb{N}$, the function $F_n(x) := n^{-1} \#\{k \in \{1, \ldots, n\} : X_k \le x\}$, $x \in \mathbb{R}$, is called the $n$-th empirical distribution function of $X_1, X_2, \ldots$.
Theorem 6.7 (Glivenko–Cantelli). Let $X_1, X_2, \ldots$ be i.i.d. real random variables with distribution function $F$, and let $F_n$, $n \in \mathbb{N}$, be the empirical distribution functions. Then
\[
\sup_{x \in \mathbb{R}} |F_n(x) - F(x)| \longrightarrow 0 \quad \text{almost surely.}
\]
Proof. Exercise.
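A numerical illustration of the theorem (not the exercise's proof): for standard normal samples, the sup-distance between the empirical and true distribution functions shrinks as $n$ grows. The supremum over $\mathbb{R}$ is attained at the sample points, where $F_n$ jumps, so it suffices to compare there. The normal distribution, sample sizes, and seed are my choices.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(3)

def sup_distance(sample):
    """Sup-distance between the empirical CDF of the sample and the
    standard normal CDF, evaluated at the sample points."""
    x = np.sort(sample)
    n = x.size
    F = np.array([(1 + erf(v / sqrt(2))) / 2 for v in x])  # Phi(x_i)
    ecdf_hi = np.arange(1, n + 1) / n   # F_n(x_i)
    ecdf_lo = np.arange(0, n) / n       # F_n(x_i-), left limit
    return max(np.abs(ecdf_hi - F).max(), np.abs(ecdf_lo - F).max())

d_small = sup_distance(rng.normal(size=100))
d_large = sup_distance(rng.normal(size=100_000))
print(d_small, d_large)  # the sup-distance shrinks as n grows
```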
Example 6.8 (Shannon's theorem). ...
Exercise 6.1. Let $X_1, X_2, \ldots$ be i.i.d. real random variables with
\[
\frac{X_1 + \cdots + X_n}{n} \xrightarrow{\text{a.s.}} Y
\]
for some random variable $Y$. Show that $X_1 \in L^1(P)$ and $Y = E[X_1]$ a.s. (Hint: first show that …)
Exercise 6.2. Let $E$ be a finite set and let $p$ be a probability vector on $E$. Show that the entropy $H(p)$ is minimal (in fact, zero) if $p = \delta_e$ for some $e \in E$. It is maximal (in fact, $\log(\#E)$) if $p$ is the uniform distribution on $E$.
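The two extreme cases of the exercise can be verified directly for a small alphabet; the four-point set below is an arbitrary example.

```python
import math

def entropy(p):
    """Shannon entropy H(p) = -sum_e p(e) log p(e), with 0 log 0 := 0."""
    return -sum(q * math.log(q) for q in p if q > 0)

point_mass = [1.0, 0.0, 0.0, 0.0]   # p = delta_e: minimal entropy
uniform = [0.25] * 4                # uniform on 4 points: maximal entropy

print(entropy(point_mass), entropy(uniform), math.log(4))
```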
Exercise 6.3. Let $X_1, X_2, \ldots$ be independent and centered real random variables with $\sum_{n=1}^{\infty} \operatorname{Var}[X_n] < \infty$. Prove that $(X_1 + \cdots + X_n)$ converges almost surely. (Hint: apply Kolmogorov's inequality to show that the partial sums are Cauchy almost surely.)
Exercise 6.4. If the plus and minus signs in $\sum_{n=1}^{\infty} \pm \frac{1}{n}$ are determined by successive tosses of a fair coin, prove that the resulting series converges almost surely.
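A simulation of the random harmonic series (Exercise 6.3 applies, since $\operatorname{Var}[\pm 1/n] = n^{-2}$ is summable): distant partial sums cluster together, as almost-sure convergence predicts. The truncation point and seed are arbitrary.

```python
import numpy as np

# Signs from fair coin tosses, terms ±1/n.
rng = np.random.default_rng(4)
N = 1_000_000
signs = rng.choice([-1.0, 1.0], size=N)
partial = np.cumsum(signs / np.arange(1, N + 1))

# Partial sums at n = 10^4, 10^5, 10^6 should be close together.
print(partial[9_999], partial[99_999], partial[999_999])
```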
Exercise 6.5. Let $X_1, X_2, \ldots$ be i.i.d. real random variables that are not integrable. Prove that
\[
\limsup_{n \to \infty} n^{-1} |X_1 + \cdots + X_n| = \infty \quad \text{a.s.}
\]
(Hint: show that $\sum_{n=1}^{\infty} P(|X_n| > n) = \infty$ and apply the Borel–Cantelli lemma.)
Exercise 6.6. A collection or population of $N$ objects (such as mice, grains of sand, etc.) may be considered as a sample space in which each object has probability $N^{-1}$. Let $X$ be a random variable on this space (a numerical characteristic of the objects such as mass, diameter, etc.) with mean $m$ and variance $v$. In statistics one is interested in determining $m$ and $v$ by taking a sequence of random samples from the population and measuring $X$ for each sample, thus obtaining a sequence $(X_n)$ of numbers that are values of independent random variables with the same distribution as $X$. The $n$-th sample mean is $M_n = n^{-1} \sum_{i=1}^{n} X_i$ and the $n$-th sample variance is $V_n = (n-1)^{-1} \sum_{i=1}^{n} (X_i - M_n)^2$.
(i) Show that $E[M_n] = m$, $E[V_n] = v$, and $M_n \xrightarrow{\text{a.s.}} m$ and $V_n \xrightarrow{\text{a.s.}} v$.
(ii) Can you see why one uses $(n-1)^{-1}$ instead of $n^{-1}$ in the definition of $V_n$?
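A hint toward (ii) by simulation: averaging many sample variances of small samples, the $(n-1)^{-1}$ version centers on $v$, while the $n^{-1}$ version systematically undershoots. The uniform population, $n = 5$, and seed are arbitrary illustration choices.

```python
import numpy as np

rng = np.random.default_rng(5)
n, reps = 5, 200_000

# Samples of size n from Uniform(0, 1): m = 1/2, v = 1/12.
x = rng.uniform(size=(reps, n))

v_unbiased = x.var(axis=1, ddof=1).mean()  # divisor n - 1
v_biased = x.var(axis=1, ddof=0).mean()    # divisor n

print(v_unbiased, v_biased, 1 / 12)  # only the first centers on v
```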