
Placing weak convergence of probability measures in context

Vladimir Grenkov

In order to fully understand the different notions of convergence of probability measures, we have to look at a few topics in analysis somewhat more generally. We begin by examining normed spaces, defining their dual spaces and the bounded linear functionals over them.

1 Linear functionals and dual spaces


A linear functional is a linear mapping from a vector space into its field of scalars. Let $Q$ be a normed vector space, with $\mathbb{R}$ as its field of scalars. Then define a linear functional $f$ as a linear mapping $f : Q \to \mathbb{R}$, so that, for example, $f(\alpha x) = \alpha f(x)$ for $\alpha \in \mathbb{R}$. Now let $Q^*$ denote the set of all continuous linear functionals from $Q$ into $\mathbb{R}$, and for each $f \in Q^*$ let

$$\|f\| = \sup\left\{ \frac{|f(x)|}{\|x\|} : x \in Q,\ x \neq 0 \right\} = \sup_{x \in Q,\ \|x\| = 1} |f(x)|. \tag{1}$$
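As a small numerical illustration of the norm in (1), the following sketch takes the hypothetical case $Q = \mathbb{R}^2$ with the Euclidean norm and the functional $f(x) = 3x_1 + 4x_2$ (both choices are assumptions made only for this example). It approximates the supremum over the unit circle and compares it with the length of the coefficient vector, which is the known value of $\|f\|$ in this setting.

```python
import math

def functional(a):
    """Return the linear functional f(x) = a[0]*x[0] + a[1]*x[1] on R^2."""
    return lambda x: a[0] * x[0] + a[1] * x[1]

def dual_norm_estimate(f, samples=100_000):
    """Approximate sup{|f(x)| : ||x|| = 1} by scanning the Euclidean unit circle."""
    best = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        best = max(best, abs(f((math.cos(theta), math.sin(theta)))))
    return best

a = (3.0, 4.0)                 # hypothetical coefficients, chosen for illustration
f = functional(a)
estimate = dual_norm_estimate(f)
exact = math.hypot(*a)         # Euclidean length of a, the value of ||f|| here
print(estimate, exact)
```

The grid search only approximates the supremum from below, so the estimate should be read as a sanity check rather than a proof.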

Theorem 1. Continuity and boundedness are equivalent for linear functionals.

Proof. Assume, on the contrary, that a continuous linear functional $f$ is not bounded. Then, for every $n$, we can find $x_n \in Q$ such that $|f(x_n)| > n\|x_n\|$. Let $y_n = \frac{x_n}{n\|x_n\|}$. Then $\|y_n\| = \frac{1}{n}$, i.e. $y_n \to 0$, and we have:

$$|f(y_n)| = \left| f\!\left( \frac{x_n}{n\|x_n\|} \right) \right| = \frac{|f(x_n)|}{n\|x_n\|} > 1.$$

So $f$ is not continuous at $x = 0$, a contradiction. Conversely, let $f$ be bounded, that is, $|f(x)| \le M\|x\|$ for some constant $M$ and all $x \in Q$. Let $\{x_n\}$ be a sequence with $x_n \to 0$. Then $|f(x_n)| \le M\|x_n\| \to 0$, so $f$ is continuous at 0 and, by linearity, continuous at every other point.

Since every $f \in Q^*$ is thus bounded, we can see that $\|f\|$ is well defined in the sense that $\|f\| < \infty$. We can easily verify that this indeed defines a true norm, that is,

$$\|f\| \ge 0, \text{ and } \|f\| = 0 \text{ if and only if } f = 0,$$

$$\|cf\| = |c|\,\|f\|,$$

$$\|f + g\| \le \|f\| + \|g\|.$$
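To see what Theorem 1 rules out, it may help to look at a linear functional that is not bounded. The sketch below uses a hypothetical example that does not appear in the text: the space of polynomials on $[0,1]$ with the sup norm, and the functional $f(p) = p'(1)$. For the monomials $p_n(t) = t^n$ the norm is 1 while $f(p_n) = n$, so no constant $M$ can satisfy $|f(p)| \le M\|p\|$.

```python
def sup_norm_monomial(n, grid=1001):
    """Sup norm of p_n(t) = t**n on [0, 1], estimated on a uniform grid."""
    return max((k / (grid - 1)) ** n for k in range(grid))

def derivative_at_one(n):
    """f(p_n) = p_n'(1) = n for the monomial p_n(t) = t**n."""
    return float(n)

# Ratio |f(p_n)| / ||p_n|| for growing n; it is not bounded by any constant M.
ratios = [derivative_at_one(n) / sup_norm_monomial(n) for n in (1, 10, 100, 1000)]
print(ratios)
```

Dividing $p_n$ by $\sqrt{n}$ gives a sequence that tends to 0 in norm while its image under $f$ grows like $\sqrt{n}$, which is the failure of continuity at 0 used in the proof.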

We call the space $Q^*$, together with this norm, the dual space of $Q$. Furthermore, as shown in the next theorem, this norm makes $Q^*$ into a complete normed space, that is, a Banach space.

Theorem 2. For any normed linear space $(Q, \|\cdot\|)$ over $\mathbb{R}$, $(Q^*, \|\cdot\|)$ is a Banach space.

Proof. We have to show that every Cauchy sequence converges. Let $\{f_n\}$, with $f_n \in Q^*$, be a Cauchy sequence. Then, for all $x \in Q$,

$$|f_n(x) - f_m(x)| \le \|f_n - f_m\|\,\|x\|.$$

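So $\{f_n(x)\}$ is a Cauchy sequence of real numbers and, since $\mathbb{R}$ is complete, $f_n(x)$ converges pointwise to some $f(x)$. The limit $f$ is linear because each $f_n$ is. A Cauchy sequence is bounded, so $M = \sup_n \|f_n\| < \infty$. Then, for all $x \in Q$ with $x \neq 0$,

$$\frac{|f(x)|}{\|x\|} = \lim_{n \to \infty} \frac{|f_n(x)|}{\|x\|} \le M,$$

so $f \in Q^*$ and $\|f\| \le M$. Also, for all $m$,

$$\|f_m - f\| \le \lim_{n \to \infty} \|f_m - f_n\| \to 0 \quad \text{as } m \to \infty.$$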

Note that above, the function $f \in Q^*$ was a linear functional over the space $Q$. Similarly, we can look at $\varphi(f)$, with $\varphi \in Q^{**}$, where $Q^{**}$ is the space of continuous linear functionals over $Q^*$. We call $Q^{**}$ the double dual space of $Q$.
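One concrete way to produce an element of the double dual is to fix a point $y \in Q$ and send each functional $f$ to the number $f(y)$. The sketch below illustrates this under the assumption $Q = \mathbb{R}^2$, with functionals given by coefficient pairs. The helper names `evaluation_at` and `functional` are hypothetical and chosen only for this example.

```python
def functional(a):
    """Return the linear functional f(x) = a[0]*x[0] + a[1]*x[1] on R^2."""
    return lambda x: a[0] * x[0] + a[1] * x[1]

def evaluation_at(y):
    """Return the map phi that sends a functional f to the number f(y)."""
    return lambda f: f(y)

y = (1.0, 2.0)
phi = evaluation_at(y)

f = functional((3.0, -1.0))
g = functional((0.5, 4.0))
f_plus_g = lambda x: f(x) + g(x)   # the sum of two functionals is again a functional

# phi acts linearly on functionals: phi(f + g) = phi(f) + phi(g).
print(phi(f), phi(g), phi(f_plus_g))
```

Here `phi` takes functionals as its arguments, which is exactly the sense in which it lives one level above $Q^*$.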

2 Convergence
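In analysis we can identify three types of convergence: strong, weak, and weak*. For clarity, I will restate our notation: $Q$ is a normed space, $Q^*$ is the dual of $Q$, and $Q^{**}$ is the dual of $Q^*$. Let $y_n, y \in Q$, $f_n, f \in Q^*$, and $\varphi \in Q^{**}$. Then we write:

$$y_n \xrightarrow{s} y \text{ if and only if } \|y_n - y\| \to 0,$$

$$y_n \xrightarrow{w} y \text{ if and only if } f(y_n) \to f(y),\ \forall f \in Q^*,$$

$$f_n \xrightarrow{w^*} f \text{ if and only if } f_n(y) \to f(y),\ \forall y \in Q.$$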

Strong convergence is fairly standard; weak and weak* convergence, however, deserve some further explanation. We say that $y_n$ converges to $y$ in $Q$ in the weak sense if, for all $f$ in $Q^*$, $f(y_n)$ converges to $f(y)$. Analogously, we can talk about weak convergence of linear functionals: a sequence of functionals $f_n$ in $Q^*$ converges weakly to $f$ in $Q^*$ if, for all $\varphi$ in $Q^{**}$, $\varphi(f_n)$ converges to $\varphi(f)$. Notice that this is quite different from the definition of weak* convergence, which says that $f_n$ converges weakly* to $f$ in $Q^*$ if, for all $y$ in the original space $Q$, $f_n(y)$ converges to $f(y)$.

Remarks:

1. Weak* convergence says that $|f_n(y) - f(y)| \to 0$ for each fixed $y \in Q$. Strong convergence of $f_n$ is obtained by letting $f_n$ play the role of $y_n$ in the definition, under the norm of $Q^*$, that is, $\|f_n - f\| \to 0$. Since $|f_n(y) - f(y)| \le \|f_n - f\|\,\|y\|$, strong convergence implies weak* convergence. The converse fails in general, because pointwise convergence need not be uniform over the unit ball of $Q$.

2. Suppose that a normed space is reflexive (in the sense that its double dual can be identified with the space itself, or $Q^{**} = Q$). Then weak* convergence is equivalent to weak convergence. This is because we can identify every linear functional $\varphi \in Q^{**}$ with some $y \in Q$, and thus interpret $y$ as a linear functional on $Q^*$, and write

$$y(f) = f(y), \quad y \in Q,\ f \in Q^*. \tag{2}$$

Then we can say

$$y_n \xrightarrow{w^*} y \text{ if and only if } y_n(f) \to y(f),\ \forall f \in Q^*,$$

and, because of (2), this is clearly equivalent to

$$y_n \xrightarrow{w} y \text{ if and only if } f(y_n) \to f(y),\ \forall f \in Q^*.$$

For example, any Hilbert space, such as $L^2$, is reflexive.
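The gap between weak and strong convergence already shows up in a Hilbert space. As a hedged illustration (the example is an assumption, not taken from the text), consider the sequence space $\ell^2$ and its standard basis vectors $e_n$. Every continuous functional on $\ell^2$ is an inner product with some fixed $y \in \ell^2$, and $\langle y, e_n \rangle = y_n \to 0$, so $e_n$ converges weakly to 0, yet $\|e_n\| = 1$ for all $n$. The sketch checks this for the single choice $y_k = 1/k$.

```python
import math

def e(n):
    """n-th standard basis vector of l^2, stored sparsely as {index: value}."""
    return {n: 1.0}

def norm(v):
    """Euclidean (l^2) norm of a sparse vector."""
    return math.sqrt(sum(c * c for c in v.values()))

def pair(y, v):
    """Inner product <y, v> for y given as a function k -> y_k and a sparse v."""
    return sum(y(k) * c for k, c in v.items())

y = lambda k: 1.0 / k   # a fixed element of l^2, since the sum of 1/k^2 is finite

indices = (1, 10, 100, 1000)
pairings = [pair(y, e(n)) for n in indices]   # values of the functional <y, .> at e_n
norms = [norm(e(n)) for n in indices]         # distance from e_n to 0 in norm
print(pairings)
print(norms)
```

The pairings shrink toward 0 while the norms stay at 1. A single test vector $y$ does not establish weak convergence, but the same computation works for any $y \in \ell^2$ because its coordinates must tend to 0.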

3 Probability measures
3.1 Probability measure as a linear functional
Let $(\Omega, \mathcal{B}, P)$ be a probability space, let $X : \Omega \to S$ be a random variable taking values in $S$, and let $P_X = P \circ X^{-1}$ be the induced probability measure, so that $P_X$ assigns to each measurable subset of $S$ a value in $[0, 1]$. Let $C_B(S)$ denote the space of bounded continuous real-valued functions on $S$. Here $C_B(S)$ will play the role of the space $Q$ from the previous sections. Now if we take any $f \in C_B(S)$, with the norm $\|f\| = \sup_{s \in S} |f(s)|$, we would like to view $P_X$ as a linear functional on $C_B(S)$, that is, to consider $P_X(f)$. We define this linear functional as follows:

$$P_X(f) = \int_S f(x)\,dP_X(x). \tag{3}$$

Note that $P_X$ in $P_X(f)$ and in $dP_X$ are different objects, a linear functional and a probability measure, respectively; however, it is convenient to use the same letter for both. This representation leads to a natural definition of a norm for the linear functional $P_X$:

$$\|P_X\| = \sup_{f \in C_B(S),\ \|f\| = 1} \left| \int_S f\,dP_X \right|.$$

Now we can easily verify that $P_X$ is indeed a bounded linear functional:

$$P_X(f + g) = \int_S (f + g)\,dP_X = \int_S f\,dP_X + \int_S g\,dP_X = P_X(f) + P_X(g),$$

so $P_X$ is linear by linearity of the integral. And,

$$\|P_X\| = \sup_{\|f\| = 1} \left| \int_S f\,dP_X \right| \le \int_S dP_X = 1,$$

so it is also bounded. Furthermore, by Theorem 2, the dual space of $C_B(S)$, $C_B(S)^*$, is a Banach space. Let $\mathcal{M}(C_B(S))$ be the space of all linear functionals $P_X$ generated by the probability measures over $C_B(S)$. Clearly, $\mathcal{M}(C_B(S))$ is a proper subset of $C_B(S)^*$, since the range of a probability measure is restricted to the interval $[0, 1]$, and hence $\|P_X\| \le 1$, a property which is not true for linear functionals in general.
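For a discrete probability measure the integral in (3) is a finite weighted sum, which makes linearity and the bound $\|P_X\| \le 1$ easy to check numerically. The sketch below assumes a hypothetical measure on the three points 0, 1, 2 with weights 0.5, 0.25, 0.25. The helper `as_functional` is an illustrative name, not something defined in the text.

```python
import math

def as_functional(points, probs):
    """Return P(f) = sum_i probs[i] * f(points[i]), the integral of f against a discrete measure."""
    assert abs(sum(probs) - 1.0) < 1e-12 and all(p >= 0 for p in probs)
    return lambda f: sum(p * f(s) for s, p in zip(points, probs))

P = as_functional([0.0, 1.0, 2.0], [0.5, 0.25, 0.25])

f, g = math.cos, math.sin              # two bounded continuous test functions
lhs = P(lambda s: f(s) + g(s))         # P(f + g)
rhs = P(f) + P(g)                      # P(f) + P(g)
bound_ok = abs(P(f)) <= 1.0            # sup|cos| = 1, so |P(f)| <= ||f|| = 1
one = P(lambda s: 1.0)                 # the constant function 1 gives P(1) = 1
print(lhs, rhs, bound_ok, one)
```

Because the constant function 1 has norm 1 and $P(1) = 1$, the bound is attained, so in this example the norm of the functional equals 1.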

3.2 Weak convergence of probability measures


We know now that $P_X$ is a bounded linear functional, so we have all the tools to talk about convergence of $P_X$. Let $X_n$ be a sequence of random variables, and let $P_{X_n}$ be the associated induced probability measures. Recall that we say that $P_{X_n}$ converges to $P_X$ weakly in the probabilistic sense if, for all $f \in C_B(S)$,

$$E[f(X_n)] = \int_\Omega f(X_n(\omega))\,dP(\omega) = \int_S f(x)\,dP_{X_n}(x) \to \int_S f(x)\,dP_X(x) = E[f(X)].$$

We can rewrite this as

$$P_{X_n}(f) \to P_X(f), \quad \forall f \in C_B(S).$$

But we immediately recognize this as weak* convergence of $P_{X_n}$ to $P_X$, as defined earlier. Since $C_B(S)$ is now our normed space $Q$, and both $P_{X_n}, P_X \in \mathcal{M}(C_B(S))$, which is a subset of $Q^*$, the dual of $C_B(S)$, we can rewrite this convergence as:

$$P_{X_n} \xrightarrow{w^*} P_X \text{ if and only if } P_{X_n}(f) \to P_X(f), \quad \forall f \in C_B(S). \tag{4}$$

A natural definition would then be to say that one has weak* convergence of the probability measures $P_{X_n}$ to $P_X$ if (4) holds. But in probability, it is customary to say instead that the probability measure $P_{X_n}$ converges to $P_X$ weakly if (4) holds.
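A standard example of this mode of convergence, offered here as an assumption rather than something from the text, is $X_n$ uniform on the grid $\{1/n, 2/n, \dots, 1\}$ and $X$ uniform on $[0, 1]$. The sketch evaluates $P_{X_n}(f) = E[f(X_n)]$ for the single test function $f = \cos$ and compares it with $E[f(X)] = \sin 1$.

```python
import math

def expect_grid(f, n):
    """E[f(X_n)] for X_n uniform on {1/n, 2/n, ..., n/n}."""
    return sum(f(k / n) for k in range(1, n + 1)) / n

f = math.cos                  # a bounded continuous test function
limit = math.sin(1.0)         # integral of cos over [0, 1], i.e. E[f(X)] for X uniform on [0, 1]

# Gap |P_{X_n}(f) - P_X(f)| for increasing n.
errors = [abs(expect_grid(f, n) - limit) for n in (10, 100, 1000)]
print(errors)
```

The gap shrinks roughly like $1/n$. One test function cannot verify the condition for all $f \in C_B(S)$, so this is only a spot check of the definition, not a proof of convergence.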

4 Weak topology
Throughout this discussion we talked about convergence by comparing various notions of distance. Strong convergence implied that we used the distance between points in the original normed space, while weak and weak* convergence required that we use a distance defined on the space of linear functionals. With each of these distances, we can define a topology over the respective spaces. For example, the strong topology would involve defining a neighbourhood of a point with respect to the norm of the space:

$$V_s(y, \varepsilon) = \{ y' \in S : \|y' - y\| < \varepsilon \}.$$

Recall that a topology $\mathcal{T}$ on $S$ is a collection of sets, called open, such that

(i) $\emptyset \in \mathcal{T}$ and $S \in \mathcal{T}$;
(ii) the union (finite or infinite) of sets in $\mathcal{T}$ is in $\mathcal{T}$;
(iii) the intersection of finitely many sets in $\mathcal{T}$ is in $\mathcal{T}$.

The connection between limits and topology is as follows. Let $s_n, s \in S$. We say that $s_n \to s$ if and only if every neighbourhood of $s$ contains all but finitely many of the $s_n$. We can define a topology in this way with respect to each of the types of convergence we encountered; however, it is most interesting to examine the neighbourhoods of a probability measure $P_X$ under weak* convergence. Let $X$ be a random variable as before, $X : \Omega \to S$, let $f \in C_B(S)$, and let $\mathcal{M}(C_B(S))$ be the space of linear functionals over $C_B(S)$ generated by the induced probability measures. Consider the family of sets

$$V_{P_X}(f, \varepsilon) = \left\{ \nu : \nu \text{ a probability measure},\ \left| \int_S f\,d\nu - \int_S f\,dP_X \right| < \varepsilon \right\},$$

or equivalently,

$$V_{P_X}(f, \varepsilon) = \{ \nu : \nu \in \mathcal{M}(C_B(S)),\ |\nu(f) - P_X(f)| < \varepsilon \}.$$

This family of sets, taken over all $f \in C_B(S)$ and $\varepsilon > 0$, defines a weak* topology for probability measures. We can define a neighbourhood around some probability measure in terms of how close another probability measure is to it. More specifically, given a sequence of probability measures $\{\nu_n\}$, $\nu_n \to \nu$ if every neighbourhood $V_\nu$ contains all but finitely many of the $\nu_n$.
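Membership in a neighbourhood of this kind can be tested directly for discrete measures. The sketch below is a hypothetical example: it takes the point mass at 0 as the centre, the bounded continuous function $f(s) = \min(|s|, 1)$, and $\varepsilon = 0.05$, then checks which of the point masses at $1/n$ fall inside the neighbourhood. The names `integrate` and `in_neighbourhood` are assumptions made for this illustration.

```python
def integrate(f, measure):
    """Integral of f against a discrete measure given as {point: probability}."""
    return sum(p * f(s) for s, p in measure.items())

def in_neighbourhood(nu, P, f, eps):
    """True if nu lies in V_P(f, eps), i.e. |nu(f) - P(f)| < eps."""
    return abs(integrate(f, nu) - integrate(f, P)) < eps

P = {0.0: 1.0}                      # point mass at 0, the centre of the neighbourhood
f = lambda s: min(abs(s), 1.0)      # a bounded continuous test function

# Point masses at 1/n for n = 1, ..., 100, tested against V_P(f, 0.05).
inside = [in_neighbourhood({1.0 / n: 1.0}, P, f, eps=0.05) for n in range(1, 101)]
first_inside = inside.index(True) + 1
print(first_inside, all(inside[first_inside - 1:]))
```

Once the sequence enters the neighbourhood it stays there, which is the pattern the convergence criterion asks for. Checking a single pair $(f, \varepsilon)$ covers only one neighbourhood out of the whole family.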

