
Asymptotic Theory

Let $\{X_n : n = 1, 2, \dots\}$ be a sequence of random variables indexed by $n$ and $F_n(x) = \Pr(X_n \le x)$ be the distribution function of $X_n$. Further, let $X$ be a random variable with distribution function $F(x) = \Pr(X \le x)$ and $c$ be a constant.

Almost sure convergence: $X_n \xrightarrow{a.s.} c$ if $\Pr(\lim_{n\to\infty} X_n = c) = 1$

Convergence in probability: $X_n \xrightarrow{p} c$ if $\lim_{n\to\infty} \Pr(|X_n - c| < \varepsilon) = 1$ for every $\varepsilon > 0$

Convergence in quadratic mean: $X_n \xrightarrow{q.m.} c$ if $\lim_{n\to\infty} E[(X_n - c)^2] = 0$

Convergence in distribution: $X_n \xrightarrow{d} X$ if $\lim_{n\to\infty} F_n(x) = F(x)$ at all points $x$ at which $F(x)$ is continuous [$F$ is the asymptotic distribution of $X_n$]
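As a quick illustration (not part of the original notes), a minimal Monte Carlo sketch: take $X_n = c + Z_n$ with $Z_n \sim N(0, 1/n)$, so $E[(X_n - c)^2] = 1/n \to 0$ and $X_n$ converges to $c$ in quadratic mean, hence also in probability. The constant $c$, the tolerance 0.1, and the replication count are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
c = 2.0  # the constant limit

for n in [10, 100, 1_000, 10_000]:
    # 100,000 replications of X_n = c + Z_n with Z_n ~ N(0, 1/n)
    x_n = c + rng.normal(0.0, 1.0 / np.sqrt(n), size=100_000)
    msd = np.mean((x_n - c) ** 2)             # estimates E[(X_n - c)^2] = 1/n
    p_close = np.mean(np.abs(x_n - c) < 0.1)  # estimates Pr(|X_n - c| < 0.1)
    print(f"n={n:6d}  E[(X_n - c)^2] ~ {msd:.5f}  Pr(|X_n - c| < 0.1) ~ {p_close:.3f}")
```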

The definitions of convergence in probability, quadratic mean (mean square) and a.s. (almost sure) can be generalised to the case in which $c$ is not a constant but a random variable whose distribution does not depend on $n$. For example, we say that $X_n \xrightarrow{p} X$ if $X_n - X \xrightarrow{p} 0$.

If a random sequence converges in q.m., then it converges in probability and in distribution: using Chebyshev's inequality, we can prove that $X_n \xrightarrow{q.m.} X \Rightarrow X_n \xrightarrow{p} X$. Conversely, $X_n \xrightarrow{d} X \Rightarrow X_n \xrightarrow{p} X$ only if $X$ is a constant (the distribution of $X$ is degenerate).

Convergence of random vectors

As with non-stochastic limits, these concepts extend immediately to vectors and matrices of finite dimension. For example, convergence in probability is said to hold for a random vector if it holds for each of its components. Let $X_n = (X_{1n}, \dots, X_{kn})'$ and $X = (X_1, \dots, X_k)'$. Then

$$X_n \xrightarrow{p} X \iff X_{jn} \xrightarrow{p} X_j \quad (j = 1, \dots, k)$$

Cramér–Wold Device: $X_n \xrightarrow{d} X$ if for any real constants $b_1, \dots, b_k$,

$$\sum_{j=1}^{k} b_j X_{jn} \xrightarrow{d} \sum_{j=1}^{k} b_j X_j$$

Convergence of transformations

1. Continuous Mapping Theorem: Let $X_1, X_2, \dots$ and $X$ be random $k$-vectors and $g$ be a vector-valued continuous function on $\mathbb{R}^k$. Then:

(a) $X_n \xrightarrow{a.s.} X \Rightarrow g(X_n) \xrightarrow{a.s.} g(X)$
(b) $X_n \xrightarrow{p} X \Rightarrow g(X_n) \xrightarrow{p} g(X)$
(c) $X_n \xrightarrow{d} X \Rightarrow g(X_n) \xrightarrow{d} g(X)$

e.g. $X_n \xrightarrow{d} X \sim N(0, I_k) \Rightarrow X_n' X_n \xrightarrow{d} X'X \sim \chi^2(k)$
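A simulation check of this example (an illustrative sketch, not from the notes): here $X_n$ is built as a standardized sample mean of iid uniform $k$-vectors, so $X_n \xrightarrow{d} N(0, I_k)$ by the CLT, and the continuous mapping theorem predicts $X_n' X_n \xrightarrow{d} \chi^2(k)$. The dimension, sample size and replication count are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
k, n, reps = 3, 500, 5_000

# X_n: standardized sample mean of iid Uniform(-1, 1) k-vectors.
# CLT: X_n ->d N(0, I_k); CMT: X_n' X_n ->d chi^2(k).
u = rng.uniform(-1.0, 1.0, size=(reps, n, k))
x_n = np.sqrt(n) * u.mean(axis=1) / np.sqrt(1.0 / 3.0)  # var of U(-1,1) is 1/3
q = np.sum(x_n ** 2, axis=1)                            # quadratic form X_n' X_n

print("mean of X_n'X_n:", q.mean(), "(chi^2(k) mean = k =", k, ")")
print("var  of X_n'X_n:", q.var(), "(chi^2(k) var = 2k =", 2 * k, ")")
```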

2. Slutsky's Theorem: Let $\{X_n\}$ and $\{Y_n\}$ be sequences of random vectors such that $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{p} c$ (a constant). Then

(a) $X_n + Y_n \xrightarrow{d} X + c$
(b) $Y_n' X_n \xrightarrow{d} c' X$
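A standard application (added here for illustration): the t-ratio $\sqrt{n}(\bar{X}_n - \mu)/s_n$ is asymptotically $N(0,1)$, since $\sqrt{n}(\bar{X}_n - \mu) \xrightarrow{d} N(0, \sigma^2)$ and $1/s_n \xrightarrow{p} 1/\sigma$, so Slutsky's theorem applies. A minimal sketch with exponential data (sample size and replication count arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps, mu = 500, 20_000, 5.0

# Exponential(mu) data: mean mu and standard deviation mu (skewed).
x = rng.exponential(mu, size=(reps, n))
# s_n ->p sigma, so t_n = sqrt(n)(Xbar_n - mu)/s_n ->d N(0, 1) by Slutsky.
t = np.sqrt(n) * (x.mean(axis=1) - mu) / x.std(axis=1, ddof=1)

print("mean ~ 0:", t.mean(), " variance ~ 1:", t.var())
print("Pr(|t| > 1.96) ~ 0.05:", np.mean(np.abs(t) > 1.96))
```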

3. Cramér's Linear Transformation Theorem: Let $\{X_n\}$ be a sequence of random vectors and $\{A_n\}$ be a sequence of random square matrices. Then $X_n \xrightarrow{d} X$ and $A_n \xrightarrow{p} A$ imply $A_n X_n \xrightarrow{d} AX$.

Example: $X_n \xrightarrow{d} N(\mu, \Sigma)$ and $A_n \xrightarrow{p} A$ $\Rightarrow$ $A_n X_n \xrightarrow{d} N(A\mu, A\Sigma A')$

4. $X_n \xrightarrow{d} X$ and $(X_n - Y_n) \xrightarrow{p} 0$ $\Rightarrow$ $Y_n \xrightarrow{d} X$

5. $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{p} 0$ $\Rightarrow$ $X_n' Y_n \xrightarrow{p} 0$

6. Delta Method: Let $\{X_n\}$ be a sequence of random $p$-vectors and $g : \mathbb{R}^p \to \mathbb{R}^m$ a function with continuous first derivatives. Then

$$\sqrt{n}(X_n - c) \xrightarrow{d} X \;\Rightarrow\; \sqrt{n}\,(g(X_n) - g(c)) \xrightarrow{d} GX, \quad \text{where } G = \left.\frac{\partial g}{\partial x'}\right|_{x=c}$$

Proof: Consider any real vector $\lambda$ ($m \times 1$) and form the function $h(x) = \lambda' g(x)$, $h(\cdot)$ differentiable. Then, using the mean-value theorem, there exists $c_n$ between $X_n$ and $c$ such that

$$h(X_n) - h(c) = \left.\frac{\partial h(x)}{\partial x'}\right|_{x=c_n} (X_n - c)$$

and therefore

$$\sqrt{n}\,[h(X_n) - h(c)] = \left.\frac{\partial h(x)}{\partial x'}\right|_{x=c_n} \sqrt{n}\,(X_n - c)$$

Since $c_n$ is between $X_n$ and $c$, and since $X_n \xrightarrow{p} c$, we have $c_n \xrightarrow{p} c$. Then, given that $\partial h/\partial x'$ is a continuous function,

$$\left.\frac{\partial h(x)}{\partial x'}\right|_{x=c_n} \xrightarrow{p} \left.\frac{\partial h(x)}{\partial x'}\right|_{x=c}$$

Since $\sqrt{n}(X_n - c) \xrightarrow{d} X$, Cramér's linear transformation theorem gives

$$\sqrt{n}\,[h(X_n) - h(c)] \xrightarrow{d} \left.\frac{\partial h(x)}{\partial x'}\right|_{x=c} X$$

or, in terms of the original function (recall that $h(x) \equiv \lambda' g(x)$),

$$\sqrt{n}\,\lambda'[g(X_n) - g(c)] \xrightarrow{d} \lambda' \left.\frac{\partial g(x)}{\partial x'}\right|_{x=c} X$$

Since this is true for any $\lambda$, we conclude (by the Cramér–Wold device) that

$$\sqrt{n}\,[g(X_n) - g(c)] \xrightarrow{d} \left.\frac{\partial g(x)}{\partial x'}\right|_{x=c} X$$

Example 1: Let $X_1, X_2, \dots, X_n$ be a random sample from $D(\mu, \sigma^2)$, and let

$$\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad S_n = \frac{1}{\bar{X}_n}$$

$$\sqrt{n}\,(\bar{X}_n - \mu) \xrightarrow{d} X, \quad X \sim N(0, \sigma^2)$$

$g(x) = \frac{1}{x}$ is continuous at $x = \mu \ne 0$, with $G = \left.\frac{\partial g}{\partial x}\right|_{x=\mu} = -\frac{1}{\mu^2}$, so

$$\sqrt{n}\left(S_n - \frac{1}{\mu}\right) \xrightarrow{d} -\frac{1}{\mu^2}X, \qquad -\frac{1}{\mu^2}X \sim N\left(0, \frac{\sigma^2}{\mu^4}\right)$$
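A Monte Carlo check of Example 1 (an illustrative sketch; the parameter values are arbitrary): for normal data $\bar{X}_n \sim N(\mu, \sigma^2/n)$ exactly, so we can draw $\bar{X}_n$ directly and compare the simulated variance of $\sqrt{n}(S_n - 1/\mu)$ with the delta-method value $\sigma^2/\mu^4$.

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n, reps = 2.0, 1.5, 4_000, 100_000

# For normal data, Xbar_n ~ N(mu, sigma^2/n) exactly, so draw it directly.
xbar = rng.normal(mu, sigma / np.sqrt(n), size=reps)
z = np.sqrt(n) * (1.0 / xbar - 1.0 / mu)   # sqrt(n)(S_n - 1/mu)

print("simulated variance:        ", z.var())
print("delta-method sigma^2/mu^4: ", sigma ** 2 / mu ** 4)
```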

Law of Large Numbers


Given restrictions on the dependence, heterogeneity, and moments of a sequence of random variables $\{X_t\}$:

$$\frac{1}{n}\sum_{t=1}^{n} X_t - E\left(\frac{1}{n}\sum_{t=1}^{n} X_t\right) \xrightarrow{p} 0$$

If $\{X_t\}$ are iid with $E(X_t) = \mu < \infty$, then

$$\frac{1}{n}\sum_{t=1}^{n} X_t \xrightarrow{p} \mu \quad \text{(Kolmogorov)}$$

as $n \to \infty$. We can also apply the LLN to functions of $X_t$, e.g.,

$$\frac{1}{n}\sum_{t=1}^{n} X_t^2 \xrightarrow{p} E(X_t^2) \quad \text{if i.i.d.}$$
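A minimal numerical illustration of the LLN (not from the notes; the Bernoulli design and sample sizes are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
p = 0.25  # mu = E(X_t) for Bernoulli(0.25) draws

for n in [100, 10_000, 1_000_000]:
    x = rng.binomial(1, p, size=n)
    # For Bernoulli draws X_t^2 = X_t, so E(X_t^2) = p as well.
    print(f"n={n:9d}  mean of X_t: {x.mean():.5f}  mean of X_t^2: {(x ** 2).mean():.5f}  (both ->p {p})")
```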

Example 2: Bivariate Linear Regression

$$Y_t = \alpha + \beta X_t + u_t, \qquad y_t = Y_t - \bar{Y}, \quad x_t = X_t - \bar{X}$$

$$\hat{\beta} = \frac{\sum y_t x_t}{\sum x_t^2} = \beta + \frac{\sum x_t u_t}{\sum x_t^2}$$

$$\text{plim}\,\hat{\beta} = \beta + \frac{\text{plim}\, n^{-1}\sum x_t u_t}{\text{plim}\, n^{-1}\sum x_t^2} = \beta + \frac{E(n^{-1}\sum x_t u_t)}{\text{var}(x_t)} = \beta$$

provided that $E(x_t u_t) = 0$. This means that $\hat{\beta}$ is a consistent estimator of $\beta$.
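A simulation sketch of this consistency result (illustrative only; the coefficient values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
alpha, beta = 1.0, 0.5  # illustrative true coefficients

for n in [50, 5_000, 500_000]:
    X = rng.normal(size=n)
    u = rng.normal(size=n)        # error uncorrelated with the regressor
    Y = alpha + beta * X + u
    x = X - X.mean()              # deviations from sample means
    y = Y - Y.mean()
    beta_hat = (y * x).sum() / (x ** 2).sum()
    print(f"n={n:7d}  beta_hat = {beta_hat:.4f}  (beta = {beta})")
```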

Consistency
An estimator $\hat{\theta}_n$ is said to be consistent for $\theta$ if $\hat{\theta}_n \xrightarrow{p} \theta$ as $n \to \infty$.

A sufficient (but not necessary) condition for consistency is that

$$\lim_{n\to\infty} E(\hat{\theta}_n) = \theta \quad \text{and} \quad \lim_{n\to\infty} \text{var}(\hat{\theta}_n) = 0$$

These conditions imply that as $n$ tends to infinity the sampling distribution of $\hat{\theta}_n$ becomes less and less dispersed and eventually collapses (becomes degenerate) at $\theta$. For example, for an iid sample the sample mean has $E(\bar{X}_n) = \mu$ and $\text{var}(\bar{X}_n) = \sigma^2/n \to 0$, so $\bar{X}_n$ is consistent for $\mu$.

Example 3: Multiple Linear Regression

$$y = X\beta + u, \qquad E(u) = 0, \quad E(uu') = \sigma^2 I_n$$

$$\text{plim}\left(\frac{1}{n}X'X\right) = Q < \infty, \text{ positive definite}$$

$$\hat{\beta} = \beta + (X'X)^{-1}X'u = \beta + \left(\frac{1}{n}X'X\right)^{-1}\left(\frac{1}{n}X'u\right)$$

$$\text{plim}\,\hat{\beta} = \beta + \text{plim}\left(\frac{1}{n}X'X\right)^{-1}\text{plim}\left(\frac{1}{n}X'u\right) = \beta + Q^{-1}\,\text{plim}\left(\frac{1}{n}X'u\right)$$

$$\frac{1}{n}X'u = \begin{pmatrix} \frac{1}{n}\sum_t u_t \\ \frac{1}{n}\sum_t x_{2t}u_t \\ \vdots \\ \frac{1}{n}\sum_t x_{kt}u_t \end{pmatrix}$$

$$\text{plim}\left(\frac{1}{n}\sum_{t=1}^{n} u_t\right) = E(u_t) = 0$$

$$\text{plim}\left(\frac{1}{n}\sum_{t=1}^{n} x_{jt}u_t\right) = E(x_{jt}u_t) = 0 \quad (j = 2, \dots, k)$$

(provided that the error is uncorrelated with the regressors). Thus,

$$\text{plim}\left(\frac{1}{n}X'u\right) = 0 \quad \text{and} \quad \text{plim}\,\hat{\beta} = \beta$$
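The key step, $\text{plim}(n^{-1}X'u) = 0$, can be checked numerically (an illustrative sketch with standard-normal regressors and errors; the design is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
k = 3

for n in [100, 10_000, 1_000_000]:
    X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
    u = rng.normal(size=n)            # E(u) = 0, independent of X
    g = X.T @ u / n                   # (1/n) X'u, a k-vector
    print(f"n={n:9d}  (1/n) X'u = {np.round(g, 5)}")
```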

Central Limit Theorem


Given restrictions on the dependence, heterogeneity, and moments of a scalar stochastic sequence $\{X_t\}$:

$$\sqrt{n}\,(\bar{X}_n - \bar{\mu}_n) \xrightarrow{d} N(0, \bar{\sigma}_n^2)$$

where

$$\bar{X}_n = \frac{1}{n}\sum_{t=1}^{n} X_t, \qquad \bar{\mu}_n = E(\bar{X}_n), \qquad \bar{\sigma}_n^2 = n\,\text{var}(\bar{X}_n)$$

If $\{X_t\}$ is an iid sequence with $E(X_t) = \mu$ and $\text{var}(X_t) = \sigma^2 < \infty$, then

$$\sqrt{n}\,(\bar{X}_n - \bar{\mu}_n) = \sqrt{n}\,(\bar{X}_n - \mu) = \frac{1}{\sqrt{n}}\sum_{t=1}^{n} (X_t - \mu) \xrightarrow{d} N(0, \sigma^2)$$

i.e. $\bar{X}_n \stackrel{a}{\sim} N(\mu, \frac{\sigma^2}{n})$ (Lindeberg–Lévy; extends directly to random vectors).

Let $\{X_t\}$ be independent with $E(X_t) = \mu_t$, $\text{var}(X_t) = \sigma_t^2$ and $E(|X_t - \mu_t|^{2+\delta}) < \Delta < \infty$ for some $\delta > 0$ and all $t$. If $\bar{\sigma}_n^2 > 0$ for $n$ sufficiently large, then

$$\sqrt{n}\,(\bar{X}_n - \bar{\mu}_n) \xrightarrow{d} N(0, \bar{\sigma}^2)$$

where $\bar{\sigma}^2$ is the asymptotic variance of $\sqrt{n}(\bar{X}_n - \bar{\mu}_n)$ (or asymptotic variance of $\bar{X}_n$).
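A minimal CLT illustration (not in the original notes): exponential data are heavily skewed, yet the standardized sample mean has approximately standard normal tail probabilities. Reference values for $N(0,1)$: $\Pr(|Z|>1) \approx 0.317$, $\Pr(|Z|>1.96) \approx 0.050$, $\Pr(|Z|>2.58) \approx 0.010$.

```python
import numpy as np

rng = np.random.default_rng(6)
n, reps = 1_000, 10_000
mu = sigma = 1.0  # Exponential(1) has mean 1 and sd 1, but is heavily skewed

x = rng.exponential(mu, size=(reps, n))
z = np.sqrt(n) * (x.mean(axis=1) - mu) / sigma  # should be close to N(0, 1)

for c in [1.0, 1.96, 2.58]:
    print(f"Pr(|Z| > {c}): simulated {np.mean(np.abs(z) > c):.4f}")
```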

A useful multivariate CLT (Lindeberg–Feller): Let $\{X_t\}$ be a sequence of independent random vectors with $E(X_t) = 0$, $\text{var}(X_t) = \Sigma_t$ and distribution functions $F_t(x) = P(X_t \le x)$. Suppose further that

$$\lim_{n\to\infty} \frac{1}{n}\sum_{t=1}^{n} \Sigma_t = \Sigma \ne 0$$

and, for every $\varepsilon > 0$,

$$\lim_{n\to\infty} \frac{1}{n}\sum_{t=1}^{n} \int_{\|x\| > \varepsilon\sqrt{n}} \|x\|^2 \, dF_t(x) = 0$$

(this Lindeberg condition ensures that no single random variable dominates the sum). Then

$$\frac{1}{\sqrt{n}}\sum_{t=1}^{n} X_t \xrightarrow{d} N(0, \Sigma)$$

Convergence of moments
Consider a random sequence $\{X_n\}$ such that $X_n \xrightarrow{d} X$ (i.e. $\lim_{n\to\infty} F_n(x) = F(x)$). The asymptotic moments of $X_n$ are defined in terms of the asymptotic distribution $F(x)$:

$$E_a(X_n^r) \equiv E(X^r) = \int x^r \, dF(x), \quad r \ge 1$$

The limit of the $r$-th moment is defined in terms of $F_n(x)$:

$$\lim_{n\to\infty} E(X_n^r) = \lim_{n\to\infty} \int x^r \, dF_n(x)$$

In general, there is no reason why $E_a(X_n^r)$ and $\lim_{n\to\infty} E(X_n^r)$ should be equal.

It is possible that $X_n \xrightarrow{d} X$ and $\lim_{n\to\infty} E(X_n^r) = E_a(X_n^r)$; but it is also possible that $X_n \xrightarrow{a.s.} X$ and yet $\lim_{n\to\infty} E(X_n^r) \ne E_a(X_n^r)$.
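A standard counterexample makes the distinction concrete (added for illustration, not in the original notes): let $X_n = n$ with probability $1/n$ and $X_n = 0$ otherwise. Then $X_n \xrightarrow{p} 0$, so $E_a(X_n) = 0$, yet $E(X_n) = 1$ for every $n$, so $\lim_n E(X_n) = 1 \ne E_a(X_n)$. Proposition 1 below does not apply here because $E(X_n^2) = n$ is not uniformly bounded.

```python
import numpy as np

rng = np.random.default_rng(7)

# X_n = n with probability 1/n, else 0: X_n ->p 0 (so E_a(X_n) = 0),
# yet E(X_n) = 1 for every n, so lim E(X_n) = 1 != E_a(X_n).
for n in [10, 1_000, 100_000]:
    x_n = np.where(rng.random(200_000) < 1.0 / n, float(n), 0.0)
    print(f"n={n:7d}  Pr(X_n = 0) ~ {np.mean(x_n == 0):.5f}  E(X_n) ~ {x_n.mean():.3f}")
```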

Proposition 1: If $E(|X_n|^r) \le \Delta < \infty$ for all $n$, then $\lim_{n\to\infty} E(X_n^q) = E_a(X_n^q)$ for any $q < r$. In particular, if $E(X_n^2) \le \Delta$, then $\lim_{n\to\infty} E(X_n) = E_a(X_n)$.

If $\{\hat{\theta}_n\}$ is a sequence of estimators of $\theta$, then $\hat{\theta}_n$ is asymptotically unbiased if $E_a(\hat{\theta}_n) = \theta$ (if $\text{plim}\,\hat{\theta}_n = \theta$, then $E_a(\hat{\theta}_n) = \theta$). $\hat{\theta}_n$ is asymptotically normal if $\sqrt{n}(\hat{\theta}_n - \theta) \xrightarrow{d} N(0, V)$, where $V$ is the asymptotic covariance matrix of $\sqrt{n}(\hat{\theta}_n - \theta)$. If $\hat{\theta}_n$ is also consistent, this class of estimators is known as the consistent asymptotically normal (CAN) class.

OLS Asymptotics: Non-stochastic Regressors


$$y_t = x_t'\beta + u_t \quad \text{or} \quad y = X\beta + u \qquad (x_t \text{ is } k \times 1)$$

a) $\{u_t\}$ are iid with $E(u) = 0$ and $E(uu') = \sigma^2 I_n$
b) $|x_{it}| < c$ for some $c > 0$ $(i = 1, 2, \dots, k)$
c) $\lim_{n\to\infty}\left(\frac{1}{n}X'X\right) = Q < \infty$ and positive definite

We first show that

$$\frac{1}{\sqrt{n}}X'u \xrightarrow{d} N(0, \sigma^2 Q)$$

Notice that

$$\frac{1}{\sqrt{n}}X'u = \frac{1}{\sqrt{n}}\sum_{t=1}^{n} x_t u_t$$

where $\{x_t u_t\}$ is an independent sequence of random vectors with $E(x_t u_t) = 0$ and $\text{var}(x_t u_t) = \sigma^2 x_t x_t' \equiv \sigma^2 Q_t$. Hence, applying the Lindeberg–Feller CLT to $\{x_t u_t\}$, we get

$$\frac{1}{\sqrt{n}}\sum_{t=1}^{n} x_t u_t \xrightarrow{d} N(0, \Sigma), \qquad \Sigma = \lim_{n\to\infty}\frac{1}{n}\sum_{t=1}^{n} \text{var}(x_t u_t) = \lim_{n\to\infty}\frac{1}{n}\sum_{t=1}^{n} \sigma^2 x_t x_t' = \sigma^2 \lim_{n\to\infty}\left(\frac{1}{n}X'X\right) = \sigma^2 Q$$

so that

$$\frac{1}{\sqrt{n}}X'u \xrightarrow{d} N(0, \sigma^2 Q)$$

Since $\hat{\beta} = \beta + (X'X)^{-1}X'u$, we have

$$\sqrt{n}\,(\hat{\beta} - \beta) = \left(\frac{1}{n}X'X\right)^{-1}\frac{1}{\sqrt{n}}X'u$$

Cramér's linear transformation theorem implies that

$$\sqrt{n}\,(\hat{\beta} - \beta) \xrightarrow{d} \lim_{n\to\infty}\left(\frac{1}{n}X'X\right)^{-1} N(0, \sigma^2 Q)$$

Notice that, using the continuous mapping theorem, $\lim_{n\to\infty}\left(\frac{1}{n}X'X\right)^{-1} = Q^{-1}$, so

$$\sqrt{n}\,(\hat{\beta} - \beta) \xrightarrow{d} N(0, \sigma^2 Q^{-1}QQ^{-1}) \equiv N(0, \sigma^2 Q^{-1})$$

or


$$\hat{\beta} \stackrel{a}{\sim} N\left(\beta, \frac{\sigma^2}{n}Q^{-1}\right)$$

In practice $\sigma^2$ and $Q^{-1}$ are unknown and they are replaced by

$$\hat{\sigma}^2 = \frac{1}{n-k}\hat{u}'\hat{u} \quad \text{and} \quad \left(\frac{1}{n}X'X\right)^{-1}$$

respectively. Thus, the asymptotic covariance matrix of $\hat{\beta}$ is estimated by $\hat{\sigma}^2(X'X)^{-1}$, which is the finite-sample estimator of $\text{var}(\hat{\beta})$. The asymptotic normality of $\hat{\beta}$ ensures that conventional $t$ and $F$ tests are asymptotically valid. Critical values should be taken from $N(0,1)$ rather than $t(n-k)$ and from $\chi^2(q)/q$ instead of $F(q, n-k)$.
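A sketch of these asymptotic tests in practice (illustrative only; the design and coefficient values are arbitrary): compute $\hat{\beta}$, estimate its covariance by $\hat{\sigma}^2(X'X)^{-1}$, and compare $t$-ratios with $N(0,1)$ critical values.

```python
import numpy as np

rng = np.random.default_rng(8)
n, k = 2_000, 3
beta = np.array([1.0, 0.5, -0.2])  # illustrative true coefficients

X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
u = rng.normal(size=n)
y = X @ beta + u

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ (X.T @ y)
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - k)          # u'u / (n - k)
se = np.sqrt(sigma2_hat * np.diag(XtX_inv))   # from sigma2_hat * (X'X)^{-1}

t_stats = (beta_hat - beta) / se              # each approximately N(0, 1)
print("t statistics under the true beta:", np.round(t_stats, 3))
print("reject at 5% (|t| > 1.96):", np.abs(t_stats) > 1.96)
```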

OLS Asymptotics: Stochastic Regressors


$$y = X\beta + u$$

Under certain conditions on the behaviour of the stochastic regressors, it can be shown that

$$\frac{1}{\sqrt{n}}X'u \xrightarrow{d} N(0, \sigma^2 Q), \qquad \text{where } Q = \text{plim}_{n\to\infty}\left(\frac{1}{n}X'X\right)$$

Proceeding as in the case of non-stochastic regressors,

$$\sqrt{n}\,(\hat{\beta} - \beta) \xrightarrow{d} N(0, \sigma^2 Q^{-1})$$
