Asymptotic Theory

Asymptotic Theory
Let {Xn : n = 1, 2, ...} be a sequence of random variables indexed by n and Fn (x) = Pr(Xn x) be the distribution function of Xn . Further, let X be a random variable with distribution function F (x) = Pr(X x) and c be a constant. Almost sure convergence: Xn c if Pr(limn Xn = c) = 1 Convergence in probability: Xn c if limn Pr(|Xn c| ) = 1, > 0 Convergence in quadratic mean: Xn c if limn E[(Xn c)2 ] = 0 Convergence in distribution: Xn X if limn Fn (x) = F (x) at all points x at which F (x) is continuous [F is the asymptotic distribution of Xn ]
d q.m. p a.s.
The denitions of convergence in probability, quadratic mean (mean square) and a.s.(almost sure) can be generalised to the case in which c is not a constant but a random variable whose distribution does not depend on n. For example, we say that Xn X if Xn X 0. If a random sequence converges in q.m., then it converges in probability and in distribution & % Using Chebyshevs inequality, we can prove that: Xn X = Xn X only if X is a constant (the distribution of X is degenerate)
d p q.m. a.s. p p
Convergence of random vectors 1
As with non-stochastic limits, these concepts extend immediately to vectors and matrices of nite dimension. For example, convergence in probability is said to hold for a random vector if it holds for each of its components. X n = (X1n , ..., Xkn )0 , X = (X1 , ..., Xk )0
X n X i Xjn Xj (j = 1, ..., k) Cramer-Wold Device X n X if for any real constants b1 , ..., bk ,

k X j=1 d
bj Xjn
k X j=1
bj Xj
Convergence of transformations 1. Continuous Mapping Theorem: Let X1 , X2 , .. and X be random k-vectors and g be a vector-valued continuous function on <k . Then: (a) Xn X = g(Xn ) g (X) (b) Xn X = g(Xn ) g (X)
0 e.g. Xn X N (0, Ik ) = Xn Xn X 0 X 2 (k) p p a.s. a.s.
(c) Xn X = g(Xn ) g (X)

d
2. Slutskys Theorem: Let {Xn } and {Yn } be sequences of random vectors such that Xn X and Yn c (constant). Then (a) Xn + Yn X + c
0 (b) Yn Xn c0 X d d d p
3. Cramers Linear transformation Theorem: Let {Xn } be a sequence of random vectors and {An } be a sequence of random 2
square matrices. Then Xn X and An A implies An Xn AX. Example Xn N (, ) and An A = An Xn N (A, AA0 ) 4. Xn X and (Xn Yn ) 0 = Yn X 5. Xn X and Yn 0 = Xn Yn 0 6. Delta Method: Let {Xn } be a sequence of random p-vectors and g : <p < a function with continuous rst derivatives. Then, d d n(Xn c) X = n(g (Xn ) g (c)) GX, where G =
g x0 x=c d p p d p d d p d
Proof : Consider any real vector m1 and form the function h(x) = 0 g(x), h() dierentiable. Then using the mean-value theorem, there exists cn between Xn and c such that h0 (x) (Xn c) h(Xn ) h(c) = x x=cn
p p
and therefore
Since cn is between Xn and c, and since Xn c = cn c. Then,given that h is a continuous function, we have that h(x) p h(x) x0 x=cn x0 x=c d h(x) n[h(Xn ) h(c)] X x0 x=c 3
h(x) n[h(Xn ) h(c)] = n(Xn c) 0 x x=cn
Since
d n(Xn c) X,
or in terms of the original function (recall that h(x) 0 g(x))
g(x) d 0 { n[g(Xn ) g(c)]} 0 X x0 Since this is true for any we conclude that d g(x) n[g(Xn ) g(c)] X x0 x=c
Example 1 Let X1 , X2 , ...Xn be a random vector sample from D(, 2 ) d n(X n ) X, X N (0, 2 )
Sn =
1 Xn ,
Xn = n
n X i=1
Xi
g 1 1 g(x) = continuous at x = , G = = 2 0 x x x= 1 d 1 1 2 n(Sn ) 2 X, 2 X N (0, 4 )
Law of Large Numbers

Given restrictions on the dependence, heterogeneity, and moments of a sequence of random variables {Xt } : 1X p Xt E n t=1
n n
If {Xt } are iid with E(Xt ) = < , then
1X Xt n t=1
as n .
We can also apply LLN to functions of Xt , e.g., 1X 2 p X E n t=1 t

n
1X p Xt (Kolmogorov) n t=1 1X 2 X n t=1 t

n
2 = E(Xt ) if i.i.d. .
Example 2 Bivariate Linear Regression
Yt = a + Xt + ut , yt = Yt Y , xt = Xt X P P yt xt xt ut = P 2 =+ P 2 xt xt
This means that is a consistent estimator of .
P P E(n1 xt ut ) p lim n1 xt ut P 2 =+ p lim = + = n var(xt ) p lim n1 xt
Consistency
p An estimator n is said to be consistent for if n as n .
A sucient (but not necessary) condition for consistency is that lim E(n ) = lim var(n ) = 0
and
These conditions imply that as n tends to innity the sampling distribution of n becomes less and less dispersed and eventually collapses (becomes degenerate) at . Example 3 Multiple Linear Regression y = X + u, E(u) = 0, E(uu0 ) = 2 In
1 p lim ( X 0 X) = Q < , positive denite n n 1 1 = + (X 0 X)1 X 0 u = + ( X 0 X)1 ( X 0 u) n n
1 1 p lim = + p lim( X 0 X)1 p lim( X 0 u) n n n 1 = + Q1 p lim( X 0 u) n

1 n 1 n
1 0 Xu= n
n
1 n
p lim(
1X 1X ut ) = E( ut ) = 0 n t=1 n t=1 6
ut
x2t ut . . . xkt ut
n
p lim(
(provided that the error is uncorrelated with the regressors) Thus, 1 p lim( X 0 u) = 0 and p lim = n
1X 1X x2t ut ) = E(x2t ut ) = 0 n t=1 n t=1
Central Limit Theorem

Given restrictions on the dependence, heterogeneity, and moments of a scalar stochastic sequence {Xt }: n(X n n ) N(0, 2 ) n where 1X Xn = Xt , n t=1
n
n = E(X n ),
2 = nvar(X n ) n
If {Xt } is an iid sequence and var(xt ) = 2 < , then
n 1 X d n(X n n ) = n(X n ) = (Xt ) N (0, 2 ) n t=1

2
i.e. X n N (, ) (Lindeberg-Levy) (extends directly to random vectors) n Let {Xt } be independent with E(Xt ) = t , var(Xt ) = 2 and E(|Xt t |2+ ) < t for some > 0 and all t. If 2 > 0 n suciently large, n
d n(X n n ) N (0, 2 ) n n(X n n ) (or asymptotic variance of X n ) 7
2 =asymptotic variance of n
A useful multivariate CLT (LindebergFeller) Let {Xt } be a sequence of independent random vectors with E(Xt ) = 0, var(Xt ) = t and distribution functions Ft (x) = P (Xt x). Suppose further that 1X lim t = 6= 0 n n t=1 1X lim n n t=1
n n
and for every > 01
|x|> n
kxk2 dFt (x) = 0.
Then,
1 X d Xt N(0, ) n t=1
Convergence of moments
Consider a random sequence {Xn } such that Xn X (i.e. limn Fn (x) = F (x)) asymptotic moments of Xn are:2
r Ea (Xn )
1
E(X ) =
X r dF (x), r 1
No random variable dominates the sequence of the sum
Pr(|X| n) 2 max Pr(|X| n)
dened in terms of the asymptotic distribution F (x)
The limit of the r-th moment is dened as3 Z r lim E(Xn ) = X r dFn (x)
n r r In general, there is no reason why Ea (Xn ) and lim E(Xn ) should be equal. n
r r If Xn X = lim E(Xn ) = Ea (Xn ) n
If Xn X =
a.s.
r lim E(Xn ) n
6=
r Ea (Xn )
q q Proposition 1 If E(|Xn |r ) < for all n, then lim E(Xn ) = Ea (Xn ) for any n 2 q < r. In particular, if E(Xn ) < , then lim E(Xn ) = Ea (Xn ). n
If {n } is a sequence of estimators of , then n is asymptotically unbiased if Ea (n ) = . (if p lim n = , then Ea (n ) = ) d n is asymptotically normal if n(n ) N (0, V ), V is the asymptotic covariance matrix of n(n ). If n is consistent, this class of estimators is known as the consistent asymptotically normal (CAN) class.
OLS Asymptotics Non-stochastic Regressors

yt = x0 + ut or y = X + u t a) {ut } are iid with E(u) = 0 and E(uu0 ) = 2 In b) |xit | < c for some c > 0 (i = 1, 2, ..., k)
1 c) lim ( n X 0 X) = Q < and positive denite n
3
(xt is k 1)
dened in terms of Fn (x)
We rst show that 1 d X 0 u N (0, 2 Q). n Notice that 1 1 X X 0u = xt ut , n n t=1

n
where {xt ut } is an independent sequence of random vectors with E(xt ut ) = 0 and var(xt ut ) = 2 xt x0 2 Qt . Hence, applying the LindebergFeller CLT to {xt ut } t 1 X d xt ut N (0, ), n t=1
n n
we get
= lim
1X var(xt ut ) n n t=1
1X 2 1 = lim xt x0 = 2 lim ( X 0 X) = 2 Q, t n n n n t=1 1 d X 0 u N (0, 2 Q) n Since = + (X 0 X)1 X 0 u, we have 1 n( ) = ( X 0 X)1 n 1 0 Xu n
Cramers linear transformation theorem implies that 1 d n( ) lim ( X 0 X)1 N (0, 2 Q) n n

1 Notice that using the continuous mapping theorem, lim( n X 0 X)1 = Q1 ,
d n( ) N(0, 2 Q1 QQ1 ) N (0, 2 Q1 ) or
10
1 AN (, 2 Q1 ). n In practice 2 and Q1 are unknown and they are replaced by: 2 = 1 1 u0 u and ( X 0 X) nk n
respectively. Thus, the asymptotic covariance matrix of is estimated by 2 (X 0 X)1 , which is the nite-sample estimator of var(). The asymptotic normality of ensures that conventional t and F tests are asymptotically valid. Critical values should be taken from N (0,1) rather than t(n k) and from
2 (q) q
instead of F (q, n k).
OLS asymptotics Stochastic Regressors

y = X + u Under certain condition on the behaviour of the stochastic regressors, it can be shown that 1 d X 0 u N (0, 2 Q) n where 1 Q = p lim ( X 0 X) n n Proceeding as in the case of non-stochastic regressors, d T ( ) N (0, 2 Q1 )
11

Asymptotic Theory

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Asymptotic Theory

Uploaded by

Copyright:

Available Formats

Asymptotic Theory

Convergence of random vectors 1

X n X i Xjn Xj (j = 1, ..., k) Cramer-Wold Device X n X if for any real constants b1 , ..., bk ,

(c) Xn X = g(Xn ) g (X)

h(x) n[h(Xn ) h(c)] = n(Xn c) 0 x x=cn

or in terms of the original function (recall that h(x) 0 g(x))

g 1 1 g(x) = continuous at x = , G = = 2 0 x x x= 1 d 1 1 2 n(Sn ) 2 X, 2 X N (0, 4 )

Law of Large Numbers

If {Xt } are iid with E(Xt ) = < , then

We can also apply LLN to functions of Xt , e.g., 1X 2 p X E n t=1 t

1X p Xt (Kolmogorov) n t=1 1X 2 X n t=1 t

Example 2 Bivariate Linear Regression

This means that is a consistent estimator of .

P P E(n1 xt ut ) p lim n1 xt ut P 2 =+ p lim = + = n var(xt ) p lim n1 xt

1 p lim ( X 0 X) = Q < , positive denite n n 1 1 = + (X 0 X)1 X 0 u = + ( X 0 X)1 ( X 0 u) n n

1 1 p lim = + p lim( X 0 X)1 p lim( X 0 u) n n n 1 = + Q1 p lim( X 0 u) n

1X 1X x2t ut ) = E(x2t ut ) = 0 n t=1 n t=1

Central Limit Theorem

If {Xt } is an iid sequence and var(xt ) = 2 < , then

n 1 X d n(X n n ) = n(X n ) = (Xt ) N (0, 2 ) n t=1

d n(X n n ) N (0, 2 ) n n(X n n ) (or asymptotic variance of X n ) 7

and for every > 01

kxk2 dFt (x) = 0.

No random variable dominates the sequence of the sum

Pr(|X| n) 2 max Pr(|X| n)

dened in terms of the asymptotic distribution F (x)

r r If Xn X = lim E(Xn ) = Ea (Xn ) n

OLS Asymptotics Non-stochastic Regressors

dened in terms of Fn (x)

We rst show that 1 d X 0 u N (0, 2 Q). n Notice that 1 1 X X 0u = xt ut , n n t=1

1X 2 1 = lim xt x0 = 2 lim ( X 0 X) = 2 Q, t n n n n t=1 1 d X 0 u N (0, 2 Q) n Since = + (X 0 X)1 X 0 u, we have 1 n( ) = ( X 0 X)1 n 1 0 Xu n

Cramers linear transformation theorem implies that 1 d n( ) lim ( X 0 X)1 N (0, 2 Q) n n

d n( ) N(0, 2 Q1 QQ1 ) N (0, 2 Q1 ) or

instead of F (q, n k).

OLS asymptotics Stochastic Regressors

You might also like