
E. Omey and S. Van Gulck

HUB –Stormstraat 2, 1000 - Brussels, Belgium

{edward.omey, stefan.vangulck}@hubrussel.be

March 2008

Abstract

In statistics textbooks, when studying confidence intervals for the mean $\mu$, for example, the central limit theorem is fully exploited. For large samples from an arbitrary distribution with finite second moment, we can always construct confidence intervals and test hypotheses concerning $\mu$. In the same textbooks, the treatment of the variance $\sigma^2$ and the correlation coefficient $\rho$ is usually restricted to samples from normal distributions.

In this paper we give a general and simple central limit approach to these parameters and show that it is convenient but not necessary to restrict attention to normal samples. Among others, we discuss central limit theorems for the sample variance $s^2$, the sample correlation coefficient $r$ and the ratio of sample variances $s_2^2/s_1^2$ for paired and for unpaired samples.

1 Introduction

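Let $X_1, X_2, \ldots, X_n$ denote a sample from $X \sim A(\mu, \sigma^2)$, where $A$ is an arbitrary distribution with $\mu = E(X)$ and $\sigma^2 = \operatorname{Var}(X)$. The sample mean is given by

$$\bar X = \frac{1}{n}\sum_{i=1}^{n} X_i.$$

Calculating probabilities concerning $\bar X$ is a more complicated problem. For small samples there are not many distributions for which the distribution of $\bar X$ is known. For large samples we can use the central limit theorem. The central limit theorem for $\bar X$ states that as $n \to \infty$, we have

$$\sqrt{n}\,\frac{\bar X - \mu}{\sigma} \overset{d}{\Longrightarrow} Z \sim N(0,1),$$

i.e. we have

$$P\left(\sqrt{n}\,\frac{\bar X - \mu}{\sigma} \le x\right) \to P(Z \le x).$$

For large $n$ the normal approximation works sufficiently well.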

The sample variances are given by

$$S^2 = \overline{(X-\mu)^2} = \frac{1}{n}\sum_{i=1}^{n}(X_i-\mu)^2,$$

$$s^2 = \frac{n}{n-1}\,\overline{(X-\bar X)^2} = \frac{n}{n-1}\left(\overline{X^2} - \bar X^2\right).$$

It is well known that $E(S^2) = E(s^2) = \sigma^2$. For the variance, we find that

$$\operatorname{Var}(S^2) = \frac{1}{n}\operatorname{Var}((X-\mu)^2).$$

To calculate the variance of $s^2$ is, in general, much more complicated. For a sample from the normal distribution $N(\mu, \sigma^2)$ there are no problems. In this case we have

$$\frac{nS^2}{\sigma^2} \sim \chi^2_n, \qquad \frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1},$$

and for large $n$ we have

$$S^2 \approx N\left(\sigma^2, \frac{2\sigma^4}{n}\right), \qquad s^2 \approx N\left(\sigma^2, \frac{2\sigma^4}{n-1}\right).$$

In the case of a sample from another distribution, these approximations are usually not valid. In section 2 of this paper, we provide a central limit theorem for $S^2$ and for $s^2$.

In section 3 we state and prove a multivariate central limit theorem and then apply a transfer theorem to obtain central limit theorems for the sample coefficient of variation $CV$, the sample correlation coefficient $r$ and the ratio of sample variances.

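2 Central Limit Theorem for $S^2$ and $s^2$

2.1 Central limit theorem for $S^2$

In view of the definition of $S^2$, using the ordinary central limit theorem, we immediately obtain the following result.

Theorem 1 If $E(X^4) < \infty$, then as $n \to \infty$ we have

$$P(\sqrt{n}(S^2 - \sigma^2) \le x) \to P(U \le x),$$

where $U \sim N(0, \sigma_U^2)$ with $\sigma_U^2 = \operatorname{Var}((X-\mu)^2)$.

Remark. Note that $\sigma_U^2$ is related to the kurtosis $\kappa(X)$ of $X$. Recall that the kurtosis is defined as

$$\kappa(X) = \frac{E((X-\mu)^4)}{\sigma^4} - 3 = \frac{\operatorname{Var}((X-\mu)^2)}{\sigma^4} - 2.$$

We find that $\sigma_U^2 = (\kappa(X) + 2)\sigma^4$.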

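To prove a central limit theorem for $s^2$, we rewrite $s^2$ as follows. We have

$$\begin{aligned}
(n-1)s^2 &= \sum_{i=1}^{n}\left(X_i - \mu - (\bar X - \mu)\right)^2 \\
&= \sum_{i=1}^{n}(X_i-\mu)^2 + n(\bar X-\mu)^2 - 2(\bar X-\mu)\sum_{i=1}^{n}(X_i-\mu) \\
&= nS^2 - n(\bar X-\mu)^2.
\end{aligned}$$

It follows that

$$\sqrt{n}(s^2-\sigma^2) = \frac{n}{n-1}\sqrt{n}(S^2-\sigma^2) + \frac{\sqrt{n}}{n-1}\sigma^2 - \frac{n}{n-1}\sqrt{n}(\bar X-\mu)^2. \qquad (1)$$

We prove the following result.

Theorem 2 If $E(X^4) < \infty$, then as $n \to \infty$ we have

$$P(\sqrt{n}(s^2 - \sigma^2) \le x) \to P(U \le x),$$

where $U \sim N(0, \sigma_U^2)$ with $\sigma_U^2 = \operatorname{Var}((X-\mu)^2)$.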

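Proof. Consider (1) and write $\sqrt{n}(s^2-\sigma^2) = A_n + B_n$, where

$$A_n = \frac{n}{n-1}\sqrt{n}(S^2-\sigma^2),$$

$$B_n = \frac{\sqrt{n}}{n-1}\sigma^2 - \frac{n}{n-1}\sqrt{n}(\bar X-\mu)^2.$$

Using Theorem 1, we have

$$P(A_n \le x) \to P(U \le x).$$

For $B_n$ we write

$$B_n = \frac{\sqrt{n}}{n-1}\sigma^2 - \frac{n}{n-1}\sqrt{n}(\bar X-\mu)(\bar X-\mu).$$

Using the central limit theorem we have

$$P(\sqrt{n}(\bar X-\mu)/\sigma \le x) \to P(Z \le x)$$

and the law of large numbers gives $\bar X - \mu \overset{P}{\to} 0$. It follows that $B_n \overset{P}{\to} 0$. The result now follows.

Remarks.

1) In the previous result we used the following property: if $X_n \overset{d}{\Longrightarrow} X$ and $Y_n \overset{P}{\to} 0$, then $X_n + Y_n \overset{d}{\Longrightarrow} X$.

2) In section 4.1 we provide another proof of this result.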

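3) We find confidence intervals for $\sigma^2$ in the usual way. We have

$$\sigma^2 = s^2 \pm z_{\alpha/2}\frac{\sigma_U}{\sqrt{n}}$$

and using $\sigma_U^2 = (\kappa(X)+2)\sigma^4$ we find that

$$\sigma^2 = \frac{s^2}{1 \pm z_{\alpha/2}\sqrt{(\kappa(X)+2)/n}}.$$

Examples.

1) If $X \sim N(\mu, \sigma^2)$, we have $E((X-\mu)^3) = 0$ and $E((X-\mu)^4) = 3\sigma^4$, and then it follows that $\sigma_U^2 = 2\sigma^4$. We recover the known result.

2) If $X \sim BERN(p)$, then $\mu = p$ and, using $q = 1-p$, we have $\sigma^2 = pq$ and $E((X-p)^4) = pq^4 + qp^4 = pq(1-3pq)$. Now we find that $\sigma_U^2 = pq(1-4pq)$. Note that for $p = 1/2$ we have $\sigma_U^2 = 0$.

3) If $X \sim UNIF(-a, a)$, we have $\mu = 0$, $\sigma^2 = a^2/3$ and $E(X^4) = a^4/5$, so that $\sigma_U^2 = E(X^4) - \sigma^4 = a^4/5 - a^4/9 = 4a^4/45$. We find that

$$\sqrt{n}(s^2 - a^2/3) \overset{d}{\Longrightarrow} U \sim N(0, 4a^4/45).$$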
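The confidence interval for the variance can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the kurtosis is unknown in practice, so the sketch replaces it by a plug-in estimate built from sample central moments, and the function name `variance_ci` and the default `z = 1.96` are hypothetical choices that do not come from the paper.

```python
import math

def variance_ci(sample, z=1.96):
    """Large-sample interval s^2 / (1 +/- z * sqrt((kappa + 2) / n)) for sigma^2."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n   # second central moment
    m4 = sum((x - mean) ** 4 for x in sample) / n   # fourth central moment
    s2 = n * m2 / (n - 1)                           # sample variance s^2
    kappa_hat = m4 / m2 ** 2 - 3.0                  # plug-in kurtosis (assumption)
    half = z * math.sqrt((kappa_hat + 2.0) / n)
    if half >= 1.0:
        raise ValueError("sample too small for this approximation")
    return s2 / (1.0 + half), s2 / (1.0 - half), s2

# Example: 100 evenly spaced points, which behave like a uniform sample.
data = [0.5 + 0.1 * i for i in range(100)]
lower, upper, s2 = variance_ci(data)
```

Because the interval is obtained by solving for $\sigma^2$, it is not symmetric around $s^2$; for platykurtic data such as the grid above it is narrower than the normal-theory interval.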

3 Multivariate central limit theorem

3.1 The central limit theorem

We prove the following theorem.

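Theorem 3 Let $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ denote a sample from a bivariate distribution $(X, Y) \sim A(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$. Let $\bar X = n^{-1}\sum_{i=1}^{n} X_i$ and $\bar Y = n^{-1}\sum_{i=1}^{n} Y_i$. Then we have

$$P(\sqrt{n}(\bar X - \mu_1) \le x, \sqrt{n}(\bar Y - \mu_2) \le y) \to P(U \le x, V \le y),$$

where $(U, V)$ has a bivariate normal distribution $(U, V) \sim BN(0, 0, \sigma_1^2, \sigma_2^2, \rho)$.

Proof. For arbitrary $a$ and $b$ where $(a, b) \ne (0, 0)$, we consider $aX + bY$. Clearly we have

$$E(aX + bY) = a\mu_1 + b\mu_2,$$

$$\operatorname{Var}(aX + bY) = a^2\sigma_1^2 + b^2\sigma_2^2 + 2ab\rho\sigma_1\sigma_2.$$

Using the central limit theorem, we obtain that

$$\sqrt{n}(a\bar X + b\bar Y - a\mu_1 - b\mu_2) \overset{d}{\Longrightarrow} W,$$

where

$$W \sim N(0, a^2\sigma_1^2 + b^2\sigma_2^2 + 2ab\rho\sigma_1\sigma_2).$$

Now note that

$$W \overset{d}{=} aU + bV,$$

where $(U, V)$ has a bivariate normal distribution $(U, V) \sim BN(0, 0, \sigma_1^2, \sigma_2^2, \rho)$. The result now follows from the Cramér-Wold device.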

Remark. The Cramér-Wold device states that for random vectors $(X_n, Y_n)$ we have

$$(X_n, Y_n) \overset{d}{\Longrightarrow} (U, V)$$

if and only if

$$\forall (a, b) \ne (0, 0): \quad aX_n + bY_n \overset{d}{\Longrightarrow} aU + bV.$$

This device is easy to prove by using generating functions or characteristic functions.

For random vectors with $k$ components, the following result holds with a similar proof.

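Theorem 4 Let $(X_{1,j}, \ldots, X_{k,j})$, $j = 1, 2, \ldots, n$, denote a sample from a multivariate distribution $(X_1, X_2, \ldots, X_k) \sim A$ with means $E(X_i) = \mu_i$ and variance-covariance matrix $\Sigma = (\operatorname{cov}(X_i, X_j))_{i,j=1}^{k}$. For each $i = 1, 2, \ldots, k$, let $\bar X_i = n^{-1}\sum_{j=1}^{n} X_{i,j}$. Then we have

$$P(\sqrt{n}(\bar X_1 - \mu_1) \le x_1, \sqrt{n}(\bar X_2 - \mu_2) \le x_2, \ldots, \sqrt{n}(\bar X_k - \mu_k) \le x_k) \to P(U_1 \le x_1, U_2 \le x_2, \ldots, U_k \le x_k),$$

where $(U_1, U_2, \ldots, U_k)$ has a multivariate normal distribution with $E(U_i) = 0$ and $\operatorname{Cov}(U_i, U_j) = \sigma_{i,j}$.

The following corollary will be useful.

Corollary 5 (5) Let $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ denote a sample from a bivariate distribution $(X, Y) \sim A(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$ and suppose that $E(X^4 + Y^4) < \infty$. Consider the vectors

$$\vec{A} = (\bar X, \bar Y, \overline{X^2}, \overline{Y^2}, \overline{XY}),$$

$$\vec{\mu} = (\mu_1, \mu_2, E(X^2), E(Y^2), E(XY)).$$

Then

$$P(\sqrt{n}(\vec{A} - \vec{\mu}) \le \vec{x}) \to P(\vec{V} \le \vec{x}),$$

where $\vec{V}$ has a multivariate normal distribution with means 0 and with the (symmetric) variance-covariance matrix, of which only the upper triangle is shown, given by

$$\Sigma = \begin{pmatrix}
\sigma_1^2 & \operatorname{Cov}(X,Y) & \operatorname{Cov}(X,X^2) & \operatorname{Cov}(X,Y^2) & \operatorname{Cov}(X,XY) \\
 & \sigma_2^2 & \operatorname{Cov}(Y,X^2) & \operatorname{Cov}(Y,Y^2) & \operatorname{Cov}(Y,XY) \\
 & & \operatorname{Var}(X^2) & \operatorname{Cov}(X^2,Y^2) & \operatorname{Cov}(X^2,XY) \\
 & & & \operatorname{Var}(Y^2) & \operatorname{Cov}(Y^2,XY) \\
 & & & & \operatorname{Var}(XY)
\end{pmatrix} \qquad (2)$$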

3.2 Functions

Using the notations of Theorem 3, let us consider a new random variable $f(\bar X, \bar Y)$, where the function $f(x, y)$ is sufficiently smooth. Writing the first terms of a Taylor expansion, we have

$$f(x, y) = f(a, b) + \frac{\partial f}{\partial x}(a, b)(x-a) + \frac{\partial f}{\partial y}(a, b)(y-b) + \frac{1}{2}R,$$

where the remainder term $R$ is of the form

$$R = (x-a, y-b)\begin{pmatrix} f_{x,x}(\xi, \eta) & f_{x,y}(\xi, \eta) \\ f_{x,y}(\xi, \eta) & f_{y,y}(\xi, \eta) \end{pmatrix}\begin{pmatrix} x-a \\ y-b \end{pmatrix}.$$

Here the $f_{a,b}$ denote the second partial derivatives of $f$, and $\xi$ (resp. $\eta$) is between $x$ and $a$ (resp. $y$ and $b$). If these partial derivatives are bounded around $(a, b)$, for some constant $c > 0$ we have

$$|R| \le c\left((x-a)^2 + (y-b)^2 + |(x-a)(y-b)|\right).$$

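Furthermore, if $|x-a| \le \epsilon$ and $|y-b| \le \epsilon$, we find that

$$\left| f(x, y) - f(a, b) - \frac{\partial f}{\partial x}(a, b)(x-a) - \frac{\partial f}{\partial y}(a, b)(y-b) \right| \le 3c\epsilon^2$$

and hence also that

$$-3c\epsilon^2 + \frac{\partial f}{\partial x}(a, b)(x-a) + \frac{\partial f}{\partial y}(a, b)(y-b) \le f(x, y) - f(a, b) \le 3c\epsilon^2 + \frac{\partial f}{\partial x}(a, b)(x-a) + \frac{\partial f}{\partial y}(a, b)(y-b).$$

Now replace $(x, y)$ and $(a, b)$ by $(\bar X, \bar Y)$ and $(\mu_1, \mu_2)$ and define the following quantities:

$$\vec{\theta} = (\theta_1, \theta_2) = \left(\frac{\partial f}{\partial x}(\mu_1, \mu_2), \frac{\partial f}{\partial y}(\mu_1, \mu_2)\right),$$

$$A_n = \theta_1\sqrt{n}(\bar X - \mu_1) + \theta_2\sqrt{n}(\bar Y - \mu_2),$$

$$K_n = \sqrt{n}\left(f(\bar X, \bar Y) - f(\mu_1, \mu_2)\right).$$

Note that Theorem 3 implies that $P(A_n \le x) \to P(W \le x) = P(\theta_1 U + \theta_2 V \le x)$.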

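If $|\bar X - \mu_1| \le \epsilon$ and $|\bar Y - \mu_2| \le \epsilon$, the previous analysis shows that

$$-3c\sqrt{n}\epsilon^2 + A_n \le K_n \le 3c\sqrt{n}\epsilon^2 + A_n.$$

Now consider $P(K_n \le x)$ and write $P(K_n \le x) = I + II$, where

$$I = P(K_n \le x, E),$$

$$II = P(K_n \le x, E^c),$$

where $E$ is the event $E = \{|\bar X - \mu_1| \le \epsilon \text{ and } |\bar Y - \mu_2| \le \epsilon\}$, and $E^c$ its complement.

We have $II \le P(E^c) \le P(|\bar X - \mu_1| > \epsilon) + P(|\bar Y - \mu_2| > \epsilon)$. Using the inequality of Chebyshev, we obtain that

$$II \le \frac{\sigma_1^2 + \sigma_2^2}{n\epsilon^2}.$$

If we choose $\epsilon$ such that $n\epsilon^2 \to \infty$, we obtain that $II \to 0$.

For $I$, we have

$$I \le P(-3\sqrt{n}c\epsilon^2 + A_n \le x, E) \le P(A_n \le x + 3\sqrt{n}c\epsilon^2).$$

If we choose $\epsilon$ such that $\sqrt{n}\epsilon^2 \to 0$, we find, after taking limits for $n \to \infty$, that $I$ is bounded from above by $P(W \le x)$. A good choice of $\epsilon$ is for example $\epsilon = n^{-1/3}$, for which $n\epsilon^2 = n^{1/3} \to \infty$ and $\sqrt{n}\epsilon^2 = n^{-1/6} \to 0$. On the other hand, we have

$$I \ge P(3\sqrt{n}c\epsilon^2 + A_n \le x, E) = P(3\sqrt{n}c\epsilon^2 + A_n \le x) - P(3\sqrt{n}c\epsilon^2 + A_n \le x, E^c).$$

As before, we have $P(3\sqrt{n}c\epsilon^2 + A_n \le x) \to P(W \le x)$. For the other term, we have

$$P(3\sqrt{n}c\epsilon^2 + A_n \le x, E^c) \le P(E^c) \to 0.$$

We obtain that as $n \to \infty$, $I$ is bounded from below by $P(W \le x)$. We conclude that

$$P(K_n \le x) \to P(W \le x).$$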

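Clearly we have $E(W) = 0$ and for the variance we find that

$$\sigma_W^2 = \operatorname{Var}(W) = (\theta_1, \theta_2)\,\Sigma\begin{pmatrix}\theta_1 \\ \theta_2\end{pmatrix} = \vec{\theta}\,\Sigma\,\vec{\theta}^T,$$

where

$$\Sigma = \begin{pmatrix} \operatorname{Var}(X) & \operatorname{Cov}(X, Y) \\ \operatorname{Cov}(X, Y) & \operatorname{Var}(Y) \end{pmatrix}.$$

This approach can also be used for random vectors with 3 or more components. The general result is the following.

Theorem 6 With the notation of Theorem 4, let $\vec{A} = (\bar X_1, \ldots, \bar X_k)$ and $\vec{\mu} = (\mu_1, \ldots, \mu_k)$, and suppose that $f$ is sufficiently smooth around $\vec{\mu}$. Then we have

$$P(\sqrt{n}(f(\vec{A}) - f(\vec{\mu})) \le x) \to P(W \le x),$$

where $W \overset{d}{=} \sum_{i=1}^{k}\theta_i U_i \sim N(0, \sigma_W^2)$ with $\theta_i = (\partial f/\partial x_i)(\vec{\mu})$ and $\sigma_W^2 = \vec{\theta}\,\Sigma\,\vec{\theta}^T$.

The result extends to vectors of functions. If $(f_1(\vec{x}), f_2(\vec{x}), \ldots, f_m(\vec{x}))$ is such a vector, it suffices to consider linear combinations of the form

$$h(\vec{x}) = u_1 f_1(\vec{x}) + u_2 f_2(\vec{x}) + \cdots + u_m f_m(\vec{x}),$$

where $(u_1, u_2, \ldots, u_m) \ne (0, 0, \ldots, 0)$. Now Theorem 6 and the Cramér-Wold device can be used.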

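4.1 The sample variance $s^2$

Here is another proof of Theorem 2. Consider the vectors $\vec{A} = (\bar X, \overline{X^2})$, $\vec{\mu} = (\mu, E(X^2))$ and the function $f(x, y) = y - x^2$. In this case we find $f(\vec{A}) = (n-1)s^2/n$ and $f(\vec{\mu}) = \sigma^2$. Using $(\theta_1, \theta_2) = (-2\mu, 1)$ it follows from Theorem 6 that

$$P\left(\sqrt{n}\left(\frac{n-1}{n}s^2 - \sigma^2\right) \le x\right) \to P(W \le x),$$

where $W \sim N(0, \sigma_W^2)$ with

$$\begin{aligned}
\sigma_W^2 &= (-2\mu, 1)\begin{pmatrix} \operatorname{Var}(X) & \operatorname{Cov}(X, X^2) \\ \operatorname{Cov}(X, X^2) & \operatorname{Var}(X^2) \end{pmatrix}\begin{pmatrix} -2\mu \\ 1 \end{pmatrix} \\
&= 4\mu^2\operatorname{Var}(X) - 4\mu\operatorname{Cov}(X, X^2) + \operatorname{Var}(X^2) \\
&= \operatorname{Var}(X^2 - 2\mu X) \\
&= \operatorname{Var}((X-\mu)^2).
\end{aligned}$$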
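The quadratic form above can be checked numerically on a distribution whose moments are known in closed form. The sketch below uses the Exponential(1) distribution, whose raw moments are $E(X^k) = k!$; this distribution and the helper name `delta_variance` are illustrative assumptions, not part of the paper.

```python
def delta_variance(theta, sigma):
    """Return theta * Sigma * theta^T for a gradient vector and a covariance matrix."""
    k = len(theta)
    return sum(theta[i] * sigma[i][j] * theta[j] for i in range(k) for j in range(k))

# Raw moments E(X^k) = k! of the Exponential(1) distribution (assumed example).
m1, m2, m3, m4 = 1.0, 2.0, 6.0, 24.0
sigma = [
    [m2 - m1 ** 2, m3 - m1 * m2],   # Var(X),      Cov(X, X^2)
    [m3 - m1 * m2, m4 - m2 ** 2],   # Cov(X, X^2), Var(X^2)
]
theta = (-2.0 * m1, 1.0)            # gradient of f(x, y) = y - x^2 at (mu, E(X^2))
sigma_w2 = delta_variance(theta, sigma)

# Direct route: Var((X - mu)^2) = E((X - mu)^4) - sigma^4 = 9 - 1 for Exponential(1).
direct = 9.0 - 1.0
```

Both routes give the same asymptotic variance, which is the content of the last two lines of the derivation.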

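In probability theory and statistics, the coefficient of variation ($CV$) is a normalized measure of dispersion of a probability distribution. It is defined as the ratio of the standard deviation to the mean: $CV = \sigma/\mu$. This is only defined for non-zero mean $\mu$, and is most useful for variables that are always positive. The sample coefficient of variation is given by

$$SCV = \frac{s}{\bar X}.$$

If $\mu \ne 0$, we have $\bar X \overset{a.s.}{\to} \mu \ne 0$ and $SCV$ is well-defined a.s. Now we consider the vectors $\vec{A} = (\bar X, \overline{X^2})$, $\vec{\mu} = (\mu, E(X^2))$ and the function

$$f(x, y) = \frac{\sqrt{y - x^2}}{x}.$$

It is easy to see that $f(\vec{\mu}) = CV$ and that

$$f(\vec{A}) = \sqrt{\frac{n-1}{n}}\,SCV.$$

Straightforward calculations show that

$$(\theta_1, \theta_2) = \left(-\frac{E(X^2)}{\mu^2\sigma}, \frac{1}{2\mu\sigma}\right).$$

Using Theorem 6, we find that

$$P\left(\sqrt{n}\left(\sqrt{\frac{n-1}{n}}\,SCV - CV\right) \le x\right) \to P(W \le x),$$

where $W \sim N(0, \sigma_W^2)$ with

$$\begin{aligned}
\sigma_W^2 &= \vec{\theta}\begin{pmatrix} \operatorname{Var}(X) & \operatorname{Cov}(X, X^2) \\ \operatorname{Cov}(X, X^2) & \operatorname{Var}(X^2) \end{pmatrix}\vec{\theta}^T \\
&= \frac{(E(X^2))^2}{\mu^4} - \frac{E(X^2)}{\mu^3\sigma^2}\operatorname{Cov}(X, X^2) + \frac{1}{4\mu^2\sigma^2}\operatorname{Var}(X^2).
\end{aligned}$$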

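To simplify, note that

$$E((X-\mu)^3) = \operatorname{Cov}(X, X^2) - 2\mu\sigma^2,$$

$$\operatorname{Var}((X-\mu)^2) = \operatorname{Var}(X^2) + 4\mu^2\sigma^2 - 4\mu\operatorname{Cov}(X, X^2).$$

Now we find

$$\begin{aligned}
\sigma_W^2 &= \frac{(E(X^2))^2}{\mu^4} - \left(\frac{E(X^2)}{\mu^3\sigma^2} - \frac{4\mu}{4\mu^2\sigma^2}\right)\left(E((X-\mu)^3) + 2\mu\sigma^2\right) + \frac{1}{4\mu^2\sigma^2}\left(\operatorname{Var}((X-\mu)^2) - 4\mu^2\sigma^2\right) \\
&= \frac{(\sigma^2+\mu^2)^2}{\mu^4} - \frac{1}{\mu^3}E((X-\mu)^3) - \frac{2\sigma^2}{\mu^2} + \frac{1}{4\mu^2\sigma^2}\operatorname{Var}((X-\mu)^2) - 1 \\
&= \frac{\sigma^4}{\mu^4} - \frac{1}{\mu^3}E((X-\mu)^3) + \frac{1}{4\mu^2\sigma^2}\operatorname{Var}((X-\mu)^2).
\end{aligned}$$

In terms of the kurtosis $\kappa(X)$ and the skewness $\gamma_1(X) = E((X-\mu)^3)/\sigma^3$, we find that

$$\sigma_W^2 = \frac{\sigma^4}{\mu^4} - \frac{\sigma^3}{\mu^3}\gamma_1(X) + \frac{\sigma^2}{4\mu^2}\left(\kappa(X) + 2\right).$$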

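Remarks.

1) In the case of a normal distribution, we find that

$$\sigma_W^2 = \frac{\sigma^4}{\mu^4} + \frac{\sigma^2}{2\mu^2} = CV^4 + \frac{1}{2}CV^2.$$

2) In the case of an exponential distribution with parameter $\lambda$, we have

$$\mu = \sigma = 1/\lambda, \quad \gamma_1 = 2, \quad \kappa = 6,$$

and then $CV = \sigma_W^2 = 1$.

3) For the Poisson($\lambda$)-distribution, we have

$$\mu = \sigma^2 = \lambda, \quad \gamma_1 = \lambda^{-1/2}, \quad \kappa = \frac{1}{\lambda},$$

and then $CV = \lambda^{-1/2}$ and

$$\sigma_W^2 = \frac{1}{2\lambda} + \frac{1}{4\lambda^2}.$$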
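The three remarks can be reproduced from the general expression $\sigma_W^2 = CV^4 - CV^3\gamma_1 + CV^2(\kappa+2)/4$, which is the formula for the sample coefficient of variation rewritten with $CV = \sigma/\mu$. The following sketch evaluates it; the function name and the parameter values ($CV = 0.5$ for the normal case, $\lambda = 4$ for the Poisson case) are arbitrary illustrative choices.

```python
def scv_asymptotic_variance(cv, skewness, kurtosis):
    """sigma_W^2 = CV^4 - CV^3 * gamma_1 + CV^2 * (kappa + 2) / 4."""
    return cv ** 4 - cv ** 3 * skewness + cv ** 2 * (kurtosis + 2.0) / 4.0

# Remark 1: normal distribution, gamma_1 = kappa = 0, here with CV = 0.5.
normal_case = scv_asymptotic_variance(0.5, 0.0, 0.0)

# Remark 2: exponential distribution, CV = 1, gamma_1 = 2, kappa = 6.
exponential_case = scv_asymptotic_variance(1.0, 2.0, 6.0)

# Remark 3: Poisson(lambda), CV = gamma_1 = lambda^(-1/2), kappa = 1 / lambda.
lam = 4.0
poisson_case = scv_asymptotic_variance(lam ** -0.5, lam ** -0.5, 1.0 / lam)
```

Here `kurtosis` means the excess kurtosis $\kappa(X)$ as defined in section 2.1, so that the normal distribution has $\kappa = 0$.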

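If $\mu = 0$, then $CV$ is not defined but we can always calculate

$$\frac{1}{SCV} = \frac{\bar X}{s}.$$

If $\sigma^2 < \infty$, the central limit theorem together with $s^2 \overset{P}{\to} \sigma^2$ shows that we have

$$\frac{\sqrt{n}}{SCV} = \sqrt{n}\,\frac{\bar X}{s} \overset{d}{\Longrightarrow} Z,$$

where $Z \sim N(0, 1)$. Now note that for $x > 0$, we have

$$P\left(\frac{\sqrt{n}}{SCV} > x\right) = P\left(0 < \frac{SCV}{\sqrt{n}} < \frac{1}{x}\right),$$

$$P\left(\frac{\sqrt{n}}{SCV} < -x\right) = P\left(-\frac{1}{x} < \frac{SCV}{\sqrt{n}} < 0\right).$$

As a consequence, we have

$$\frac{SCV}{\sqrt{n}} \overset{d}{\Longrightarrow} U = \frac{1}{Z}.$$

The density of $U$ is given by

$$f_U(u) = \frac{1}{u^2}f_Z\left(\frac{1}{u}\right) = \frac{1}{u^2\sqrt{2\pi}}\exp\left(-\frac{1}{2u^2}\right).$$

From this it follows that $U$ is symmetric around 0 and that $\sigma_U^2 = \infty$. Since $f_U(u)$ decays like $u^{-2}$, we also have $E|U| = \infty$, so that $E(U) = 0$ holds only in the principal-value sense.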

4.4 A t-statistic

In the place of $SCV$ we can study $T = 1/SCV = \bar X/s$. This is a quantity related to the t-statistic $t = (\bar X - \mu)/s$. As in section 4.2, we obtain that

$$\sqrt{n}\left(T - \frac{\mu}{\sigma}\right) \overset{d}{\Longrightarrow} W,$$

where $W \sim N(0, \sigma_U^2)$ and where

$$\sigma_U^2 = \frac{\mu^4}{\sigma^4}\sigma_W^2 = 1 - \frac{\mu}{\sigma}\gamma_1(X) + \frac{\mu^2}{4\sigma^2}\left(\kappa(X) + 2\right).$$

Note that for the t-statistic, we have the simpler result that

$$\sqrt{n}\,\frac{\bar X - \mu}{s} \overset{d}{\Longrightarrow} Z \sim N(0, 1).$$

Another related statistic is related to the dispersion $D = \sigma^2/\mu$. This measure is well defined for $\mu \ne 0$ and can be used for example to compare distributions with different means. The corresponding sample dispersion is given by

$$SD = \frac{s^2}{\bar X}.$$

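To study $SD$, we consider $\vec{A} = (\bar X, \overline{X^2})$, $\vec{\mu} = (\mu, E(X^2))$ and the function $f(x, y) = (y - x^2)/x$. Clearly we have

$$f(\vec{A}) = \frac{n-1}{n}SD,$$

$$f(\vec{\mu}) = D,$$

$$\vec{\theta} = \left(-\frac{\sigma^2}{\mu^2} - 2, \frac{1}{\mu}\right).$$

It follows that

$$\sqrt{n}(SD - D) \overset{d}{\Longrightarrow} W,$$

where $W \sim N(0, \sigma_W^2)$ with

$$\begin{aligned}
\sigma_W^2 &= \operatorname{Var}(\theta_1 X + \theta_2 X^2) \\
&= \sigma^2\left(\frac{\sigma^2}{\mu^2}(\kappa(X) + 2) + \frac{\sigma^4}{\mu^4} - 2\frac{\sigma^3}{\mu^3}\gamma_1(X)\right).
\end{aligned}$$

For a normal distribution ($\gamma_1 = \kappa = 0$) this reduces to

$$\sigma_W^2 = \frac{\sigma^4}{\mu^2}\left(2 + \frac{\sigma^2}{\mu^2}\right).$$

If $\mu = 0$, we have

$$\sqrt{n}\,\frac{1}{SD} = \sqrt{n}\,\frac{\bar X}{s^2} \overset{d}{\Longrightarrow} \sigma^{-1}Z,$$

where $Z \sim N(0, 1)$, and then it follows as in section 4.3 that

$$\frac{1}{\sqrt{n}}SD \overset{d}{\Longrightarrow} \frac{\sigma}{Z}.$$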
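As a quick plausibility check, the asymptotic variance of the sample dispersion can be evaluated for distributions with known skewness and kurtosis. The sketch below is illustrative only; the function name and the parameter values are assumptions. For a Poisson($\lambda$) distribution, where $\mu = \sigma^2 = \lambda$, $\gamma_1 = \lambda^{-1/2}$ and $\kappa = 1/\lambda$, the expression simplifies to the constant 2, whatever the value of $\lambda$.

```python
def dispersion_asymptotic_variance(mu, sigma2, skewness, kurtosis):
    """sigma^2 * [(sigma^2/mu^2)(kappa+2) + sigma^4/mu^4 - 2 (sigma^3/mu^3) gamma_1]."""
    ratio = sigma2 ** 0.5 / mu                      # sigma / mu
    return sigma2 * (ratio ** 2 * (kurtosis + 2.0)
                     + ratio ** 4
                     - 2.0 * ratio ** 3 * skewness)

# Poisson(lambda): mu = sigma^2 = lambda, gamma_1 = lambda^(-1/2), kappa = 1/lambda.
lam = 4.0
poisson_case = dispersion_asymptotic_variance(lam, lam, lam ** -0.5, 1.0 / lam)

# Normal distribution with mu = 2, sigma^2 = 1: (sigma^4/mu^2) * (2 + sigma^2/mu^2).
normal_case = dispersion_asymptotic_variance(2.0, 1.0, 0.0, 0.0)
```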

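5.1 The sample covariance

Consider the vector $\vec{A} = (\bar X, \bar Y, \overline{XY})$, $\vec{\mu} = (\mu_1, \mu_2, E(XY))$ and let $f(x, y, z) = z - xy$. In this case we find

$$f(\vec{A}) = \overline{XY} - \bar X\,\bar Y,$$

$$f(\vec{\mu}) = \operatorname{Cov}(X, Y),$$

and

$$\vec{\theta} = (-\mu_2, -\mu_1, 1).$$

It follows that

$$P(\sqrt{n}(f(\vec{A}) - \operatorname{Cov}(X, Y)) \le x) \to P(W \le x),$$

where $W \sim N(0, \sigma_W^2)$ and

$$\sigma_W^2 = \vec{\theta}\begin{pmatrix} \operatorname{Var}(X) & \operatorname{Cov}(X, Y) & \operatorname{Cov}(X, XY) \\ \operatorname{Cov}(X, Y) & \operatorname{Var}(Y) & \operatorname{Cov}(Y, XY) \\ \operatorname{Cov}(X, XY) & \operatorname{Cov}(Y, XY) & \operatorname{Var}(XY) \end{pmatrix}\vec{\theta}^t.$$

Assuming first for simplicity that $\mu_1 = \mu_2 = 0$, we find $\vec{\theta} = (0, 0, 1)$ and $\sigma_W^2 = \operatorname{Var}(XY)$. In the general case we find that

$$\sigma_W^2 = \operatorname{Var}((X - \mu_1)(Y - \mu_2)).$$

If $X$ and $Y$ are independent, this gives

$$\sigma_W^2 = E((X - \mu_1)^2(Y - \mu_2)^2) = \sigma_1^2\sigma_2^2.$$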

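For a sample $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ from $(X, Y) \sim A(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$, the sample correlation coefficient is defined as

$$r = \frac{n}{n-1}\,\frac{\overline{XY} - \bar X\,\bar Y}{s_1 s_2}. \qquad (3)$$

For $n \to \infty$, a rough estimate gives

$$r \approx \frac{E(XY) - E(X)E(Y)}{\sigma_1\sigma_2} = \rho.$$

To make this precise, we consider the vectors

$$\vec{A} = (\bar X, \bar Y, \overline{X^2}, \overline{Y^2}, \overline{XY}),$$

$$\vec{\mu} = (\mu_1, \mu_2, E(X^2), E(Y^2), E(XY)),$$

and the function

$$f(a, b, c, d, e) = \frac{e - ab}{\sqrt{(c - a^2)(d - b^2)}}.$$

Now we find $f(\vec{\mu}) = \rho$, $f(\vec{A}) = r$, and the derivatives:

$$\frac{\partial f}{\partial a} = -\frac{b}{\sqrt{(c-a^2)(d-b^2)}} + \frac{(e-ab)a}{(c-a^2)\sqrt{(c-a^2)(d-b^2)}},$$

$$\frac{\partial f}{\partial b} = -\frac{a}{\sqrt{(c-a^2)(d-b^2)}} + \frac{(e-ab)b}{(d-b^2)\sqrt{(c-a^2)(d-b^2)}},$$

$$\frac{\partial f}{\partial c} = -\frac{1}{2}\,\frac{e-ab}{(c-a^2)\sqrt{(c-a^2)(d-b^2)}},$$

$$\frac{\partial f}{\partial d} = -\frac{1}{2}\,\frac{e-ab}{(d-b^2)\sqrt{(c-a^2)(d-b^2)}},$$

$$\frac{\partial f}{\partial e} = \frac{1}{\sqrt{(c-a^2)(d-b^2)}}.$$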

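It follows that

$$P(\sqrt{n}(r - \rho) \le x) \to P(W \le x), \qquad (4)$$

where $W \sim N(0, \sigma_W^2)$ and $\sigma_W^2 = \vec{\theta}\,\Sigma\,\vec{\theta}^t$ with $\Sigma$ given in (2). In this case we have

$$\theta_1 = -\frac{\mu_2}{\sigma_1\sigma_2} + \frac{\rho\mu_1}{\sigma_1^2}, \qquad \theta_2 = -\frac{\mu_1}{\sigma_1\sigma_2} + \frac{\rho\mu_2}{\sigma_2^2},$$

$$\theta_3 = -\frac{\rho}{2\sigma_1^2}, \qquad \theta_4 = -\frac{\rho}{2\sigma_2^2}, \qquad \theta_5 = \frac{1}{\sigma_1\sigma_2}.$$

In the special case where $\mu_1 = \mu_2 = 0$ and $\sigma_1 = \sigma_2 = 1$, we find $\vec{\theta} = (0, 0, -\rho/2, -\rho/2, 1)$ and then we have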

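$$(\vec{\theta}\,\Sigma)_1 = -\frac{\rho}{2}\operatorname{Cov}(X, X^2) - \frac{\rho}{2}\operatorname{Cov}(X, Y^2) + \operatorname{Cov}(X, XY),$$

$$(\vec{\theta}\,\Sigma)_2 = -\frac{\rho}{2}\operatorname{Cov}(Y, X^2) - \frac{\rho}{2}\operatorname{Cov}(Y, Y^2) + \operatorname{Cov}(Y, XY),$$

$$(\vec{\theta}\,\Sigma)_3 = -\frac{\rho}{2}\operatorname{Var}(X^2) - \frac{\rho}{2}\operatorname{Cov}(X^2, Y^2) + \operatorname{Cov}(X^2, XY),$$

$$(\vec{\theta}\,\Sigma)_4 = -\frac{\rho}{2}\operatorname{Cov}(X^2, Y^2) - \frac{\rho}{2}\operatorname{Var}(Y^2) + \operatorname{Cov}(Y^2, XY),$$

$$(\vec{\theta}\,\Sigma)_5 = -\frac{\rho}{2}\operatorname{Cov}(X^2, XY) - \frac{\rho}{2}\operatorname{Cov}(Y^2, XY) + \operatorname{Var}(XY),$$

and then (recall that $\mu_1 = \mu_2 = 0$ and $\sigma_1 = \sigma_2 = 1$) we have:

$$\begin{aligned}
\sigma_W^2 = \vec{\theta}\,\Sigma\,\vec{\theta}^t
&= \frac{\rho^2}{4}\operatorname{Var}(X^2) + \frac{\rho^2}{4}\operatorname{Cov}(X^2, Y^2) - \frac{\rho}{2}\operatorname{Cov}(X^2, XY) \\
&\quad + \frac{\rho^2}{4}\operatorname{Cov}(X^2, Y^2) + \frac{\rho^2}{4}\operatorname{Var}(Y^2) - \frac{\rho}{2}\operatorname{Cov}(Y^2, XY) \\
&\quad - \frac{\rho}{2}\operatorname{Cov}(X^2, XY) - \frac{\rho}{2}\operatorname{Cov}(Y^2, XY) + \operatorname{Var}(XY) \\
&= \frac{\rho^2}{4}\left(\operatorname{Var}(X^2) + 2\operatorname{Cov}(X^2, Y^2) + \operatorname{Var}(Y^2)\right) \\
&\quad - \rho\left(\operatorname{Cov}(X^2, XY) + \operatorname{Cov}(Y^2, XY)\right) + \operatorname{Var}(XY) \\
&= \frac{\rho^2}{4}\left(E(X^4) - 1 + 2E(X^2Y^2) - 2 + E(Y^4) - 1\right) \\
&\quad - \rho\left(E(X^3Y) - \rho + E(XY^3) - \rho\right) + E(X^2Y^2) - \rho^2 \\
&= \frac{\rho^2}{4}\left(E(X^4) + 2E(X^2Y^2) + E(Y^4)\right) - \rho\left(E(X^3Y) + E(XY^3)\right) + E(X^2Y^2).
\end{aligned}$$

In the general case, we find that

$$\sigma_W^2 = \frac{\rho^2}{4}\left(E(X^{*4}) + 2E(X^{*2}Y^{*2}) + E(Y^{*4})\right) - \rho\left(E(X^{*3}Y^{*}) + E(X^{*}Y^{*3})\right) + E(X^{*2}Y^{*2}), \qquad (5)$$

where

$$X^{*} = \frac{X - E(X)}{\sigma_1} \quad \text{and} \quad Y^{*} = \frac{Y - E(Y)}{\sigma_2}.$$

The final result is that (4) holds with $\sigma_W^2$ given in (5).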
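In practice the moments in (5) are unknown. One possible way to use (4) and (5), shown here as a hedged sketch rather than a procedure prescribed by the paper, is to standardize the data with the sample means and standard deviations, replace each expectation in (5) by the corresponding sample average, and form the interval $r \pm z\,\hat\sigma_W/\sqrt{n}$. The function name `correlation_clt`, the plug-in estimation and the default `z = 1.96` are assumptions.

```python
import math

def correlation_clt(xs, ys, z=1.96):
    """Return (r, plug-in estimate of sigma_W^2 from (5), confidence interval for rho)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    u = [(x - mx) / sx for x in xs]              # standardized X*
    v = [(y - my) / sy for y in ys]              # standardized Y*

    def mean(values):
        return sum(values) / n

    r = mean([a * b for a, b in zip(u, v)])      # equals r in formula (3)
    e40 = mean([a ** 4 for a in u])
    e04 = mean([b ** 4 for b in v])
    e22 = mean([a * a * b * b for a, b in zip(u, v)])
    e31 = mean([a ** 3 * b for a, b in zip(u, v)])
    e13 = mean([a * b ** 3 for a, b in zip(u, v)])
    var_w = r * r / 4.0 * (e40 + 2.0 * e22 + e04) - r * (e31 + e13) + e22
    var_w = max(var_w, 0.0)                      # guard against rounding below zero
    half = z * math.sqrt(var_w / n)
    return r, var_w, (max(-1.0, r - half), min(1.0, r + half))

# Uncorrelated toy data: r = 0 and the estimate reduces to E(X*^2 Y*^2) = 1.
r0, var0, ci0 = correlation_clt([-1, -1, 1, 1], [-1, 1, -1, 1])
# Perfectly linear data: r = 1 and the estimated variance vanishes.
r1, var1, ci1 = correlation_clt([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
```

The two toy data sets only exercise the limiting cases $\rho = 0$ and $\rho = 1$; for real data the interval is meaningful only when $n$ is large.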

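Remarks.

1) We can rewrite $\sigma_W^2$ more compactly as follows. Assuming standardized variables, we have

$$\begin{aligned}
\sigma_W^2 &= \frac{\rho^2}{4}\left(\operatorname{Var}(X^2) + 2\operatorname{Cov}(X^2, Y^2) + \operatorname{Var}(Y^2)\right) - \rho\left(\operatorname{Cov}(X^2, XY) + \operatorname{Cov}(Y^2, XY)\right) + \operatorname{Var}(XY) \\
&= \frac{\rho^2}{4}\operatorname{Var}(X^2 + Y^2) - \rho\operatorname{Cov}(X^2 + Y^2, XY) + \operatorname{Var}(XY) \\
&= \operatorname{Var}\left(\frac{\rho}{2}(X^2 + Y^2) - XY\right).
\end{aligned}$$

2) Note that the asymptotic variance $\sigma_W^2$ only depends on $\rho$ and fourth-order central moments of the underlying distribution.

3) If $\rho = 0$, we find that $\sigma_W^2 = E(X^{*2}Y^{*2})$.

4) If $X$ and $Y$ are independent, we have $\rho = 0$ and $\sigma_W^2 = E(X^{*2}Y^{*2}) = E(X^{*2})E(Y^{*2}) = 1$.

5) If $Y = a + bX$, $b > 0$, we find $\rho = 1$, $Y^{*} = X^{*}$ and $\sigma_W^2 = 0$.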

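5.3 Application

To model dependence, one often uses a model of the following form. Starting from arbitrary independent random variables $A$ and $B$ we construct the vector $(X, Y) = (A, B + \beta A)$. Given a sample $(X_i, Y_i)$ we want to test e.g. the hypothesis $H_0: \beta = 0$ versus $H_a: \beta \ne 0$.

It is clear that

$$\operatorname{Var}(X) = \sigma_X^2 = \sigma_A^2,$$

$$\operatorname{Var}(Y) = \sigma_Y^2 = \sigma_B^2 + \beta^2\sigma_A^2,$$

$$\operatorname{Cov}(X, Y) = \beta\sigma_A^2,$$

$$\rho = \rho(X, Y) = \frac{\beta\sigma_A^2}{\sqrt{\sigma_A^2(\sigma_B^2 + \beta^2\sigma_A^2)}}.$$

Under $H_0$ the variables $X$ and $Y$ are independent, so that $\rho = 0$ and $\sigma_W^2 = 1$, and we find that

$$\sqrt{n}\,r \overset{d}{\Longrightarrow} Z \sim N(0, 1).$$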

5.4 The bivariate normal case

For a standard bivariate normal distribution $(X, Y) \sim BN(0, 0, 1, 1, \rho)$, we show how to calculate $\sigma_W^2$, cf. (5).

First note that $(U, V) = (X - \rho Y, Y)$ also has a bivariate normal distribution with

$$\operatorname{Cov}(U, V) = \operatorname{Cov}(X, Y) - \rho\operatorname{Cov}(Y, Y) = 0.$$

It follows that $U$ and $V$ are independent with $V \sim N(0, 1)$ and $U \sim N(0, 1 - \rho^2)$.

For general $W \sim N(0, \sigma^2)$, the moment generating function is $M_W(t) = \exp\left(\frac{1}{2}\sigma^2 t^2\right)$ and then $E(W) = E(W^3) = 0$ and $E(W^2) = \sigma^2$, $E(W^4) = 3\sigma^4$.

Now observe that $Y = V$ and $X = U + \rho V$. We find

$$E(Y^4) = E(X^4) = 3;$$

$$E(YX^3) = E(Y^3X) = E(V^3U + \rho V^4) = 3\rho;$$

$$E(Y^2X^2) = E(V^2(U^2 + 2\rho UV + \rho^2V^2)) = 1 + 2\rho^2.$$

It follows that

$$\begin{aligned}
\sigma_W^2 &= \frac{\rho^2}{4}(3 + 2 + 4\rho^2 + 3) - \rho(3\rho + 3\rho) + 1 + 2\rho^2 \\
&= \rho^4 - 2\rho^2 + 1 = (1 - \rho^2)^2.
\end{aligned}$$

In general, for $(X, Y) \sim BN(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$, we also find that $\sigma_W^2 = (1 - \rho^2)^2$, and then

$$r \approx N\left(\rho, \frac{(1 - \rho^2)^2}{n}\right).$$

The approach of the previous section can now be used to construct confidence intervals for $\rho$ and to test hypotheses concerning $\rho$.
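The algebra leading to $(1-\rho^2)^2$ is easy to verify numerically: substituting the standard bivariate normal moments $E(X^4) = E(Y^4) = 3$, $E(X^3Y) = E(XY^3) = 3\rho$ and $E(X^2Y^2) = 1 + 2\rho^2$ into (5) should reproduce $(1-\rho^2)^2$ for every $\rho$. The short check below does exactly that; the function name and the grid of $\rho$ values are illustrative choices.

```python
def sigma_w2_bivariate_normal(rho):
    """Evaluate formula (5) with the standard bivariate normal moments."""
    e40 = e04 = 3.0                  # E(X^4) = E(Y^4)
    e31 = e13 = 3.0 * rho            # E(X^3 Y) = E(X Y^3)
    e22 = 1.0 + 2.0 * rho ** 2       # E(X^2 Y^2)
    return rho ** 2 / 4.0 * (e40 + 2.0 * e22 + e04) - rho * (e31 + e13) + e22

# Largest deviation from (1 - rho^2)^2 over a few values of rho.
grid = (-0.9, -0.3, 0.0, 0.5, 0.8)
max_gap = max(abs(sigma_w2_bivariate_normal(r) - (1.0 - r ** 2) ** 2) for r in grid)
```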

In the bivariate normal case it is often necessary to test $H_0: \rho = 0$ versus $H_a: \rho \ne 0$. In the bivariate normal case, usually one uses the t-transformation:

$$t(x) = \frac{x}{\sqrt{1 - x^2}}.$$

Observe that we have

$$t'(x) = \frac{1}{(1 - x^2)\sqrt{1 - x^2}}.$$

Under $H_0$ we have $t(\rho) = 0$ and $t'(\rho) = 1$ and then the t-transformation shows that

$$\sqrt{n}\,t(r) \overset{d}{\Longrightarrow} Z \sim N(0, 1).$$

Remark. Note that

$$t(r) - r = \frac{r^3}{\sqrt{1 - r^2}\left(1 + \sqrt{1 - r^2}\right)}.$$

Under $H_0$ it follows that

$$n^{3/2}(t(r) - r) \overset{d}{\Longrightarrow} \frac{1}{2}Z^3.$$

For large samples it is not very useful to use the t-transformation.

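To test $H_0: \rho = \rho_0$ versus $H_a: \rho \ne \rho_0$ in the bivariate normal case, usually one uses the Fisher F-transformation:

$$F(x) = \frac{1}{2}\ln\left(\frac{1 + x}{1 - x}\right).$$

In this case we have

$$F'(x) = \frac{1}{1 - x^2}.$$

The F-transformation leads to the popular result that

$$\sqrt{n}(F(r) - F(\rho)) \approx F'(\rho)\sqrt{n}(r - \rho),$$

so that

$$\sqrt{n}(F(r) - F(\rho)) \overset{d}{\Longrightarrow} Z \sim N(0, 1).$$

This approach can also be used in the case where $\rho_0 = 0$.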
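Since $F$ is the inverse hyperbolic tangent, the limit result translates directly into a confidence interval for $\rho$: build the interval $F(r) \pm z/\sqrt{n}$ and map it back with $\tanh$. The sketch below follows the $1/\sqrt{n}$ scaling of the result above; the function name `fisher_interval`, the default `z = 1.96` and the example values $r = 0.6$, $n = 100$ are assumptions.

```python
import math

def fisher_interval(r, n, z=1.96):
    """Interval for rho from sqrt(n) * (F(r) - F(rho)) ~ N(0, 1), with F = atanh."""
    centre = math.atanh(r)            # F(r) = 0.5 * ln((1 + r) / (1 - r))
    half = z / math.sqrt(n)
    return math.tanh(centre - half), math.tanh(centre + half)

low, high = fisher_interval(0.6, 100)
```

Unlike the symmetric interval $r \pm z(1-r^2)/\sqrt{n}$, this interval always stays inside $(-1, 1)$.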

To see whether or not two ordinal variables are associated, one can use Spearman's rank correlation coefficient $r_S$. In this case we start from the sample of ordinal variables $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ and we assign a rank going from 1 to $n$. The smallest $X$ value gets label 1, the next smallest $X$ value gets label 2, ..., the largest of the $X$ values is labelled with rank $n$. In a similar way we label the $Y$ values. In the case of ties, we assign each variable the average of the rankings, cf. the example below.

Starting from $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$, we thus obtain a sequence of ranks $(R_1, R_1^{*}), (R_2, R_2^{*}), \ldots, (R_n, R_n^{*})$. The rank correlation $r_S$ is given by the ordinary correlation coefficient between the two rankings. We use the notation $r_S = r_S(X, Y) = r(R, R^{*})$.

As before, we calculate $r_S$ by using the general formula (3). Formula (3) can be rewritten as

$$r_S = \frac{\sum R_i R_i^{*} - n\bar R\,\bar R^{*}}{\sqrt{\left(\sum R_i^2 - n\bar R^2\right)\left(\sum R_i^{*2} - n\bar R^{*2}\right)}}.$$

Now note that (with or without ties):

$$\sum R_i = \sum R_i^{*} = 1 + 2 + \cdots + n = \frac{n(n+1)}{2}.$$

If there are no ties, we also have:

$$\sum R_i^2 = \sum R_i^{*2} = 1 + 2^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6},$$

$$\sum R_i^2 - n\bar R^2 = \frac{n(n+1)(2n+1)}{6} - n\frac{(n+1)^2}{4} = \frac{n(n^2-1)}{12},$$

$$\frac{1}{2}\sum (R_i - R_i^{*})^2 = \frac{n(n+1)(2n+1)}{6} - \sum R_i R_i^{*}.$$

In the case of no ties, after simplifying, we find that:

$$r_S = 1 - \frac{6\sum_{i=1}^{n}(R_i - R_i^{*})^2}{n(n^2-1)}. \qquad (6)$$

For independent variables, we can use the result of section 5.2 to conclude that

$$\sqrt{n}\,r_S \overset{d}{\Longrightarrow} Z \sim N(0, 1).$$

In the case of ties, we assign to each of the tied observations the average of the rankings. Formula (6) to calculate $r_S$ should then be modified. Consider the following example:

X     Y     R     R*     R - R*    (R - R*)^2
3     10    1     1       0        0
6     15    2     2       0        0
9     30    3     4.5    -1.5      2.25
12    35    4     6      -2        4
15    25    5     3       2        4
18    30    6     4.5     1.5      2.25
21    50    7     8      -1        1
24    45    8     7       1        1

In the case of no ties we had $\sum R_i^2 = \sum R_i^{*2} = 204$. In our example, we have $\sum R_i^2 = 204$, $\sum R_i^{*2} = 203.5$. If there is 1 tie involving 2 observations, we see that there is a difference of 0.5. In general, let

$t_2$ = the number of ties involving 2 observations;
$t_3$ = the number of ties involving 3 observations;
...
$t_k$ = the number of ties involving $k$ observations.

Now we calculate the correction factor

$$T = t_2\frac{2^3 - 2}{12} + t_3\frac{3^3 - 3}{12} + \cdots + t_k\frac{k^3 - k}{12}.$$

In the case of ties, we replace (6) by:

$$r_S = 1 - \frac{6\left(T + \sum_{i=1}^{n}(R_i - R_i^{*})^2\right)}{n(n^2-1)}.$$
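The tie-corrected formula can be traced on the example table. The sketch below assigns average ranks, accumulates the squared rank differences and the correction factor $T$, and evaluates the corrected formula. The helper names are hypothetical, and counting tie groups in both variables towards $T$ is an assumption of this sketch about how $t_2, \ldots, t_k$ are meant to be tallied.

```python
from collections import Counter

def average_ranks(values):
    """Rank from 1 to n; tied values share the average of their positions."""
    first = {}
    for position, value in enumerate(sorted(values), start=1):
        first.setdefault(value, position)
    counts = Counter(values)
    # A group of c tied values starting at position p occupies p, ..., p + c - 1.
    return [first[v] + (counts[v] - 1) / 2.0 for v in values]

def spearman_with_ties(xs, ys):
    """Return (r_S, sum of squared rank differences, correction factor T)."""
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    # Each group of k tied observations contributes (k^3 - k) / 12 to T.
    t = sum((c ** 3 - c) / 12.0
            for values in (xs, ys) for c in Counter(values).values())
    return 1.0 - 6.0 * (t + d2) / (n * (n * n - 1)), d2, t

x = [3, 6, 9, 12, 15, 18, 21, 24]
y = [10, 15, 30, 35, 25, 30, 50, 45]
r_s, d2, t = spearman_with_ties(x, y)
```

For the table above the squared differences sum to 14.5 and the single tie of two observations gives $T = 0.5$, so the formula evaluates to $1 - 90/504$.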

6 Comparing variances

Testing hypotheses concerning differences between means is well known and can be found in any textbook about statistics. Less is known about comparing variances. In the case of unpaired samples from normal distributions, the distribution of the quotient of the sample variances $s_1^2/s_2^2$ can be determined and is related to an F-distribution. In general, the analysis of $s_1^2/s_2^2$ is more complicated. In this section we study $s_1^2/s_2^2$ for large samples. We consider unpaired samples as well as paired samples.

Suppose that we have unpaired samples $X_1, X_2, \ldots, X_n$ from $X \sim A(\mu_1, \sigma_1^2)$ and $Y_1, Y_2, \ldots, Y_m$ from $Y \sim B(\mu_2, \sigma_2^2)$. In order to test whether or not $\sigma_2^2 = \sigma_1^2$ one can use a test based on $s_1^2$ and $s_2^2$. We need the following lemma.

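Lemma 7 Suppose that $E(X^4 + Y^4) < \infty$. As $n \to \infty$ and $m \to \infty$, we have

$$P(\sqrt{n}(s_1^2 - \sigma_1^2) \le x, \sqrt{m}(s_2^2 - \sigma_2^2) \le y) = P(\sqrt{n}(s_1^2 - \sigma_1^2) \le x)\,P(\sqrt{m}(s_2^2 - \sigma_2^2) \le y) \to P(U_1 \le x)P(U_2 \le y),$$

where $U_1 \sim N(0, \operatorname{Var}((X - \mu_1)^2))$ and $U_2 \sim N(0, \operatorname{Var}((Y - \mu_2)^2))$.

Now we consider the statistic

$$K = \frac{\sigma_2^2}{\sigma_1^2}\,\frac{s_1^2}{s_2^2} - 1.$$

Clearly we have

$$K = \frac{\sigma_2^2(s_1^2 - \sigma_1^2) - \sigma_1^2(s_2^2 - \sigma_2^2)}{s_2^2\sigma_1^2} = \frac{Q}{s_2^2\sigma_1^2}.$$

Now we write

$$\sqrt{n}\,Q = \sigma_2^2\sqrt{n}(s_1^2 - \sigma_1^2) - \sigma_1^2\sqrt{m}(s_2^2 - \sigma_2^2)\frac{\sqrt{n}}{\sqrt{m}}.$$

Using the notations of Lemma 7 we have the following result.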

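Theorem 8 Suppose that $E(X^4 + Y^4) < \infty$. If $n \to \infty$ and $m \to \infty$ in such a way that $n/m \to \lambda^2$ ($0 \le \lambda < \infty$), then

$$\sqrt{n}\,K \overset{d}{\Longrightarrow} V \overset{d}{=} \frac{1}{\sigma_1^2}U_1 - \frac{\lambda}{\sigma_2^2}U_2,$$

and $V \sim N(0, \sigma_V^2)$ with

$$\sigma_V^2 = \frac{1}{\sigma_1^4}\operatorname{Var}((X - \mu_1)^2) + \frac{\lambda^2}{\sigma_2^4}\operatorname{Var}((Y - \mu_2)^2). \qquad (7)$$

Proof. Using Lemma 7 we find that

$$\sqrt{n}\,Q \overset{d}{\Longrightarrow} W,$$

where $W \overset{d}{=} \sigma_2^2 U_1 - \lambda\sigma_1^2 U_2$. Using $s_i^2 \overset{P}{\to} \sigma_i^2$ ($i = 1, 2$), it follows that

$$\sqrt{n}\,K \overset{d}{\Longrightarrow} \frac{1}{\sigma_1^2\sigma_2^2}W \overset{d}{=} \frac{1}{\sigma_1^2}U_1 - \frac{\lambda}{\sigma_2^2}U_2.$$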

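Remarks.

1) If $\lambda = \infty$, we can interchange the role of $n$ and $m$.

2) From the practical point of view, we can use (7) to write

$$\frac{1}{n}\sigma_V^2 \approx \frac{1}{n}\,\frac{1}{\sigma_1^4}\operatorname{Var}((X - \mu_1)^2) + \frac{1}{m}\,\frac{1}{\sigma_2^4}\operatorname{Var}((Y - \mu_2)^2).$$

3) Note that the asymptotic variance depends on the kurtosis of the underlying distributions. We find that

$$\sigma_V^2 = \kappa(X) + 2 + \lambda^2(\kappa(Y) + 2),$$

and then

$$\frac{1}{n}\sigma_V^2 \approx \frac{1}{n}(\kappa(X) + 2) + \frac{1}{m}(\kappa(Y) + 2).$$

In terms of the standardized variables $X^{*} = (X - \mu_1)/\sigma_1$ and $Y^{*} = (Y - \mu_2)/\sigma_2$ we have $\sqrt{n}\,K \overset{d}{\Longrightarrow} V$, where $V \sim N(0, \sigma_V^2)$ with

$$\sigma_V^2 = \operatorname{Var}(X^{*2}) + \lambda^2\operatorname{Var}(Y^{*2}).$$

For samples from normal distributions (cf. Section 5.4) we find that $\sigma_V^2 = 2(1 + \lambda^2)$, and then

$$\frac{1}{n}\sigma_V^2 \approx 2\left(\frac{1}{n} + \frac{1}{m}\right).$$

4) If $\sigma_1^2 = \sigma_2^2 = \sigma^2$, we can study the pooled variance given by:

$$s_p^2 = \frac{(n-1)s_1^2 + (m-1)s_2^2}{n + m - 2}.$$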

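Now we find that

$$\sqrt{n}(s_p^2 - \sigma^2) = \left(\frac{n-1}{n+m-2}\right)\sqrt{n}(s_1^2 - \sigma^2) + \frac{m-1}{n+m-2}\,\frac{\sqrt{n}}{\sqrt{m}}\sqrt{m}(s_2^2 - \sigma^2).$$

It follows that

$$\sqrt{n}(s_p^2 - \sigma^2) \overset{d}{\Longrightarrow} W \overset{d}{=} \frac{\lambda^2}{\lambda^2 + 1}U_1 + \frac{\lambda}{\lambda^2 + 1}U_2.$$

In this case $W \sim N(0, \sigma_W^2)$, with

$$\sigma_W^2 = \left(\frac{\lambda^2}{\lambda^2 + 1}\right)^2\operatorname{Var}((X - \mu_1)^2) + \left(\frac{\lambda}{\lambda^2 + 1}\right)^2\operatorname{Var}((Y - \mu_2)^2).$$

In the case of samples from normal distributions with $\sigma_1^2 = \sigma_2^2 = \sigma^2$, we find that

$$\begin{aligned}
\sigma_W^2 &= 2\sigma^4\left(\frac{\lambda^2}{\lambda^2 + 1}\right)^2 + 2\sigma^4\left(\frac{\lambda}{\lambda^2 + 1}\right)^2 \\
&= 2\sigma^4\frac{\lambda^2}{1 + \lambda^2} \approx 2\sigma^4\frac{n}{n + m}.
\end{aligned}$$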
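For unpaired samples, the practical form of the variance in Remark 3 suggests a simple large-sample test of $H_0: \sigma_1^2 = \sigma_2^2$. Under $H_0$ the statistic reduces to $K = s_1^2/s_2^2 - 1$, with approximate variance $(\kappa(X)+2)/n + (\kappa(Y)+2)/m$. The sketch below replaces both kurtosis values by plug-in estimates; that substitution and the function names are assumptions of the illustration, not part of the paper.

```python
import math

def _summary(sample):
    """Return (n, sample variance s^2, plug-in kurtosis) of a sample."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    return n, n * m2 / (n - 1), m4 / m2 ** 2 - 3.0

def variance_ratio_z(xs, ys):
    """z-statistic for H0: sigma_1^2 = sigma_2^2 based on K = s_1^2 / s_2^2 - 1."""
    n, s2_x, kappa_x = _summary(xs)
    m, s2_y, kappa_y = _summary(ys)
    k = s2_x / s2_y - 1.0
    # Approximate Var(K) under H0: (kappa(X) + 2) / n + (kappa(Y) + 2) / m.
    return k / math.sqrt((kappa_x + 2.0) / n + (kappa_y + 2.0) / m)

# Same spread, different location: the statistic is zero.
z_equal = variance_ratio_z([1, 2, 3, 4, 5, 6], [11, 12, 13, 14, 15, 16])
# Second sample has four times the variance: the statistic is negative.
z_wider = variance_ratio_z([1, 2, 3, 4, 5, 6], [2, 4, 6, 8, 10, 12])
```

The statistic would be compared with standard normal quantiles; the tiny samples here only illustrate the arithmetic, since the approximation is a large-sample result.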

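Let $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ denote a sample from an arbitrary bivariate distribution $(X, Y) \sim A(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$. We prove the following result.

Theorem 9 Suppose that $E(X^4 + Y^4) < \infty$. As $n \to \infty$, we have

$$P(\sqrt{n}(s_1^2 - \sigma_1^2) \le x, \sqrt{n}(s_2^2 - \sigma_2^2) \le y) \to P(U_1 \le x, U_2 \le y),$$

where $(U_1, U_2)$ has a bivariate normal distribution with zero means and with variance-covariance matrix

$$\begin{pmatrix} \operatorname{Var}((X - \mu_1)^2) & \operatorname{Cov}((X - \mu_1)^2, (Y - \mu_2)^2) \\ \operatorname{Cov}((X - \mu_1)^2, (Y - \mu_2)^2) & \operatorname{Var}((Y - \mu_2)^2) \end{pmatrix}.$$

Proof. Take arbitrary real numbers $(u, v) \ne (0, 0)$ and consider the vectors

$$\vec{A} = (\bar X, \bar Y, \overline{X^2}, \overline{Y^2}),$$

$$\vec{\mu} = (\mu_1, \mu_2, E(X^2), E(Y^2)),$$

and the function $f(a, b, c, d) = u(c - a^2) + v(d - b^2)$. We have

$$f(\vec{A}) = u(\overline{X^2} - \bar X^2) + v(\overline{Y^2} - \bar Y^2) = \frac{n-1}{n}(us_1^2 + vs_2^2)$$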

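and $f(\vec{\mu}) = u\sigma_1^2 + v\sigma_2^2$. It is easy to see that $\vec{\theta} = (-2u\mu_1, -2v\mu_2, u, v)$. The transfer results of section 3.2 show that

$$P(\sqrt{n}(f(\vec{A}) - f(\vec{\mu})) \le x) \to P(W \le x),$$

where $W \sim N(0, \sigma_W^2)$ with $\sigma_W^2 = \vec{\theta}\,\Sigma\,\vec{\theta}^t$ and (only the upper triangle of the symmetric matrix is shown)

$$\Sigma = \begin{pmatrix}
\sigma_1^2 & \operatorname{Cov}(X, Y) & \operatorname{Cov}(X, X^2) & \operatorname{Cov}(X, Y^2) \\
 & \sigma_2^2 & \operatorname{Cov}(Y, X^2) & \operatorname{Cov}(Y, Y^2) \\
 & & \operatorname{Var}(X^2) & \operatorname{Cov}(X^2, Y^2) \\
 & & & \operatorname{Var}(Y^2)
\end{pmatrix}.$$

Straightforward calculations show that

$$\vec{\theta}\,\Sigma\,\vec{\theta}^t = \operatorname{Var}(u(X - \mu_1)^2 + v(Y - \mu_2)^2).$$

It follows that

$$P\left(\sqrt{n}\left(\frac{n-1}{n}(us_1^2 + vs_2^2) - (u\sigma_1^2 + v\sigma_2^2)\right) \le x\right) \to P(W \le x),$$

where $W \overset{d}{=} uU_1 + vU_2$, and $(U_1, U_2)$ has the desired bivariate normal distribution. It is clear that the correction factor $(n-1)/n$ is not important. The result follows by using the Cramér-Wold device.

As in Theorem 8, we consider $K$ and now we conclude that $P(\sqrt{n}\,K \le x) \to P(V \le x)$, where

$$V \overset{d}{=} \frac{1}{\sigma_1^2}U_1 - \frac{1}{\sigma_2^2}U_2.$$

We find that $V \sim N(0, \sigma_V^2)$ with

$$\sigma_V^2 = \frac{\operatorname{Var}((X - \mu_1)^2)}{\sigma_1^4} + \frac{\operatorname{Var}((Y - \mu_2)^2)}{\sigma_2^4} - 2\frac{\operatorname{Cov}((X - \mu_1)^2, (Y - \mu_2)^2)}{\sigma_1^2\sigma_2^2}.$$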

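Remarks.

1) We can rewrite $\sigma_V^2$ more compactly as follows. Using the notation $X^{*} = (X - \mu_1)/\sigma_1$ and $Y^{*} = (Y - \mu_2)/\sigma_2$ we have

$$\begin{aligned}
\sigma_V^2 &= \operatorname{Var}(X^{*2}) + \operatorname{Var}(Y^{*2}) - 2\operatorname{Cov}(X^{*2}, Y^{*2}) \\
&= \operatorname{Var}(X^{*2} - Y^{*2}) \\
&= E((X^{*2} - Y^{*2})^2).
\end{aligned}$$

2) For a bivariate normal distribution we find (cf. Section 5.4) that

$$\sigma_V^2 = E(X^{*4}) + E(Y^{*4}) - 2E(X^{*2}Y^{*2}) = 4(1 - \rho^2).$$

In the case of $\rho = 0$ we recover the result of the unpaired case with $\lambda = 1$.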
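For paired data, the compact form $\sigma_V^2 = E((X^{*2} - Y^{*2})^2)$ lends itself to a plug-in estimate: standardize both coordinates with their sample means and standard deviations and average the squared difference of the squares. The sketch below combines this with $K = s_1^2/s_2^2 - 1$ (the value of $K$ under $H_0: \sigma_1^2 = \sigma_2^2$) into a z-statistic. The function name and the plug-in estimation are assumptions of this illustration.

```python
import math

def paired_variance_ratio_z(xs, ys):
    """Return (z-statistic for H0: sigma_1^2 = sigma_2^2, estimate of sigma_V^2)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    k = vx / vy - 1.0                      # equals s_1^2 / s_2^2 - 1
    # Plug-in estimate of E((X*^2 - Y*^2)^2) with standardized observations.
    var_v = sum(((x - mx) ** 2 / vx - (y - my) ** 2 / vy) ** 2
                for x, y in zip(xs, ys)) / n
    return k / math.sqrt(var_v / n), var_v

# The second coordinate is a reshuffling of the first, so both variances agree.
z_stat, var_v = paired_variance_ratio_z([1, 2, 3, 4, 5, 6], [2, 1, 4, 3, 6, 5])
```

The estimate is zero when the standardized squares coincide for every pair (for example when $Y$ is an exact linear function of $X$); the sketch does not guard against that degenerate case.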

7 References

1. Bentkus, V., Jing, B.Y., Shao, Q.M. and Zhou, W. (2006). Limiting distributions of the non-central t-statistic and their applications to the power of t-tests under non-normality. Bernoulli 13:2, 346-364.

2. P. Billingsley (1968). Convergence of probability measures. Wiley, New York.

3. W. Feller (1971). An introduction to probability theory and its applications, Vol. 2 (2nd edition). Wiley, New York.

4. G. Grimmet and D. Stirzaker (2002). Probability and Random Processes (3rd edition). Oxford University Press, London.

5. Ladoucette, S.A. (2007). Analysis of Heavy-Tailed Risks. Ph.D. Thesis, Catholic University of Leuven.

6. [...] and applications. To appear: Stochastics: An International Journal of Probability and Stochastic Processes, Vol. 80, N 2-3, 211-227. Available on http://arxiv.org/abs/0712.3440

7. S. Ross (1998). A first course in probability (5th edition). Prentice-Hall, New York.

8. O. Rykunova (1997). Some applications of asymptotic distribution of the sample correlation coefficient. Proceedings of Tartu Conference on Computational Statistics and Statistics Education (Ed. E.M. Tiit), University of Tartu, Estonia, 140-147.

9. Staff of Research and Education Association (1986). The Statistics Problem Solver. Research and Education Association, New York.
