
Plan for lecture 12

1. The distribution of the sample variance of independent,
   normally distributed random variables (Chap 8.3)
2. The t and F distributions (Chap 8.4, until Beta Distribution)

The distribution of the sample variance

Theorem
(Theorem 8.3.6) Let $X_1, \ldots, X_n$ be independent, $N(\mu, \sigma^2)$
distributed. Then:
1. $\bar{X}$ and $X_i - \bar{X}$ are independent.
2. $\bar{X}$ and $S^2$ are independent.
3. $\frac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1)$.

We now prove Statement 3.
Recall
\[
\frac{(n-1)S^2}{\sigma^2} = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{\sigma^2}
\quad \text{and} \quad
\sum_{i=1}^{n} \frac{(X_i - \mu)^2}{\sigma^2} \sim \chi^2(n)
\]

Interpretation: if we replace $\mu$ by its estimator $\bar{X}$, we have to
hand in a degree of freedom.
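Statement 3 is easy to check numerically. A minimal Monte Carlo sketch (not from the lecture; the values of `mu`, `sigma`, `n`, and the repetition count are arbitrary choices): the scaled statistic should have the mean $n-1$ and variance $2(n-1)$ of a $\chi^2(n-1)$ variable.

```python
# Monte Carlo check of Theorem 8.3.6(3): for a normal sample,
# (n-1)S^2/sigma^2 should follow a chi-square(n-1) distribution.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 5.0, 2.0, 8, 200_000

x = rng.normal(mu, sigma, size=(reps, n))
s2 = x.var(axis=1, ddof=1)            # sample variance S^2 of each row
w = (n - 1) * s2 / sigma**2           # scaled statistic

# chi-square(n-1) has mean n-1 and variance 2(n-1)
print(w.mean())   # close to 7
print(w.var())    # close to 14
```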

The distribution of the sample variance $S^2$

Proof
We prove the theorem for $n = 2$; in other words, we show that
$\frac{S^2}{\sigma^2} \sim \chi^2(1)$.
\[
S^2 = \frac{1}{2-1} \sum_{i=1}^{2} \left( X_i - \bar{X} \right)^2
    = \frac{(X_1 - X_2)^2}{2}.
\]
$X_1, X_2$ are $N(\mu, \sigma^2)$ distributed.
Consequently $X_1 - X_2$ is normally distributed (as a linear
combination of normally distributed variables), with expected value
$E(X_1) - E(X_2) = 0$ and variance $V(X_1) + V(X_2) = 2\sigma^2$ ($X_1$ and
$X_2$ are independent, so $\mathrm{cov}(X_1, X_2) = 0$).

The distribution of the sample variance $S^2$

- $Z = \dfrac{X_1 - X_2}{\sqrt{2}\,\sigma}$ is standard normally distributed.
- $Z^2$ is $\chi^2(1)$ distributed.
- From $Z^2 = \dfrac{(2-1)S^2}{\sigma^2}$ it follows that $\dfrac{(2-1)S^2}{\sigma^2}$ is $\chi^2(1)$ distributed.

- In the same way one can prove that $\dfrac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1)$.
- Idea: Write
  \[
  \frac{(n-1)S^2}{\sigma^2}
  = \frac{1}{\sigma^2} \sum_{i=1}^{n} (X_i - \mu)^2 - n\,\frac{(\bar{X} - \mu)^2}{\sigma^2}
  \]
- $\dfrac{1}{\sigma^2} \sum_{i=1}^{n} (X_i - \mu)^2 \sim \chi^2(n)$ and $n\,\dfrac{(\bar{X} - \mu)^2}{\sigma^2} \sim \chi^2(1)$.
- Show (using the method of the moment generating function)
  that if $U, V$ are two independent variables such that
  $U + V \sim \chi^2(k + l)$ and $V \sim \chi^2(l)$, then $U \sim \chi^2(k)$.
- $S^2$ and $\bar{X}$ are independent variables.
  Then $\dfrac{(n-1)S^2}{\sigma^2}$ and $n\,\dfrac{(\bar{X} - \mu)^2}{\sigma^2}$ are also independent.
- This gives $\dfrac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1)$.

The expected value of $S^2$

$E(S^2) = \sigma^2$.
Proof:
- The expected value of a $\chi^2(k)$ distributed variable is $k$, so
  \[
  E(S^2) = \frac{\sigma^2}{n-1}\, E\!\left( \frac{(n-1)S^2}{\sigma^2} \right)
         = \frac{\sigma^2}{n-1}\,(n-1) = \sigma^2
  \]

Thus, $S^2$ on average equals $\sigma^2$.

The variance of $S^2$

\[
\mathrm{Var}(S^2) = \frac{2\sigma^4}{n-1}
\]
Interpretation: The larger the value of $n$, the more the
distribution of $S^2$ concentrates around $\sigma^2$ (the variance of $S^2$
approaches 0).
Proof:
- The variance of a $\chi^2(k)$-distributed variable is $2k$, so
  \[
  \mathrm{Var}(S^2)
  = \left( \frac{\sigma^2}{n-1} \right)^2 \mathrm{Var}\!\left( \frac{(n-1)S^2}{\sigma^2} \right)
  = \left( \frac{\sigma^2}{n-1} \right)^2 2(n-1)
  = \frac{2\sigma^4}{n-1}
  \]
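Both moment results can be checked together by simulation. A small sketch, with `sigma`, `n`, and the repetition count chosen arbitrarily:

```python
# Numerical check: E(S^2) = sigma^2 and Var(S^2) = 2*sigma^4/(n-1)
# for the sample variance of a normal sample.
import numpy as np

rng = np.random.default_rng(1)
sigma, n, reps = 3.0, 10, 300_000

s2 = rng.normal(0.0, sigma, size=(reps, n)).var(axis=1, ddof=1)

print(s2.mean())   # close to sigma^2 = 9
print(s2.var())    # close to 2*sigma^4/(n-1) = 18
```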

The t-distribution

- Let $X_1, \ldots, X_n$ be a sample from the $N(\mu, \sigma^2)$.
- $Z = (\bar{X} - \mu)/(\sigma/\sqrt{n}) \sim N(0, 1)$
- Which distribution can we use when $\sigma$ is not known and is
  estimated by $S$? In other words, what is the distribution of
  the following random variable?
  \[
  \frac{\bar{X} - \mu}{S/\sqrt{n}}
  \]
- Observe that
  \[
  \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{(\bar{X} - \mu)/(\sigma/\sqrt{n})}{S/\sigma}
  \]
- We know that $V = (n-1)S^2/\sigma^2$ is $\chi^2(n-1)$ distributed;
  $Z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$ and $V = (n-1)S^2/\sigma^2$ are independent.
- Hence,
  \[
  \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{(\bar{X} - \mu)/(\sigma/\sqrt{n})}{S/\sigma}
  = \frac{Z}{\sqrt{\frac{V}{n-1}}}
  \]

The t-distribution

- Let $Z$ be a $N(0, 1)$ distributed variable.
- Let $V$ be a $\chi^2(\nu)$ distributed variable.
- If $Z$ and $V$ are independent, then
  \[
  T = \frac{Z}{\sqrt{V/\nu}}
  \]
  is t-distributed with $\nu$ degrees of freedom.

The t-distribution

In our case, if $X_1, \ldots, X_n$ is a random sample from $N(\mu, \sigma^2)$:
- $Z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$ is $N(0, 1)$ distributed;
- $V = (n-1)S^2/\sigma^2$ is $\chi^2(n-1)$ distributed;
- $Z$ and $V$ are independent.
Thus,
\[
\frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{Z}{\sqrt{V/\nu}}
\]
is t-distributed with $\nu = n - 1$ degrees of freedom.
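This result can be checked against a library implementation of the t-distribution. A simulation sketch; `mu`, `sigma`, `n`, and the evaluation points are arbitrary illustration values:

```python
# Check that the studentized mean sqrt(n)*(Xbar - mu)/S follows t(n-1):
# compare empirical and theoretical cdf values at a few points.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
mu, sigma, n, reps = 1.0, 4.0, 6, 200_000

x = rng.normal(mu, sigma, size=(reps, n))
t_stat = np.sqrt(n) * (x.mean(axis=1) - mu) / x.std(axis=1, ddof=1)

for q in (-2.0, 0.0, 1.5):
    emp = (t_stat <= q).mean()           # empirical P(T <= q)
    print(emp, stats.t.cdf(q, df=n - 1)) # the two should be close
```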

The t-distribution
What are the pdf and cdf of a t distribution?
- Let $T = Z/\sqrt{V/\nu}$.
- We want to calculate the cdf and pdf of a t-distributed
  random variable. It is sufficient to calculate
  $F_T(t) = P(T \le t)$.
\[
P(T \le t) = P\!\left(Z/\sqrt{V/\nu} \le t\right)
= \int_0^{+\infty} P\!\left(Z/\sqrt{V/\nu} \le t \mid V = a\right) f_V(a)\, da
= \int_0^{+\infty} P\!\left(Z \le t\sqrt{a/\nu} \mid V = a\right) f_V(a)\, da
\]
Because $Z$ and $V$ are independent,
\[
P(T \le t) = \int_0^{+\infty} P\!\left(Z \le t\sqrt{a/\nu}\right) f_V(a)\, da
\]

The t-distribution
- By differentiating over $t$ it follows (using the chain rule) that:
  \[
  f_T(t) = \int_0^{+\infty} \sqrt{a/\nu}\; \varphi\!\left(t\sqrt{a/\nu}\right) f_V(a)\, da,
  \]
  where $\varphi$ is the $N(0,1)$ density. Writing this out,
  \[
  f_T(t) = \int_0^{+\infty} \sqrt{\frac{a}{\nu}}\, \frac{1}{\sqrt{2\pi}}\, e^{-t^2 a/(2\nu)}
  \cdot \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\, a^{\nu/2-1} e^{-a/2}\, da
  = C \int_0^{+\infty} a^{(\nu+1)/2 - 1}\, e^{-(1 + t^2/\nu)\,a/2}\, da
  \]
  with $C = \dfrac{1}{\sqrt{\nu\pi}\; 2^{(\nu+1)/2}\, \Gamma(\nu/2)}$.
- We recognize that the integrand is close to the density function
  of a Gamma distribution with $k = \frac{\nu+1}{2}$ and $\lambda = \frac{2}{1 + t^2/\nu}$.
- Normalizing such an integral with $\lambda^k\, \Gamma(k)$ will result in 1.
  Hence the integral equals $\lambda^k\, \Gamma(k)$.
  \[
  f_T(t) = C\, \lambda^k\, \Gamma(k)
  = C \left( \frac{2}{1 + t^2/\nu} \right)^{\frac{\nu+1}{2}} \Gamma\!\left( \frac{\nu+1}{2} \right)
  = C\, 2^{\frac{\nu+1}{2}}\, \Gamma\!\left( \frac{\nu+1}{2} \right) \left( 1 + \frac{t^2}{\nu} \right)^{-\frac{\nu+1}{2}}
  = \frac{1}{\sqrt{\nu\pi}}\, \frac{\Gamma((\nu+1)/2)}{\Gamma(\nu/2)} \left( 1 + \frac{t^2}{\nu} \right)^{-\frac{\nu+1}{2}}
  \]

Properties of the t distribution

Let $T$ be a t-distributed variable with $\nu$ degrees of freedom.
- The probability density function of a t-distributed variable is
  given by
  \[
  f(t) = \frac{1}{\sqrt{\nu\pi}}\, \frac{\Gamma((\nu+1)/2)}{\Gamma(\nu/2)}
  \left( 1 + \frac{t^2}{\nu} \right)^{-\frac{\nu+1}{2}}
  \]
- For $\nu > 1$, $E(T) = 0$.
- For $\nu > 2$, $\mathrm{Var}(T) = \dfrac{\nu}{\nu - 2}$.
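The derived density can be compared directly with a library implementation. A small sketch; the values of $\nu$ and the evaluation grid are arbitrary:

```python
# Check the closed-form t density
#   f(t) = Gamma((v+1)/2) / (sqrt(v*pi)*Gamma(v/2)) * (1 + t^2/v)^(-(v+1)/2)
# against scipy.stats.t.pdf.
import numpy as np
from scipy import stats
from scipy.special import gammaln

def t_pdf(t, nu):
    # log-gamma form avoids overflow for large nu
    logc = gammaln((nu + 1) / 2) - gammaln(nu / 2) - 0.5 * np.log(nu * np.pi)
    return np.exp(logc) * (1 + t**2 / nu) ** (-(nu + 1) / 2)

ts = np.linspace(-5, 5, 101)
for nu in (1, 3, 30):
    assert np.allclose(t_pdf(ts, nu), stats.t.pdf(ts, df=nu))
print("derived formula matches scipy.stats.t.pdf")
```

For $\nu = 1$ this reduces to the Cauchy density $1/(\pi(1+t^2))$, which has no expected value, matching the restriction $\nu > 1$ for $E(T)$.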

The F-distribution

- Let $X_{11}, \ldots, X_{1n_1}$ be a random sample from $N(\mu_1, \sigma_1^2)$ ($\sigma_1$ is
  unknown).
- Let $X_{21}, \ldots, X_{2n_2}$ be a different random sample from
  $N(\mu_2, \sigma_2^2)$. ($\sigma_2$ is unknown.)
- Compare $\sigma_1^2$ and $\sigma_2^2$.
- We can use
  \[
  S_1^2 = \frac{1}{n_1 - 1} \sum_{i=1}^{n_1} \left( X_{1i} - \bar{X}_1 \right)^2,
  \qquad
  S_2^2 = \frac{1}{n_2 - 1} \sum_{i=1}^{n_2} \left( X_{2i} - \bar{X}_2 \right)^2
  \]
  to estimate $\sigma_1^2$ and $\sigma_2^2$, respectively.

The F distribution

- $V_1$ is a $\chi^2(\nu_1)$-distributed random variable,
- $V_2$ is a $\chi^2(\nu_2)$-distributed random variable,
- $V_1$ and $V_2$ are independent.
- The distribution of
  \[
  \frac{V_1/\nu_1}{V_2/\nu_2}
  \]
  is called the F-distribution with $\nu_1$ degrees of freedom in the
  numerator and $\nu_2$ degrees of freedom in the denominator and
  is denoted by $F(\nu_1, \nu_2)$.
- The probability density function is given by
  \[
  \frac{\Gamma((\nu_1 + \nu_2)/2)}{\Gamma(\nu_1/2)\,\Gamma(\nu_2/2)}
  \left( \frac{\nu_1}{\nu_2} \right)^{\nu_1/2}
  \frac{x^{\nu_1/2 - 1}}{\left( 1 + (\nu_1/\nu_2)\,x \right)^{(\nu_1 + \nu_2)/2}}
  \quad \text{for } x \ge 0.
  \]
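As with the t density, this formula can be checked against a library implementation; the degrees of freedom and the grid below are arbitrary illustration values:

```python
# Check the F density formula against scipy.stats.f.pdf.
import numpy as np
from scipy import stats
from scipy.special import gamma

def f_pdf(x, v1, v2):
    c = gamma((v1 + v2) / 2) / (gamma(v1 / 2) * gamma(v2 / 2))
    return (c * (v1 / v2) ** (v1 / 2) * x ** (v1 / 2 - 1)
            / (1 + (v1 / v2) * x) ** ((v1 + v2) / 2))

xs = np.linspace(0.1, 6, 60)
for v1, v2 in [(8, 4), (4, 8), (2, 10)]:
    assert np.allclose(f_pdf(xs, v1, v2), stats.f.pdf(xs, v1, v2))
print("formula matches scipy.stats.f.pdf")
```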

The F distribution

- $V_1 = \dfrac{(n_1 - 1)S_1^2}{\sigma_1^2}$ has a $\chi^2(n_1 - 1)$ distribution,
- $V_2 = \dfrac{(n_2 - 1)S_2^2}{\sigma_2^2}$ has a $\chi^2(n_2 - 1)$ distribution,
- $V_1$ and $V_2$ are independent.
This gives: the random variable
\[
F = \frac{V_1/(n_1 - 1)}{V_2/(n_2 - 1)} = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}
\]
is $F(n_1 - 1, n_2 - 1)$ distributed.
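A quick simulation of this variance-ratio statistic; the sample sizes, standard deviations, and repetition count are arbitrary choices, and we compare only the empirical median with the theoretical one:

```python
# Simulate (S1^2/sigma1^2)/(S2^2/sigma2^2) for two independent normal
# samples and compare with the F(n1-1, n2-1) distribution.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n1, n2, s1, s2, reps = 7, 11, 2.0, 5.0, 100_000

v1 = rng.normal(0, s1, size=(reps, n1)).var(axis=1, ddof=1) / s1**2
v2 = rng.normal(0, s2, size=(reps, n2)).var(axis=1, ddof=1) / s2**2
ratio = v1 / v2

# empirical vs. theoretical median of F(6, 10)
print(np.median(ratio), stats.f.median(n1 - 1, n2 - 1))
```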

Property of the F-distribution

If $X \sim F(\nu_1, \nu_2)$, then $Y = \frac{1}{X} \sim F(\nu_2, \nu_1)$.
Thus $P(Y \le y) = P\!\left(\frac{1}{X} \le y\right) = P\!\left(X \ge \frac{1}{y}\right) = 1 - P\!\left(X < \frac{1}{y}\right)$.

Looking up in the table on page 609

1. Assume that we want to find $b$ for which $P(X < b) = 0.9$, if
   $X \sim F(8, 4)$.
   We search in the column $\nu_1 = 8$ and the group for $\nu_2 = 4$, row 0.9,
   and get $b = 3.95$.
2. Assume that we want to find $b$ for which $P(X < b) = 0.1$, if
   $X \sim F(8, 4)$.
   0.1 is not tabulated, but we can find it as follows.
   The probability $P(X < b) = P\!\left(\frac{1}{X} > \frac{1}{b}\right) = 1 - P\!\left(Y < \frac{1}{b}\right)$ with
   $Y \sim F(4, 8)$. We therefore have to find $b$ such that
   $P\!\left(Y < \frac{1}{b}\right) = 0.9$ for $Y \sim F(4, 8)$. In the column $\nu_1 = 4$, group
   $\nu_2 = 8$, we find the value 2.81. Therefore $b = \frac{1}{2.81}$.
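The same lookups can be reproduced without a table: `scipy.stats.f.ppf(q, v1, v2)` returns the $b$ with $P(X < b) = q$, and the reciprocal trick above becomes a one-liner.

```python
# Reproduce the two table lookups with scipy's quantile function.
from scipy import stats

b1 = stats.f.ppf(0.9, 8, 4)    # example 1: about 3.95
b2 = stats.f.ppf(0.1, 8, 4)    # example 2: about 1/2.81

print(round(b1, 2))
print(round(1 / b2, 2))        # equals the F(4, 8) 0.9-quantile
```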

Exercise 8.15

1. Let $X_i \sim N(\mu, \sigma^2)$, $i = 1, \ldots, n$ and $Z_i \sim N(0, 1)$, $i = 1, \ldots, k$ be
independent random variables. Give the probability distribution of
the following variables:
(h) $\dfrac{Z_1}{Z_2}$
(e) $\dfrac{\sqrt{n}(\bar{X} - \mu)}{\sigma S_Z}$
(p) $\dfrac{(k-1) \sum_{i=1}^{n} (X_i - \bar{X})^2}{(n-1)\, \sigma^2 \sum_{i=1}^{k} (Z_i - \bar{Z})^2}$

(h)
- $Z_1 \sim N(0, 1)$ and $Z_2^2 \sim \chi^2(1)$ (see Theorem 8.3.5).
- $Z_1$ and $Z_2$ are independent (VERY IMPORTANT), so from
  the definition of the t-distribution it follows that
  \[
  \frac{Z_1}{\sqrt{Z_2^2}} \sim t(1),
  \]
  and by the symmetry of $Z_1$ the ratio $Z_1/Z_2$ has the same distribution as $Z_1/\sqrt{Z_2^2}$.

(e) The variable whose distribution we have to determine is defined
in terms of the variables $\bar{X}$ and $S_Z$, so we think of
distributions related to these variables and try to combine them.
- $X_i \sim N(\mu, \sigma^2)$, so $\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$ (see slides Lecture 10 or
  Theorem 8.3.1 in the textbook (+corollary)).
- $Z_i \sim N(0, 1)$, $i = 1, \ldots, k$, so $\dfrac{(k-1)S_Z^2}{1^2} \sim \chi^2(k-1)$ (see Lecture
  11 or Theorem 8.3.6 (c) in the textbook).
- Because the $X_i$ and $Z_i$ are mutually independent, $\bar{X}$ and $S_Z^2$ are
  also mutually independent.
- According to the definition of the t-distribution, it follows
  that
  \[
  \frac{(\bar{X} - \mu)/(\sigma/\sqrt{n})}{\sqrt{(k-1)S_Z^2/(k-1)}} \sim t(k-1).
  \]
- After a few calculations it follows that:
  \[
  \frac{\sqrt{n}(\bar{X} - \mu)}{\sigma S_Z} \sim t(k-1).
  \]

(p)
- $\dfrac{(n-1)S_X^2}{\sigma^2} \sim \chi^2(n-1)$
- $\dfrac{(k-1)S_Z^2}{1^2} \sim \chi^2(k-1)$
- Because $X_i$, $i = 1, \ldots, n$ and $Z_i$, $i = 1, \ldots, k$ are mutually
  independent, $S_X$ and $S_Z$ are also mutually independent.
- From the definition of the F-distribution it follows that:
  \[
  \frac{\left( (n-1)S_X^2/\sigma^2 \right)/(n-1)}{(k-1)S_Z^2/(k-1)} \sim F(n-1, k-1).
  \]
- This is equivalent to:
  \[
  \frac{(k-1) \sum_{i=1}^{n} (X_i - \bar{X})^2}{(n-1)\, \sigma^2 \sum_{i=1}^{k} (Z_i - \bar{Z})^2} \sim F(n-1, k-1).
  \]

Exercise 2
2. Let $V$ be a $\chi^2(k)$ distributed random variable. Give an
approximation for the probability $P(V/k \le u)$ for $k$ large.
- We can view $V$ as the sum of $Y_1, Y_2, \ldots, Y_k$, with the $Y_i$ being
  independent, $\chi^2(1)$ distributed random variables.
- $V/k$ can be seen as the sample mean of the random sample
  $Y_1, \ldots, Y_k$ from a $\chi^2(1)$ distribution.
- The $\chi^2(1)$ distribution has expected value $\mu = 1$ and variance
  $\sigma^2 = 2$.
- From the CLT (variant for the sample mean) it follows that $V/k$ can
  be approximated by a normal distribution with expected value
  1 and variance $2/k$ (standard deviation $\sqrt{2/k}$), so
  $P(V/k \le u) \approx \Phi\!\left( \frac{u - 1}{\sqrt{2/k}} \right)$.
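The quality of this approximation can be inspected numerically; the values of `u` and `k` below are arbitrary illustration choices, and the error should shrink as $k$ grows:

```python
# Compare the exact probability P(V/k <= u) for V ~ chi2(k) with the
# CLT approximation Phi((u - 1) / sqrt(2/k)).
import numpy as np
from scipy import stats

u = 1.1
for k in (10, 100, 1000):
    exact = stats.chi2.cdf(u * k, df=k)
    approx = stats.norm.cdf((u - 1) / np.sqrt(2 / k))
    print(k, exact, approx)
```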

Exercise 3

3. Let $X_1, \ldots, X_n$ be a random sample from a normal distribution
with variance $\sigma^2$. Give an approximation for the distribution of $S^2$
for $n$ large.
- Since $(n-1)S^2/\sigma^2 \sim \chi^2(n-1)$, it follows from Exercise 2 that the
  distribution of $S^2/\sigma^2$ can be approximated by a normal
  distribution with expected value 1 and variance $2/(n-1)$.
- From $S^2 = \sigma^2 \cdot S^2/\sigma^2$ and $S^2/\sigma^2$ being approximately normally
  distributed, it follows that $S^2$ is approximately normally
  distributed with expected value $\sigma^2$ and variance $2\sigma^4/(n-1)$.

Exercise 4
The Rayleigh distribution with parameter 1 has the probability
density function given by
\[
f(x) = \begin{cases} x\, e^{-x^2/2} & \text{for } x > 0 \\ 0 & \text{otherwise} \end{cases}
\]
(a) Show that if $X$ is Rayleigh distributed with parameter 1, the
moment generating function of $Y = X^2$ is given by
$M_Y(t) = (1 - 2t)^{-1}$ for $t < \frac{1}{2}$.
(b) Use the moment generating function to derive $E(Y)$ and
$\mathrm{Var}(Y)$.
(c) Define $S = \sum_{i=1}^{n} X_i^2$. Show that for large $n$, the distribution of
the variable
\[
U = \sqrt{n}\left( \frac{S}{2n} - 1 \right)
\]
can be approximated by a standard normal distribution.

Solution. (a) Substituting $w = x^2$,
\[
M_Y(t) = E[e^{tX^2}] = \int_0^{\infty} e^{tx^2}\, x\, e^{-x^2/2}\, dx
= \frac{1}{2} \int_0^{\infty} e^{tw}\, e^{-w/2}\, dw
= (1 - 2t)^{-1}
\]
for $t < \frac{1}{2}$.
(b) Calculate $E(Y) = M_Y'(0) = 2$ and $E(Y^2) = M_Y''(0) = 8$. The
variance of $Y$ is thus equal to 4.
(c) From the Central Limit Theorem (variant with the sample mean),
the standard normal distribution is the limit distribution of
\[
\frac{S/n - 2}{2/\sqrt{n}}.
\]
It follows that, for large $n$,
\[
U = \sqrt{n}\left( \frac{S}{2n} - 1 \right) = \frac{S/n - 2}{2/\sqrt{n}}
\]
could also be approximated by the $N(0, 1)$ distribution.
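All three parts can be checked by simulation. A sketch using numpy's Rayleigh sampler (with `scale=1.0` it has exactly the density $x e^{-x^2/2}$ above); `n` and the repetition count are arbitrary:

```python
# Check Exercise 4: for X ~ Rayleigh(1), Y = X^2 should have mean 2 and
# variance 4, and U = sqrt(n)*(S/(2n) - 1) should be roughly N(0, 1).
import numpy as np

rng = np.random.default_rng(4)
n, reps = 400, 50_000

x = rng.rayleigh(scale=1.0, size=(reps, n))
y = x**2

print(y.mean())   # close to E(Y) = 2
print(y.var())    # close to Var(Y) = 4

s = y.sum(axis=1)
u = np.sqrt(n) * (s / (2 * n) - 1)
print(u.mean(), u.var())   # close to 0 and 1
```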
