
15-08-2010_1300IST

INTERESTING RESULTS

Let us state a few useful results which are used frequently in the study of interval estimation as well as hypothesis testing.

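Consider $(X_1, X_2, \ldots, X_n)$, a set of $n$ random observations drawn from $X \sim N(\mu, \sigma^2)$, such that each time we sample we obtain the realized values $(x_{1,j}, x_{2,j}, \ldots, x_{n,j})$, where $j$ indexes the samples we take. Then

$$X_1 + X_2 + \cdots + X_n \sim N(n\mu, n\sigma^2) \quad\text{and}\quad \bar{X}_n \sim N\left(\mu, \frac{\sigma^2}{n}\right), \quad\text{i.e.,}\quad \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \sim N(0,1).$$

Now let us define

$$\text{(i)}\;\; {s'_n}^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \mu)^2 \quad\text{and}\quad \text{(ii)}\;\; s_n^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X}_n)^2.$$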

Result 1

Then we have

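$$\frac{n\,{s'_n}^2}{\sigma^2} = \sum_{i=1}^{n}\left(\frac{X_i - \mu}{\sigma}\right)^2 \sim \chi^2_n \quad\text{and}\quad \frac{(n-1)\,s_n^2}{\sigma^2} = \sum_{i=1}^{n}\left(\frac{X_i - \bar{X}_n}{\sigma}\right)^2 \sim \chi^2_{n-1}.$$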

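Here we see that the second statistic loses one degree of freedom, because $\mu$ is replaced by the sample mean, which is computed from the same observations:

$$\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{1}{n}(X_1 + X_2 + \cdots + X_n).$$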

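Note: For a normal distribution, $\bar{X}_n$ and $s_n^2$ are the sample mean and the sample variance. The sample mean is normally distributed, while the scaled sum of squared deviations follows $\chi^2_n$ when the population mean $\mu$ is known (deviations taken about $\mu$) and $\chi^2_{n-1}$ when $\mu$ is unknown (deviations taken about $\bar{X}_n$).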
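The loss of one degree of freedom in Result 1 can be checked by simulation. The following is a minimal sketch, not part of the original notes: the function name, the parameter values and the seed are all assumptions chosen for illustration. It uses the fact that a $\chi^2_k$ variable has mean $k$, so the two scaled sums should average close to $n$ and $n-1$ respectively.

```python
import random
import statistics


def scaled_sum_of_squares_means(n=5, mu=2.0, sigma=0.5, reps=20000, seed=0):
    """Average the two chi-square statistics of Result 1 over many samples.

    All default values are hypothetical; they only serve the illustration.
    """
    rng = random.Random(seed)
    about_mu, about_xbar = [], []
    for _ in range(reps):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(xs)
        # Deviations about the known mean: distributed chi-square with n d.f.
        about_mu.append(sum((x - mu) ** 2 for x in xs) / sigma ** 2)
        # Deviations about the sample mean: chi-square with n - 1 d.f.
        about_xbar.append(sum((x - xbar) ** 2 for x in xs) / sigma ** 2)
    return statistics.fmean(about_mu), statistics.fmean(about_xbar)


mean_known, mean_unknown = scaled_sum_of_squares_means()
print(mean_known, mean_unknown)  # expected to be close to 5 and 4
```

The simulated averages only approximate $n$ and $n-1$; increasing `reps` tightens the agreement.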


Result 2

Then we have

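$$\frac{\sqrt{n}\,(\bar{X}_n - \mu)/\sigma}{\sqrt{\dfrac{(n-1)\,s_n^2}{\sigma^2}\Big/(n-1)}} = \frac{\sqrt{n}\,(\bar{X}_n - \mu)}{s_n} \sim t_{n-1}.$$

In case $n$ is large, we know that $t_{n-1}$ is approximately $N(0,1)$.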

Result 3

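Assume $(X_1, X_2, \ldots, X_m)$ are $m$ random observations drawn from $X \sim N(\mu_1, \sigma_1^2)$, with realized values $(x_{1,j}, x_{2,j}, \ldots, x_{m,j})$, where $j$ indexes such samples, and let $(Y_1, Y_2, \ldots, Y_n)$ be $n$ random observations drawn from $Y \sim N(\mu_2, \sigma_2^2)$, with realized values $(y_{1,k}, y_{2,k}, \ldots, y_{n,k})$, where $k$ indexes such samples. Then

$$\bar{X}_m = \frac{1}{m}\sum_{i=1}^{m} X_i = \frac{1}{m}(X_1 + X_2 + \cdots + X_m) \quad\text{and}\quad \bar{Y}_n = \frac{1}{n}\sum_{j=1}^{n} Y_j = \frac{1}{n}(Y_1 + Y_2 + \cdots + Y_n)$$

are the sample means for the two samples of size $m$ and $n$ respectively. We define

$${s'_m}^2 = \frac{1}{m}\sum_{i=1}^{m}(X_i - \mu_1)^2, \quad s_m^2 = \frac{1}{m-1}\sum_{i=1}^{m}(X_i - \bar{X}_m)^2, \quad {s'_n}^2 = \frac{1}{n}\sum_{j=1}^{n}(Y_j - \mu_2)^2, \quad s_n^2 = \frac{1}{n-1}\sum_{j=1}^{n}(Y_j - \bar{Y}_n)^2.$$

Then

$$\frac{m\,{s'_m}^2}{\sigma_1^2} = \sum_{i=1}^{m}\left(\frac{X_i - \mu_1}{\sigma_1}\right)^2 \sim \chi^2_m \quad\text{and}\quad \frac{(m-1)\,s_m^2}{\sigma_1^2} = \sum_{i=1}^{m}\left(\frac{X_i - \bar{X}_m}{\sigma_1}\right)^2 \sim \chi^2_{m-1},$$

$$\frac{n\,{s'_n}^2}{\sigma_2^2} = \sum_{j=1}^{n}\left(\frac{Y_j - \mu_2}{\sigma_2}\right)^2 \sim \chi^2_n \quad\text{and}\quad \frac{(n-1)\,s_n^2}{\sigma_2^2} = \sum_{j=1}^{n}\left(\frac{Y_j - \bar{Y}_n}{\sigma_2}\right)^2 \sim \chi^2_{n-1}.$$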


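Then, with the two samples independent of each other,

$$\frac{\dfrac{m\,{s'_m}^2}{\sigma_1^2}\Big/m}{\dfrac{n\,{s'_n}^2}{\sigma_2^2}\Big/n} = \frac{\sigma_2^2\,{s'_m}^2}{\sigma_1^2\,{s'_n}^2} \sim F_{m,n} \quad\text{and}\quad \frac{\dfrac{(m-1)\,s_m^2}{\sigma_1^2}\Big/(m-1)}{\dfrac{(n-1)\,s_n^2}{\sigma_2^2}\Big/(n-1)} = \frac{\sigma_2^2\,s_m^2}{\sigma_1^2\,s_n^2} \sim F_{m-1,n-1}$$

are true.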



STATISTICAL INFERENCE: INTERVAL ESTIMATION

We consider a set of $n$ observations, $(X_1, X_2, \ldots, X_n)$, and find two statistics, $t_{n,1}(X_1, X_2, \ldots, X_n)$ and $t_{n,2}(X_1, X_2, \ldots, X_n)$, in order to estimate the population parameter $\theta$. Moreover, we are interested in $(1-\alpha)$ as the level of confidence with which the value of the parameter, which is itself a constant, lies in the random interval between $t_{n,1}(X_1, X_2, \ldots, X_n)$ and $t_{n,2}(X_1, X_2, \ldots, X_n)$.

[Figure: realized interval endpoints $t_{n,1,1}$, $t_{n,1,2}$, $t_{n,2,1}$ and $t_{n,2,2}$ marked on a line.]

Thus we should have $P[t_{n,1}(X_1, X_2, \ldots, X_n) \le \theta \le t_{n,2}(X_1, X_2, \ldots, X_n)] = (1-\alpha)$. This means that the interval from $t_{n,1}(X_1, X_2, \ldots, X_n)$ to $t_{n,2}(X_1, X_2, \ldots, X_n)$ is the confidence interval (C.I.), which has the parameter $\theta$ inside it with probability $(1-\alpha)$.

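Confidence Interval for $\mu$ when $\sigma$ is known

We already know that $\dfrac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$, hence the C.I. is given by

$$P\left[Z_{1-\alpha/2} \le \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \le Z_{\alpha/2}\right] = (1-\alpha)$$

$$P\left[-Z_{\alpha/2} \le \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \le Z_{\alpha/2}\right] = (1-\alpha)$$

where $Z_{\alpha/2}$ denotes the point of $N(0,1)$ with upper-tail probability $\alpha/2$, so that $Z_{1-\alpha/2} = -Z_{\alpha/2}$.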



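$$P\left[\bar{X}_n - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X}_n + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right] = (1-\alpha).$$

We see that the interval between $\bar{X}_n - Z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} = t_{n,1}(X_1, X_2, \ldots, X_n)$ and $\bar{X}_n + Z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} = t_{n,2}(X_1, X_2, \ldots, X_n)$ is random, but the probability that this interval will contain the population parameter, in this case the expected value $\mu$, is $(1-\alpha)$. Remember that, with equal tail areas, the distance between $t_{n,1}$ and $t_{n,2}$ is the shortest one for which the interval contains the parameter with probability $(1-\alpha)$.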
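The interval above is easy to compute numerically. The sketch below is an illustration under stated assumptions rather than part of the original notes: the function name `z_interval`, the sample values and the value of $\sigma$ are hypothetical. It uses only the Python standard library, where `NormalDist().inv_cdf(1 - alpha / 2)` plays the role of $Z_{\alpha/2}$.

```python
from math import sqrt
from statistics import NormalDist, fmean


def z_interval(sample, sigma, alpha=0.05):
    """Two-sided (1 - alpha) C.I. for the mean when sigma is known."""
    n = len(sample)
    xbar = fmean(sample)
    # Upper alpha/2 point of N(0, 1), written Z_{alpha/2} in the text.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


# Hypothetical measurements, with sigma assumed known to be 0.3.
data = [9.8, 10.1, 10.4, 9.9, 10.3, 10.0, 9.7, 10.2, 10.1]
lo, hi = z_interval(data, sigma=0.3)
print(lo, hi)
```

The interval is centred at the sample mean, and its half-width $Z_{\alpha/2}\,\sigma/\sqrt{n}$ does not depend on the data, only on $\sigma$, $n$ and $\alpha$.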


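Confidence Interval for $\mu$ when $\sigma$ is unknown

We have the following facts:

1. $s_n^2 = \dfrac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X}_n)^2$

2. $\bar{X}_n$ and $s_n^2$ are independent

3. $\dfrac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$

4. $(n-1)\dfrac{s_n^2}{\sigma^2} \sim \chi^2_{n-1}$, such that $\dfrac{\sqrt{n}\,(\bar{X}_n - \mu)/\sigma}{\sqrt{\dfrac{(n-1)\,s_n^2}{\sigma^2}\Big/(n-1)}} \sim t_{n-1}$, i.e., $\dfrac{\sqrt{n}\,(\bar{X}_n - \mu)}{s_n} \sim t_{n-1}$

Hence the C.I. is given as

$$P\left[t_{n-1,1-\alpha/2} \le \frac{\sqrt{n}\,(\bar{X}_n - \mu)}{s_n} \le t_{n-1,\alpha/2}\right] = (1-\alpha)$$

$$P\left[-t_{n-1,\alpha/2} \le \frac{\sqrt{n}\,(\bar{X}_n - \mu)}{s_n} \le t_{n-1,\alpha/2}\right] = (1-\alpha)$$

$$P\left[\bar{X}_n - \frac{s_n}{\sqrt{n}}\,t_{n-1,\alpha/2} \le \mu \le \bar{X}_n + \frac{s_n}{\sqrt{n}}\,t_{n-1,\alpha/2}\right] = (1-\alpha)$$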
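As a quick numerical illustration of the last expression, take the hypothetical values $n = 16$, $\bar{x}_n = 10$, $s_n = 2$ and $(1-\alpha) = 0.95$ (these numbers are assumptions, not taken from the notes). With the tabulated value $t_{15,0.025} \approx 2.131$, the realized interval works out as follows.

```latex
\begin{aligned}
\bar{x}_n \pm t_{n-1,\alpha/2}\,\frac{s_n}{\sqrt{n}}
  &= 10 \pm 2.131 \times \frac{2}{\sqrt{16}} \\
  &= 10 \pm 1.0655 \\
  &\Rightarrow\; 8.9345 \le \mu \le 11.0655 .
\end{aligned}
```

Had $\sigma = 2$ been known, the multiplier would have been $Z_{0.025} \approx 1.960$ instead of $2.131$, giving a narrower interval.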


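Confidence Interval for $\sigma^2$ when $\mu$ is known

We have the following facts:

1. ${s'_n}^2 = \dfrac{1}{n}\sum_{i=1}^{n}(X_i - \mu)^2$

2. $\dfrac{n\,{s'_n}^2}{\sigma^2} = \sum_{i=1}^{n}\left(\dfrac{X_i - \mu}{\sigma}\right)^2 \sim \chi^2_n$

Hence we would have $P\left[\chi^2_{n,1-\alpha_1} \le \dfrac{n\,{s'_n}^2}{\sigma^2} \le \chi^2_{n,\alpha_2}\right] = (1-\alpha)$, i.e.,

$$P\left[\frac{n\,{s'_n}^2}{\chi^2_{n,\alpha_2}} \le \sigma^2 \le \frac{n\,{s'_n}^2}{\chi^2_{n,1-\alpha_1}}\right] = (1-\alpha)$$

is the C.I. for this example, where the second subscript denotes an upper-tail probability, so that $(1-\alpha) + \alpha_1 + \alpha_2 = 1$ is true. The usual choice is the equal-tailed one, $\alpha_1 = \alpha_2 = \alpha/2$ (since the chi-square distribution is not symmetric, this is a convenient convention rather than exactly the shortest interval), such that the C.I. is now

$$P\left[\frac{n\,{s'_n}^2}{\chi^2_{n,\alpha/2}} \le \sigma^2 \le \frac{n\,{s'_n}^2}{\chi^2_{n,1-\alpha/2}}\right] = (1-\alpha).$$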


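Confidence Interval for $\sigma^2$ when $\mu$ is unknown

We have the following facts:

1. $s_n^2 = \dfrac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X}_n)^2$, hence

2. $(n-1)\dfrac{s_n^2}{\sigma^2} = \sum_{i=1}^{n}\left(\dfrac{X_i - \bar{X}_n}{\sigma}\right)^2 \sim \chi^2_{n-1}$

Hence we would have $P\left[\chi^2_{n-1,1-\alpha_1} \le (n-1)\dfrac{s_n^2}{\sigma^2} \le \chi^2_{n-1,\alpha_2}\right] = (1-\alpha)$, i.e.,

$$P\left[\frac{(n-1)\,s_n^2}{\chi^2_{n-1,\alpha_2}} \le \sigma^2 \le \frac{(n-1)\,s_n^2}{\chi^2_{n-1,1-\alpha_1}}\right] = (1-\alpha)$$

is the C.I. for this example, where the second subscript denotes an upper-tail probability, so that $(1-\alpha) + \alpha_1 + \alpha_2 = 1$ is true. The usual choice is the equal-tailed one, $\alpha_1 = \alpha_2 = \alpha/2$ (since the chi-square distribution is not symmetric, this is a convenient convention rather than exactly the shortest interval), such that

$$P\left[\frac{(n-1)\,s_n^2}{\chi^2_{n-1,\alpha/2}} \le \sigma^2 \le \frac{(n-1)\,s_n^2}{\chi^2_{n-1,1-\alpha/2}}\right] = (1-\alpha)$$

holds.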
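To see the asymmetry of this interval in numbers, consider the hypothetical values $n = 10$, $s_n^2 = 4$ and $(1-\alpha) = 0.95$ (assumed here for illustration only). The tabulated upper-tail points are $\chi^2_{9,0.025} \approx 19.023$ and $\chi^2_{9,0.975} \approx 2.700$.

```latex
\begin{aligned}
\frac{(n-1)\,s_n^2}{\chi^2_{n-1,\alpha/2}} &= \frac{9 \times 4}{19.023} \approx 1.892, \\
\frac{(n-1)\,s_n^2}{\chi^2_{n-1,1-\alpha/2}} &= \frac{9 \times 4}{2.700} \approx 13.333, \\
&\Rightarrow\; 1.892 \le \sigma^2 \le 13.333 .
\end{aligned}
```

Note that the point estimate $s_n^2 = 4$ is much closer to the lower limit than to the upper one, unlike the symmetric intervals for the mean.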


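Confidence Interval for $(\mu_1 - \mu_2)$ when both $\sigma_1$ and $\sigma_2$ are known

Consider two populations, $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$, from which we take $m$ and $n$ observations respectively, such that

$$\frac{\bar{X}_m - \mu_1}{\sigma_1/\sqrt{m}} \sim N(0,1) \quad\text{and}\quad \frac{\bar{Y}_n - \mu_2}{\sigma_2/\sqrt{n}} \sim N(0,1)$$

are true, and one wants to find the C.I. for $(\mu_1 - \mu_2)$. Then

$$\frac{(\bar{X}_m - \bar{Y}_n) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}} \sim N(0,1),$$

so that the C.I. can finally be written as

$$P\left[(\bar{X}_m - \bar{Y}_n) - Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}} \le (\mu_1 - \mu_2) \le (\bar{X}_m - \bar{Y}_n) + Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}\right] = (1-\alpha).$$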
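The two-sample interval with known variances can be sketched in the same way as the one-sample case. The function name `diff_means_z_interval` and all numerical inputs below are hypothetical and serve only as an illustration; only the Python standard library is used.

```python
from math import sqrt
from statistics import NormalDist


def diff_means_z_interval(xbar, ybar, var1, var2, m, n, alpha=0.05):
    """(1 - alpha) C.I. for mu_1 - mu_2 when both variances are known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # Z_{alpha/2}
    se = sqrt(var1 / m + var2 / n)            # sd of (Xbar_m - Ybar_n)
    diff = xbar - ybar
    return diff - z * se, diff + z * se


# Hypothetical summary statistics for the two samples.
lo, hi = diff_means_z_interval(xbar=5.2, ybar=4.6, var1=1.0, var2=1.5, m=25, n=30)
print(lo, hi)
```

In this made-up case the interval lies entirely above zero, which would suggest $\mu_1 > \mu_2$ at the 95% confidence level.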


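Confidence Interval for $(\mu_1 - \mu_2)$ when both $\sigma_1$ and $\sigma_2$ are unknown but equal

As in the previous example, again consider two populations, $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$, from which we take $m$ and $n$ observations respectively, such that

$$\frac{\bar{X}_m - \mu_1}{s_{m,1}/\sqrt{m}} \sim t_{m-1} \quad\text{and}\quad \frac{\bar{Y}_n - \mu_2}{s_{n,2}/\sqrt{n}} \sim t_{n-1},$$

where $s_{m,1}^2$ and $s_{n,2}^2$ are the sample variances of the first and second sample. Hence the C.I. can finally be written as

$$P\left[(\bar{X}_m - \bar{Y}_n) - t_{m+n-2,\alpha/2}\,s_p\sqrt{\frac{1}{m} + \frac{1}{n}} \le (\mu_1 - \mu_2) \le (\bar{X}_m - \bar{Y}_n) + t_{m+n-2,\alpha/2}\,s_p\sqrt{\frac{1}{m} + \frac{1}{n}}\right] = (1-\alpha)$$

where

$$s_p^2 = \frac{(m-1)\,s_{m,1}^2 + (n-1)\,s_{n,2}^2}{m+n-2}$$

is the pooled sample variance.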


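Confidence Interval for $(\mu_1 - \mu_2)$ when both $\sigma_1$ and $\sigma_2$ are unknown but unequal

As in the previous example, again consider two populations, $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$, from which we take $m$ and $n$ observations respectively, such that

$$\frac{\bar{X}_m - \mu_1}{s_{m,1}/\sqrt{m}} \sim t_{m-1} \quad\text{and}\quad \frac{\bar{Y}_n - \mu_2}{s_{n,2}/\sqrt{n}} \sim t_{n-1},$$

where $s_{m,1}^2$ and $s_{n,2}^2$ are the sample variances of the first and second sample. Hence the C.I. can finally be written as

$$P\left[(\bar{X}_m - \bar{Y}_n) - t_{m+n-2,\alpha/2}\sqrt{\frac{s_{m,1}^2}{m} + \frac{s_{n,2}^2}{n}} \le (\mu_1 - \mu_2) \le (\bar{X}_m - \bar{Y}_n) + t_{m+n-2,\alpha/2}\sqrt{\frac{s_{m,1}^2}{m} + \frac{s_{n,2}^2}{n}}\right] = (1-\alpha).$$

Note that with unequal variances this statement holds only approximately, since the studentized difference does not follow an exact $t_{m+n-2}$ distribution; in practice an adjusted (Welch-Satterthwaite) number of degrees of freedom is commonly used in place of $m+n-2$.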


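Confidence Interval for $\dfrac{\sigma_1^2}{\sigma_2^2}$ when both $\mu_1$ and $\mu_2$ are known

Consider two populations, $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$, from which we take $m$ and $n$ observations respectively, such that

$$m\,\frac{{s'_{m,1}}^2}{\sigma_1^2} = \sum_{i=1}^{m}\left(\frac{X_i - \mu_1}{\sigma_1}\right)^2 \sim \chi^2_m \quad\text{and}\quad n\,\frac{{s'_{n,2}}^2}{\sigma_2^2} = \sum_{j=1}^{n}\left(\frac{Y_j - \mu_2}{\sigma_2}\right)^2 \sim \chi^2_n.$$

Now formulate the interval accordingly: for independent samples the ratio $\dfrac{\sigma_2^2\,{s'_{m,1}}^2}{\sigma_1^2\,{s'_{n,2}}^2}$ follows $F_{m,n}$, so with upper-tail points of $F_{m,n}$ one obtains

$$P\left[\frac{{s'_{m,1}}^2/{s'_{n,2}}^2}{F_{m,n,\alpha/2}} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{{s'_{m,1}}^2/{s'_{n,2}}^2}{F_{m,n,1-\alpha/2}}\right] = (1-\alpha).$$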

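Confidence Interval for $\dfrac{\sigma_1^2}{\sigma_2^2}$ when both $\mu_1$ and $\mu_2$ are unknown

Consider two populations, $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$, from which we take $m$ and $n$ observations respectively, such that

$$(m-1)\frac{s_{m,1}^2}{\sigma_1^2} = \sum_{i=1}^{m}\left(\frac{X_i - \bar{X}_m}{\sigma_1}\right)^2 \sim \chi^2_{m-1} \quad\text{and}\quad (n-1)\frac{s_{n,2}^2}{\sigma_2^2} = \sum_{j=1}^{n}\left(\frac{Y_j - \bar{Y}_n}{\sigma_2}\right)^2 \sim \chi^2_{n-1}.$$

Hence formulate the interval accordingly: for independent samples the ratio $\dfrac{\sigma_2^2\,s_{m,1}^2}{\sigma_1^2\,s_{n,2}^2}$ follows $F_{m-1,n-1}$, so with upper-tail points of $F_{m-1,n-1}$ one obtains

$$P\left[\frac{s_{m,1}^2/s_{n,2}^2}{F_{m-1,n-1,\alpha/2}} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{s_{m,1}^2/s_{n,2}^2}{F_{m-1,n-1,1-\alpha/2}}\right] = (1-\alpha).$$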


STATISTICAL INFERENCE: HYPOTHESIS TESTING

Before considering the aspects of hypothesis testing and the statistical rules that follow from how we formulate our hypotheses, we would like to initiate the discussion of this topic through a few simple examples.

Example 1

A manufacturer of a particular type of electrical motor has come up with a motor of better hp rating than those of its existing competitors and wants to market it. As is the norm for any manufactured product, a certain warranty life is to be specified by the manufacturer, and the company under our consideration specifies a warranty life of 1 year instead of the 8 months usually given for products of such rating. Now you, as an engineer, are quite skeptical on hearing that the warranty time is 1 year and want to test the validity of this statement which the manufacturer is making.

Example 2

The food and beverage company which manufactures jellies and jams sells them in bottles of 100 gms, 250 gms and kg sizes, and you are the marketing general manager of that firm. In order to meet the growing market demand for these products your company has installed a new high-productivity automatic jelly/jam filling machine, but there have been complaints afterwards that the weights of the 100 gms jam bottles are, on average, never exactly 100 gms, as they have been found to be either more or less than that. So in order to answer these complaints and monitor the productivity of the new machine, the company has entrusted you with the responsibility of solving the problem. Hence you would like to test whether the average weight of the bottles coming out is about 100 gms (with some error) or whether there is a significant difference in the weights of the bottles, which may be a major


concern for the company and hence may necessitate the implementation of some corrective

action in order to first identify and then rectify the problem.

Example 3

The traffic flow on the main market road in the city of Amritsar is highest each day in the morning from 1000 to 1200 hours, and you, as the commissioner of police of Amritsar, want to see whether it is really necessary to close the road to motor cars and commercial vehicles for that part of the day in order to avoid accidents, which have been reported at a rate of 3 per week with a certain distribution.

Example 4

In the city of Guwahati a new internet service provider (ISP) has just started operating. It provides high speed internet services and claims that the speed of its services is 500 Mbps. You, a resident of that city, are interested in getting a new internet connection but would like to verify this claim first, and then decide whether to take the connection from this new ISP or continue with the old ISP.

In all four examples discussed above there are some statements or hypotheses; to be precise, they are statistical statements about one or more unknown characteristics of the population which need to be verified. When we make a claim through a particular hypothesis, a complementary hypothesis crops up at the same time. The original hypothesis is termed the null hypothesis (H0), while its competing (opposite) hypothesis is the alternative hypothesis (HA). So the problem in hypothesis testing, very simply stated, is to test the validity of H0 against HA.


For the general implementation of hypothesis testing, the following steps are generally followed:

1. As is the norm for sampling, we first consider a sample of size $n$, such that the set of observations $(X_1, X_2, \ldots, X_n)$ is taken from the population.

2. Then one sets up a decision rule, based on which one tests and chooses one of the decisions, i.e., one of the hypotheses. One should also remember that the decision rule is a technique for making a decision, and the decision is generally described by a statistic. Hence we can have either of two outcomes: (i) H0: Accepted and HA: Rejected, or (ii) H0: Not Accepted and HA: Accepted.

Now whenever we do hypothesis testing we face the following situation:

                                NATURE
                      H0: True                  HA: True
ACTION   H0: Reject   Type I error ($\alpha$)   Correct decision
         HA: Reject   Correct decision          Type II error ($\beta$)

Example

Consider that we have $X \sim N(\mu, \sigma^2)$, with $\mu$ being unknown while $\sigma^2$ is known, i.e., suppose we have $X \sim N(\mu, 0.25)$. Also assume that we have the two competing hypotheses H0: $\mu = 2$ vs HA: $\mu = 1$, and we are to test the validity of the hypothesis statement. Now if we take a sample of size 10, we have $\hat{\mu} = \bar{X}_{10} = \frac{1}{10}(X_1 + X_2 + \cdots + X_{10})$, since we know that the sample mean is the UMVUE of the population mean, i.e., of its expected value. Then it is also true that $\bar{X}_{10} \sim N\left(\mu, \frac{0.25}{10}\right)$, and


suppose that after choosing a sample of size 10, we get $\hat{\mu} = \bar{X}_{10} = 1.5$; then in that case we will reject H0. But remember that the value $\hat{\mu} = \bar{X}_{10} = 1.5$ is just one random outcome, as different samples will give us different values of $\hat{\mu} = \bar{X}_{10}$. So in the case when we obtain $\hat{\mu} = \bar{X}_{10} = 2$, we accept H0.

[Figure: densities under HA, $X \sim N(1, 0.25)$, and under H0, $X \sim N(2, 0.25)$, with the cut-off point marked at 1.5.]

The hypotheses could instead have been framed as H0: $\mu > 2$ vs HA: $\mu \le 2$; in that case the decision to accept or not accept H0 will depend on the value of the sample statistic, namely the sample mean we obtain after randomly selecting a sample of size $n$.

[Figure: densities under HA, $X \sim N(1.6, 0.25)$, and under H0, $X \sim N(2.7, 0.25)$, with the cut-off point marked at 1.95.]

Reducing $\alpha$ by moving the cut-off point will increase the value of $\beta$, while trying to reduce $\beta$ by moving the cut-off point will increase the value of $\alpha$, and this is very clearly demonstrated in the two figures above. Hence simultaneously minimizing $(\alpha + \beta)$ for a chosen value of the cut-off point will give us the result.


Thus we first select $\alpha$, which is always set at a very low value, and then try to maximize $(1 - \beta)$, which is the same thing as minimizing $\beta$. This is also known as trying to maximize the power of the test, i.e., the quantity $(1 - \beta)$, and this would ensure minimization of variation. Remember that this type of test is the most powerful decision rule for the level of significance, which is already fixed by the value of $\alpha$.
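The trade-off between the two error probabilities can be made concrete for the example H0: $\mu = 2$ vs HA: $\mu = 1$ with $\sigma^2 = 0.25$ and $n = 10$. The sketch below assumes the decision rule "reject H0 when $\bar{X}_{10}$ falls below a cut-off"; the function name `error_probabilities` and the second cut-off value 1.6 are hypothetical choices made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def error_probabilities(cutoff, mu0=2.0, mu_a=1.0, var=0.25, n=10):
    """Return (alpha, beta) for the rule: reject H0 if the sample mean < cutoff."""
    sd = sqrt(var / n)                              # sd of the sample mean
    alpha = NormalDist(mu0, sd).cdf(cutoff)         # P(reject H0 | H0 true)
    beta = 1 - NormalDist(mu_a, sd).cdf(cutoff)     # P(accept H0 | HA true)
    return alpha, beta


alpha_mid, beta_mid = error_probabilities(1.5)     # cut-off from the figure
alpha_high, beta_high = error_probabilities(1.6)   # cut-off moved towards H0
print(alpha_mid, beta_mid)
print(alpha_high, beta_high)
```

With the cut-off at 1.5, midway between the two hypothesized means, the two errors are equal by symmetry. Moving the cut-off towards 2 lowers $\beta$ but raises $\alpha$, which is the trade-off described above.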


Example

Suppose we draw $n$ random observations from the normal distribution $X \sim N(\mu, \sigma^2)$, where $x \in \mathbb{R}$, $\mu \in \mathbb{R}$ and $\sigma^2 \in \mathbb{R}^+$. We have a set of random observations $(X_1, X_2, \ldots, X_n)$. Also assume that $\mu$ is unknown (one has to hypothesize on this fact and draw meaningful conclusions about $\mu$, based on the sample chosen) while $\sigma^2$ is known, and we are given the following hypotheses to test:

H0: $\mu = \mu_0$ vs HA: $\mu = \mu_A$ ($\mu_A < \mu_0$)

[Figure: standard normal density with the rejection region of area $\alpha$ to the left of $-z_\alpha$.]

So the rule is: reject H0 if $\bar{X}_n < \mu_0 - z_\alpha\dfrac{\sigma}{\sqrt{n}}$ holds.


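H0: $\mu = \mu_0$ vs HA: $\mu = \mu_A$ ($\mu_A > \mu_0$)

[Figure: standard normal density with the rejection region of area $\alpha$ to the right of $z_\alpha$.]

So the rule is: reject H0 if $\bar{X}_n > \mu_0 + z_\alpha\dfrac{\sigma}{\sqrt{n}}$ holds.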


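H0: $\mu = \mu_0$ vs HA: $\mu = \mu_A$ ($\mu_A \ne \mu_0$)

[Figure: standard normal density with rejection regions of area $\alpha/2$ each, to the left of $-z_{\alpha/2}$ and to the right of $z_{\alpha/2}$.]

So the rule is: reject H0 if $|\bar{X}_n - \mu_0| > z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$ holds, i.e., if $\bar{X}_n < \mu_0 - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$ or $\bar{X}_n > \mu_0 + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$ holds.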
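The three rejection rules for the mean with known $\sigma$ can be collected into one small function. This is a minimal sketch under stated assumptions: the function name `z_test_reject`, the `alternative` labels and the numbers in the usage lines are all hypothetical, and only the Python standard library is used.

```python
from math import sqrt
from statistics import NormalDist


def z_test_reject(xbar, mu0, sigma, n, alpha=0.05, alternative="two-sided"):
    """Return True if H0: mu = mu0 is rejected at level alpha (sigma known)."""
    se = sigma / sqrt(n)
    std_normal = NormalDist()
    if alternative == "less":        # HA: mu_A < mu_0
        return xbar < mu0 - std_normal.inv_cdf(1 - alpha) * se
    if alternative == "greater":     # HA: mu_A > mu_0
        return xbar > mu0 + std_normal.inv_cdf(1 - alpha) * se
    if alternative == "two-sided":   # HA: mu_A != mu_0
        return abs(xbar - mu0) > std_normal.inv_cdf(1 - alpha / 2) * se
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")


# Hypothetical usage with mu0 = 2, sigma = 0.5 and n = 10.
reject_low = z_test_reject(1.5, 2.0, 0.5, 10, alternative="less")
reject_high = z_test_reject(1.5, 2.0, 0.5, 10, alternative="greater")
reject_two = z_test_reject(1.5, 2.0, 0.5, 10, alternative="two-sided")
reject_near = z_test_reject(1.9, 2.0, 0.5, 10, alternative="less")
print(reject_low, reject_high, reject_two, reject_near)
```

The same structure carries over to the case of unknown $\sigma$ by replacing $\sigma$ with $s_n$ and the normal quantile with the corresponding $t_{n-1}$ quantile, which the standard library does not provide.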


Example

Suppose we draw $n$ random observations from the normal distribution $X \sim N(\mu, \sigma^2)$, where $x \in \mathbb{R}$, $\mu \in \mathbb{R}$ and $\sigma^2 \in \mathbb{R}^+$. We have a set of random observations $(X_1, X_2, \ldots, X_n)$. Also assume that $\mu$ is unknown (one has to hypothesize on this fact and draw meaningful conclusions about $\mu$, based on the sample chosen) while $\sigma^2$ is also unknown, and we are given the following hypotheses to test:

H0: $\mu = \mu_0$ vs HA: $\mu = \mu_A$ ($\mu_A < \mu_0$)

So the rule is: reject H0 if $\bar{X}_n < \mu_0 - t_{n-1,\alpha}\dfrac{s_n}{\sqrt{n}}$ holds.

H0: $\mu = \mu_0$ vs HA: $\mu = \mu_A$ ($\mu_A > \mu_0$)

So the rule is: reject H0 if $\bar{X}_n > \mu_0 + t_{n-1,\alpha}\dfrac{s_n}{\sqrt{n}}$ holds.

H0: $\mu = \mu_0$ vs HA: $\mu = \mu_A$ ($\mu_A \ne \mu_0$)

So the rule is: reject H0 if $|\bar{X}_n - \mu_0| > t_{n-1,\alpha/2}\dfrac{s_n}{\sqrt{n}}$ holds, i.e., if $\bar{X}_n < \mu_0 - t_{n-1,\alpha/2}\dfrac{s_n}{\sqrt{n}}$ or $\bar{X}_n > \mu_0 + t_{n-1,\alpha/2}\dfrac{s_n}{\sqrt{n}}$ holds.


Example

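Suppose we draw $n$ random observations from the normal distribution $X \sim N(\mu, \sigma^2)$, where $x \in \mathbb{R}$, $\mu \in \mathbb{R}$ and $\sigma^2 \in \mathbb{R}^+$. We have a set of random observations $(X_1, X_2, \ldots, X_n)$. Also assume that $\mu$ is known while $\sigma^2$ is unknown (one has to hypothesize on this fact and draw meaningful conclusions about $\sigma^2$, based on the sample chosen). With $s_n^{*2} = \frac{1}{n}\sum_{i=1}^{n}(X_i - \mu)^2$, we are given the following hypotheses to test:

H0: $\sigma = \sigma_0$ vs HA: $\sigma = \sigma_A$ ($\sigma_A < \sigma_0$), i.e.,

H0: $\sigma^2 - \sigma_0^2 = 0$ vs HA: $\sigma_A^2 - \sigma_0^2 < 0$, i.e.,

H0: $\dfrac{\sigma^2}{\sigma_0^2} = 1$ vs HA: $\dfrac{\sigma_A^2}{\sigma_0^2} < 1$

So the rule is: reject H0 if $s_n^{*2} < \dfrac{\sigma_0^2\,\chi^2_{n,1-\alpha}}{n}$ holds.

H0: $\sigma = \sigma_0$ vs HA: $\sigma = \sigma_A$ ($\sigma_A > \sigma_0$), i.e.,

H0: $\sigma^2 - \sigma_0^2 = 0$ vs HA: $\sigma_A^2 - \sigma_0^2 > 0$, i.e.,

H0: $\dfrac{\sigma^2}{\sigma_0^2} = 1$ vs HA: $\dfrac{\sigma_A^2}{\sigma_0^2} > 1$

So the rule is: reject H0 if $s_n^{*2} > \dfrac{\sigma_0^2\,\chi^2_{n,\alpha}}{n}$ holds.

H0: $\sigma = \sigma_0$ vs HA: $\sigma = \sigma_A$ ($\sigma_A \ne \sigma_0$), i.e.,

H0: $\sigma^2 - \sigma_0^2 = 0$ vs HA: $\sigma_A^2 - \sigma_0^2 \ne 0$, i.e.,

H0: $\dfrac{\sigma^2}{\sigma_0^2} = 1$ vs HA: $\dfrac{\sigma_A^2}{\sigma_0^2} < 1$ or $\dfrac{\sigma_A^2}{\sigma_0^2} > 1$

So the rule is: reject H0 if $s_n^{*2} < \dfrac{\sigma_0^2\,\chi^2_{n,1-\alpha/2}}{n}$ or $s_n^{*2} > \dfrac{\sigma_0^2\,\chi^2_{n,\alpha/2}}{n}$ holds.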


Example

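Suppose we draw $n$ random observations from the normal distribution $X \sim N(\mu, \sigma^2)$, where $x \in \mathbb{R}$, $\mu \in \mathbb{R}$ and $\sigma^2 \in \mathbb{R}^+$. We have a set of random observations $(X_1, X_2, \ldots, X_n)$. Also assume that $\mu$ is unknown and $\sigma^2$ is also unknown (one has to hypothesize on this fact and draw meaningful conclusions about $\sigma^2$, based on the sample chosen). With $s_n^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X}_n)^2$, we are given the following hypotheses to test:

H0: $\sigma = \sigma_0$ vs HA: $\sigma = \sigma_A$ ($\sigma_A < \sigma_0$), i.e.,

H0: $\sigma^2 - \sigma_0^2 = 0$ vs HA: $\sigma_A^2 - \sigma_0^2 < 0$, i.e.,

H0: $\dfrac{\sigma^2}{\sigma_0^2} = 1$ vs HA: $\dfrac{\sigma_A^2}{\sigma_0^2} < 1$

So the rule is: reject H0 if $s_n^2 < \dfrac{\sigma_0^2\,\chi^2_{n-1,1-\alpha}}{(n-1)}$ holds.

H0: $\sigma = \sigma_0$ vs HA: $\sigma = \sigma_A$ ($\sigma_A > \sigma_0$), i.e.,

H0: $\sigma^2 - \sigma_0^2 = 0$ vs HA: $\sigma_A^2 - \sigma_0^2 > 0$, i.e.,

H0: $\dfrac{\sigma^2}{\sigma_0^2} = 1$ vs HA: $\dfrac{\sigma_A^2}{\sigma_0^2} > 1$

So the rule is: reject H0 if $s_n^2 > \dfrac{\sigma_0^2\,\chi^2_{n-1,\alpha}}{(n-1)}$ holds.

H0: $\sigma = \sigma_0$ vs HA: $\sigma = \sigma_A$ ($\sigma_A \ne \sigma_0$), i.e.,

H0: $\sigma^2 - \sigma_0^2 = 0$ vs HA: $\sigma_A^2 - \sigma_0^2 \ne 0$, i.e.,

H0: $\dfrac{\sigma^2}{\sigma_0^2} = 1$ vs HA: $\dfrac{\sigma_A^2}{\sigma_0^2} < 1$ or $\dfrac{\sigma_A^2}{\sigma_0^2} > 1$

So the rule is: reject H0 if $s_n^2 < \dfrac{\sigma_0^2\,\chi^2_{n-1,1-\alpha/2}}{(n-1)}$ or $s_n^2 > \dfrac{\sigma_0^2\,\chi^2_{n-1,\alpha/2}}{(n-1)}$ holds.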


Example

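Suppose we draw $m$ and $n$ random observations from two normal distributions given by $X \sim N(\mu_X, \sigma_X^2)$, where $x \in \mathbb{R}$, $\mu_X \in \mathbb{R}$, $\sigma_X^2 \in \mathbb{R}^+$ hold, and $Y \sim N(\mu_Y, \sigma_Y^2)$, where $y \in \mathbb{R}$, $\mu_Y \in \mathbb{R}$, $\sigma_Y^2 \in \mathbb{R}^+$. We have sets of random observations $(X_1, X_2, \ldots, X_m)$ and $(Y_1, Y_2, \ldots, Y_n)$. Also assume that $\mu_X$ and $\mu_Y$ are unknown (one has to hypothesize on this fact and draw meaningful conclusions about $\mu_X$ and $\mu_Y$, based on the samples chosen), while $\sigma_X^2$ and $\sigma_Y^2$ are known, and we are given the following hypotheses to test. In each rule below, $(\mu_X - \mu_Y)$ takes its value under H0, namely 0.

H0: $\mu_X - \mu_Y = 0$ vs HA: $\mu_X - \mu_Y < 0$

So the rule is: reject H0 if $(\bar{X}_m - \bar{Y}_n) < (\mu_X - \mu_Y) - z_\alpha\sqrt{\dfrac{\sigma_X^2}{m} + \dfrac{\sigma_Y^2}{n}}$ holds.

H0: $\mu_X - \mu_Y = 0$ vs HA: $\mu_X - \mu_Y > 0$

So the rule is: reject H0 if $(\bar{X}_m - \bar{Y}_n) > (\mu_X - \mu_Y) + z_\alpha\sqrt{\dfrac{\sigma_X^2}{m} + \dfrac{\sigma_Y^2}{n}}$ holds.

H0: $\mu_X - \mu_Y = 0$ vs HA: $\mu_X - \mu_Y \ne 0$

So the rule is: reject H0 if $(\bar{X}_m - \bar{Y}_n) < (\mu_X - \mu_Y) - z_{\alpha/2}\sqrt{\dfrac{\sigma_X^2}{m} + \dfrac{\sigma_Y^2}{n}}$ or $(\bar{X}_m - \bar{Y}_n) > (\mu_X - \mu_Y) + z_{\alpha/2}\sqrt{\dfrac{\sigma_X^2}{m} + \dfrac{\sigma_Y^2}{n}}$ holds.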


Example

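Suppose we draw $m$ and $n$ random observations from two normal distributions given by $X \sim N(\mu_X, \sigma_X^2)$, where $x \in \mathbb{R}$, $\mu_X \in \mathbb{R}$, $\sigma_X^2 \in \mathbb{R}^+$ hold, and $Y \sim N(\mu_Y, \sigma_Y^2)$, where $y \in \mathbb{R}$, $\mu_Y \in \mathbb{R}$, $\sigma_Y^2 \in \mathbb{R}^+$. We have sets of random observations $(X_1, X_2, \ldots, X_m)$ and $(Y_1, Y_2, \ldots, Y_n)$. Also assume that $\mu_X$ and $\mu_Y$ are known while $\sigma_X^2$ and $\sigma_Y^2$ are unknown (one has to hypothesize on this fact and draw meaningful conclusions about $\sigma_X^2$ and $\sigma_Y^2$, based on the samples chosen). With $s_m^{*2} = \frac{1}{m}\sum_{i=1}^{m}(X_i - \mu_X)^2$ and $s_n^{*2} = \frac{1}{n}\sum_{j=1}^{n}(Y_j - \mu_Y)^2$, we are given the following hypotheses to test:

H0: $\sigma_X = \sigma_Y$ vs HA: $\sigma_X < \sigma_Y$, i.e.,

H0: $\sigma_X^2 - \sigma_Y^2 = 0$ vs HA: $\sigma_X^2 - \sigma_Y^2 < 0$

So the rule is: reject H0 if $\dfrac{s_m^{*2}}{s_n^{*2}} < F_{m,n,1-\alpha}$ holds.

H0: $\sigma_X = \sigma_Y$ vs HA: $\sigma_X > \sigma_Y$, i.e.,

H0: $\sigma_X^2 - \sigma_Y^2 = 0$ vs HA: $\sigma_X^2 - \sigma_Y^2 > 0$

So the rule is: reject H0 if $\dfrac{s_m^{*2}}{s_n^{*2}} > F_{m,n,\alpha}$ holds.

H0: $\sigma_X = \sigma_Y$ vs HA: $\sigma_X \ne \sigma_Y$, i.e.,

H0: $\sigma_X^2 - \sigma_Y^2 = 0$ vs HA: $\sigma_X^2 - \sigma_Y^2 \ne 0$

So the rule is: reject H0 if $\dfrac{s_m^{*2}}{s_n^{*2}} < F_{m,n,1-\alpha/2}$ or $\dfrac{s_m^{*2}}{s_n^{*2}} > F_{m,n,\alpha/2}$ holds.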


Example

Suppose we draw $m$ and $n$ random observations from two normal distributions given by $X \sim N(\mu_X, \sigma_X^2)$, where $x \in \mathbb{R}$, $\mu_X \in \mathbb{R}$, $\sigma_X^2 \in \mathbb{R}^+$ hold, and $Y \sim N(\mu_Y, \sigma_Y^2)$, where $y \in \mathbb{R}$, $\mu_Y \in \mathbb{R}$, $\sigma_Y^2 \in \mathbb{R}^+$. We have sets of random observations $(X_1, X_2, \ldots, X_m)$ and $(Y_1, Y_2, \ldots, Y_n)$. Also assume that $\mu_X$ and $\mu_Y$ are unknown and $\sigma_X^2$ and $\sigma_Y^2$ are also unknown (one has to hypothesize on this fact and draw meaningful conclusions about $\sigma_X^2$ and $\sigma_Y^2$, based on the samples chosen), and we are given the following hypotheses to test, which are


Example

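Suppose we draw $m$ and $n$ random observations from two normal distributions given by $X \sim N(\mu_X, \sigma_X^2)$, where $x \in \mathbb{R}$, $\mu_X \in \mathbb{R}$, $\sigma_X^2 \in \mathbb{R}^+$ hold, and $Y \sim N(\mu_Y, \sigma_Y^2)$, where $y \in \mathbb{R}$, $\mu_Y \in \mathbb{R}$, $\sigma_Y^2 \in \mathbb{R}^+$. We have sets of random observations $(X_1, X_2, \ldots, X_m)$ and $(Y_1, Y_2, \ldots, Y_n)$. Also assume that $\mu_X$ and $\mu_Y$ are unknown while $\sigma_X^2$ and $\sigma_Y^2$ are also unknown (but both are equal), and we are given the following hypotheses to test. Here $s_{m,1}^2$ and $s_{n,2}^2$ are the sample variances of the two samples, and $(\mu_X - \mu_Y)$ takes its value under H0, namely 0.

H0: $\mu_X - \mu_Y = 0$ vs HA: $\mu_X - \mu_Y < 0$

So the rule is: reject H0 if $(\bar{X}_m - \bar{Y}_n) < (\mu_X - \mu_Y) - t_{m+n-2,\alpha}\sqrt{\dfrac{s_{m,1}^2}{m} + \dfrac{s_{n,2}^2}{n}}$ holds.

H0: $\mu_X - \mu_Y = 0$ vs HA: $\mu_X - \mu_Y > 0$

So the rule is: reject H0 if $(\bar{X}_m - \bar{Y}_n) > (\mu_X - \mu_Y) + t_{m+n-2,\alpha}\sqrt{\dfrac{s_{m,1}^2}{m} + \dfrac{s_{n,2}^2}{n}}$ holds.

H0: $\mu_X - \mu_Y = 0$ vs HA: $\mu_X - \mu_Y \ne 0$

So the rule is: reject H0 if $(\bar{X}_m - \bar{Y}_n) < (\mu_X - \mu_Y) - t_{m+n-2,\alpha/2}\sqrt{\dfrac{s_{m,1}^2}{m} + \dfrac{s_{n,2}^2}{n}}$ or $(\bar{X}_m - \bar{Y}_n) > (\mu_X - \mu_Y) + t_{m+n-2,\alpha/2}\sqrt{\dfrac{s_{m,1}^2}{m} + \dfrac{s_{n,2}^2}{n}}$ holds.

Note that when the variances are equal, the standard error is usually computed from the pooled variance $s_p^2 = \dfrac{(m-1)\,s_{m,1}^2 + (n-1)\,s_{n,2}^2}{m+n-2}$ as $s_p\sqrt{\dfrac{1}{m} + \dfrac{1}{n}}$; the two expressions coincide when $m = n$.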
