
MPZ 4230 Engineering Mathematics

The Open University of Sri Lanka

06/04/2014

M.P.Dhanushika


Hypothesis Testing


Let's get some help from Sam

Go bowling with Sam.
Sam says, "My long-term average score is 150."
Over 3 games, Sam's average score is 30.
Do you believe his statement, "My long-term average score is 150"?

Is he dodgy, a liar?

Is he dodgy?
Over 3 games, Sam's average score is 30.
You will probably not believe him.
Over 3 games, Sam's average score is 140.
You are more likely to believe him.

At what point between 30 and 140 do you decide whether to believe Sam or not?

Is he dodgy?
What is your cut-off score for Sam?
If Sam averages below 120 over 3 games: do not believe him.
If Sam averages over 120 over 3 games: believe him.

There is a claim: Sam's long-term average score is 150.
Assume the claim is true.
If the sample outcome falls below a cut-off value, REJECT the claim.


Probability Distribution of Bowling Scores

[Figure: probability distribution of bowling scores under the assumption that the claim is true, with the rejection region below the cut-off c = 120.]

Your Mum
Go bowling with Mum.
Mum says, "Honey, you know I'm an excellent bowler, so my long-term average score is 150."
Over 3 games, Mum's average score is 30.

Do you believe Mum's statement, "My long-term average score is 150"?
Or
Do you call Mum a liar?

Your Mum
What if you want to be more sure that the claim is false before rejecting it?
Choose a LOWER VALUE for the cut-off score.

For Sam, the cut-off is 120.
Sam always lies, so calling him a liar is not harsh.

For Mum, the cut-off is 50.
You love your mum and she never tells lies, so calling her a liar is cruel.

Probability Distribution of Bowling Scores

[Figure: probability distribution of bowling scores under the assumption that the claim is true, with the rejection region below the cut-off c = 50.]
The rejection region is much smaller than in Sam's case.
A smaller rejection region means you are less likely to call your Mum a liar, because you trust that she is telling the TRUTH.

Probability Distribution of Bowling Scores

[Figure: probability distribution of bowling scores under the assumption that the claim is true, with the rejection region below the cut-off c = 120.]
The rejection region is larger in Sam's case.
A larger rejection region means you are more likely to say he is a liar, because you know that Sam lies most of the time.
So calling him a liar is not a harsh thing.

The cut-off value for the rejection region is decided by you, the RESEARCHER.

It depends on how sure you want to be when rejecting a claim.

Smaller rejection region: we are more sure when rejecting the claim.
Larger rejection region: we are less sure when rejecting the claim.
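The link between the cut-off and how sure you are when rejecting can be made concrete with a small sketch. It assumes, purely for illustration, that a single game's score has a standard deviation of 40 and that the 3-game average is roughly Normal. Neither assumption comes from the slides; only the claimed mean of 150 and the cut-offs 120 and 50 do.

```python
from math import sqrt
from statistics import NormalDist

CLAIMED_MEAN = 150     # from the slides: claimed long-term average
N_GAMES = 3            # from the slides: sample of 3 games
ASSUMED_GAME_SD = 40   # hypothetical spread of a single game's score


def rejection_probability(cutoff: float) -> float:
    """P(3-game average < cutoff) when the claim (mean 150) is true."""
    standard_error = ASSUMED_GAME_SD / sqrt(N_GAMES)
    sampling_dist = NormalDist(mu=CLAIMED_MEAN, sigma=standard_error)
    return sampling_dist.cdf(cutoff)


alpha_sam = rejection_probability(120)  # cut-off used for Sam
alpha_mum = rejection_probability(50)   # lower cut-off used for Mum

print(f"Sam: {alpha_sam:.4f}  Mum: {alpha_mum:.6f}")
```

Under these assumed numbers, the lower cut-off gives a far smaller chance of rejecting a true claim, which is what "more sure when rejecting" means.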

Null Hypothesis ($H_0$) is the claim.
Example: Sam's long-term average bowling score is 150.

Alternative Hypothesis ($H_1$) is what must be true if $H_0$ is false.
Example: Sam's average bowling score is below 150: he is a liar, a dodgy friend.

Sample Statistic is the observed sample estimate used to test whether the claim is rejected or not rejected.
Example: Sam's average bowling score over 3 games.

Critical Value is the cut-off value that indicates whether the claim is rejected or not rejected.

Decision Rule
Example: if Sam's average bowling score over 3 games is below 120, I will reject his claim.

Significance level ($\alpha$) measures how sure you want to be when rejecting $H_0$.
The smaller the significance level $\alpha$, the more sure you are when rejecting $H_0$.

Estimation
The value of the parameter is unknown.
Collect sample data.
Give an approximate value or range for the parameter.

Testing
First, make a claim about the parameter.
Then test the claim using sample data.

Population: bowling scores for 100 games
Claim: average bowling score over 100 games (long-term average score) = 150
Sample: bowling scores over 3 games
Sample statistic: average score over 3 games

Decision from Sample    In reality: claim is true, H0 is True    In reality: claim is false, H0 is False
                        (long-term avg score = 150)              (long-term avg score < 150)
Accept H0               Correct                                  Wrong: Type II Error
Reject H0               Wrong: Type I Error                      Correct

Pr(Type I Error) = Pr(Reject H0 when H0 is True) = $\alpha$
Significance level ($\alpha$): how sure you want to be when you are rejecting H0

Pr(Type I Error) = Pr(Reject H0 when H0 is True) = $\alpha$
Pr(Type II Error) = Pr(Accept H0 when H0 is False) = $\beta$
Power of the test = Pr(Reject H0 when H0 is False) = $1 - \beta$
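As a rough illustration of these three probabilities, the sketch below evaluates them for the bowling example. The claimed mean (150) and the cut-off (120) come from the slides. The single-game standard deviation of 40, the Normal shape of the 3-game average, and the "true" mean of 110 used when H0 is false are all hypothetical values chosen only to make the calculation possible.

```python
from math import sqrt
from statistics import NormalDist

MU_0 = 150               # claimed mean under H0 (from the slides)
CUTOFF = 120             # critical value used for Sam (from the slides)
N_GAMES = 3              # sample size (from the slides)
ASSUMED_GAME_SD = 40     # hypothetical
ASSUMED_TRUE_MEAN = 110  # hypothetical mean when H0 is false

standard_error = ASSUMED_GAME_SD / sqrt(N_GAMES)

# Type I error: the average falls below the cut-off although H0 is true.
alpha = NormalDist(MU_0, standard_error).cdf(CUTOFF)

# Type II error: the average stays at or above the cut-off although H0 is false.
beta = 1 - NormalDist(ASSUMED_TRUE_MEAN, standard_error).cdf(CUTOFF)

power = 1 - beta

print(f"alpha={alpha:.3f}  beta={beta:.3f}  power={power:.3f}")
```

Note that $\beta$ can only be computed once a specific alternative mean is assumed, which is why it is usually unknown in practice.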

Our decisions are always about the null hypothesis.

$H_0 : \mu = 10$
$H_1 : \mu \neq 10$

If we finally conclude that $\mu \neq 10$, how do we state the decision?
Say "Reject H0", not "Accept H1".

If we finally conclude that $\mu = 10$, how do we state the decision?
Say "Do not reject H0", not "Accept H0".

Our decisions are always about the null hypothesis.

Our decision is to "reject H0":
What is the probability of making a Type I error? Answer: $\alpha$ = 0.05.
Thus a decision to "reject H0" comes with knowledge of the probability of being in error.

Our decision is to "not reject H0":
What is the probability of making a Type II error? Answer: $\beta$ is unknown.
Thus a decision to "not reject H0" comes with uncertainty about the probability of being in error.
Without any idea of the probability of making an error, we cannot say we have proven that H0 is true.
We cannot make a stronger statement, such as "accept H0". That is why we word this result so weakly.

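Two Tailed Test

$H_0 : \mu = \mu_0$
$H_1 : \mu \neq \mu_0$

Example: $H_0 : \mu = 10$, $H_1 : \mu \neq 10$
Test statistic = $\bar{X}$

[Figure: distribution of $\bar{X}$ centred at 10, with two rejection regions, one in each tail.]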

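Right Tailed Test / Upper Tailed Test

$H_0 : \mu = \mu_0$
$H_1 : \mu > \mu_0$

Example: $H_0 : \mu = 10$, $H_1 : \mu > 10$
Test statistic = $\bar{X}$

[Figure: distribution of $\bar{X}$ centred at 10, with the rejection region in the upper tail.]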

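Left Tailed Test / Lower Tailed Test

$H_0 : \mu = \mu_0$
$H_1 : \mu < \mu_0$

Example: $H_0 : \mu = 10$, $H_1 : \mu < 10$
Test statistic = $\bar{X}$

[Figure: distribution of $\bar{X}$ centred at 10, with the rejection region in the lower tail.]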
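The three kinds of test differ only in where the rejection region sits. A minimal sketch, on the standardized Z scale, of how the critical values follow from the significance level and the direction of H1 is shown below. The function name `z_rejection_region` and its `tail` argument are illustrative choices, not something defined in the slides.

```python
from statistics import NormalDist


def z_rejection_region(alpha: float, tail: str):
    """Return (lower, upper) critical Z values.

    None means there is no cut-off on that side.
    tail: "two" for H1: mu != mu0, "left" for H1: mu < mu0,
          "right" for H1: mu > mu0.
    """
    z = NormalDist()  # standard Normal distribution
    if tail == "two":
        c = z.inv_cdf(1 - alpha / 2)  # alpha is split between both tails
        return (-c, c)
    if tail == "left":
        return (z.inv_cdf(alpha), None)
    if tail == "right":
        return (None, z.inv_cdf(1 - alpha))
    raise ValueError("tail must be 'two', 'left' or 'right'")


print(z_rejection_region(0.05, "two"))
print(z_rejection_region(0.10, "left"))
print(z_rejection_region(0.01, "right"))
```

H0 is rejected when the standardized statistic falls below the lower value or above the upper value.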

The null hypothesis (H0) will always state that the parameter has a specified value.

The alternative hypothesis (H1) will always state that the parameter has a value different from the value we specify in H0.

Example:
(1) Someone claims that the average daily sales of a firm is Rs. 2000.

(1) Two tailed Z test, $\sigma$ known
(2) Two tailed Z test, $\sigma$ unknown
(3) One tailed Z test, $\sigma$ known
(4) One tailed Z test, $\sigma$ unknown
(5) Two tailed t test, $\sigma$ unknown
(6) One tailed t test, $\sigma$ unknown

(1) Two tailed Z test, $\sigma$ known

Step 1: State the H0 and H1
Mean stress capacity of the population (total production) = $\mu$ = 80000
Standard deviation of the population = $\sigma$ = 4000

Sample Data
n = 100, mean stress capacity of the sample = $\bar{X}$ = 79600

$H_0 : \mu = 80000$
$H_1 : \mu \neq 80000$ (two tailed test)

(1) Two tailed Z test, $\sigma$ known

Step 2: Specify level of significance
How sure you want to be when you are rejecting H0 = Pr(Type I Error)

Rejection region = $\alpha$ = 0.05

[Figure: distribution centred at $\mu_0 = 80000$ with $\alpha/2 = 0.025$ in each tail, for $H_1 : \mu \neq 80000$.]

(1) Two tailed Z test, $\sigma$ known

Step 3: Deciding which distribution to use
Assumption: the test statistic follows a Normal distribution or a t distribution.

If we know the population standard deviation: Normal distribution (Z table)
If we don't know the population standard deviation:
n > 30: Normal distribution (Z table)
$n \le 30$: t distribution (t table)

The standard deviation of the population is known, $\sigma$ = 4000, so select the Normal distribution.
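The selection rule on this slide is a two-level decision, which the following sketch encodes. The function name `choose_distribution` is hypothetical. The inputs in the usage lines are the sample sizes and "known or unknown $\sigma$" settings of the six worked examples in this deck.

```python
def choose_distribution(sigma_known: bool, n: int) -> str:
    """Apply the slide's rule for picking the reference distribution."""
    if sigma_known:
        return "normal"  # Z table
    # Population standard deviation unknown: decide by sample size.
    return "normal" if n > 30 else "t"


# (sigma known?, n) for worked examples (1) to (6)
examples = [(True, 100), (False, 100), (True, 50),
            (False, 100), (False, 25), (False, 18)]
for number, (known, n) in enumerate(examples, start=1):
    print(number, choose_distribution(known, n))
```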

(1) Two tailed Z test, $\sigma$ known

Step 4: Find Z table values & critical values
Normal distribution with $\alpha/2 = 0.025$ in each tail, so $Z_{\alpha/2} = 1.96$.

Standard error of the mean: $\sigma_{\bar{X}} = \sigma / \sqrt{n} = 4000 / \sqrt{100} = 400$

Lower critical value: $\mu_0 - Z_{\alpha/2}\,\sigma_{\bar{X}} = 80000 - 1.96 \times 400 = 79216$
Upper critical value: $\mu_0 + Z_{\alpha/2}\,\sigma_{\bar{X}} = 80000 + 1.96 \times 400 = 80784$

(1) Two tailed Z test, $\sigma$ known

Step 5: Decision Rule
[Figure: acceptance region between 79216 and 80784 around $\mu_0 = 80000$; rejection regions of area $\alpha/2 = 0.025$ in each tail, for $H_1 : \mu \neq 80000$.]

If $\bar{X}$ is between 79216 and 80784, do not reject H0.

(1) Two tailed Z test, $\sigma$ known

Step 6: Conclusion
Since $\bar{X}$ = 79600 is between 79216 and 80784, we do not reject H0 at the 0.05 level of significance.

There is not enough evidence to support H1.
There is not enough evidence to say that production fails to meet the stress requirements.
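Steps 1 to 6 of this example can be checked numerically. The sketch below uses only the figures given on the slides and the Python standard library. The exact Normal quantile is used instead of the rounded table value 1.96, so the bounds agree with the slide after rounding.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n, x_bar, alpha = 80000, 4000, 100, 79600, 0.05

standard_error = sigma / sqrt(n)               # 4000 / 10 = 400
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96

lower = mu_0 - z_crit * standard_error
upper = mu_0 + z_crit * standard_error

# Two tailed rule: reject H0 only if the sample mean is outside [lower, upper].
reject_h0 = not (lower <= x_bar <= upper)

print(round(lower), round(upper), reject_h0)
```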

(2) Two tailed Z test, $\sigma$ unknown

Step 1: State the H0 and H1
Average income for government employees = $\mu$ = 18750
Standard deviation of the population = $\sigma$ (unknown)

Sample Data
n = 100, average salary of employees in the sample = $\bar{X}$ = 19240
Standard deviation of the salary of employees in the sample = s = 2610

$H_0 : \mu = 18750$
$H_1 : \mu \neq 18750$ (two tailed test)

(2) Two tailed Z test, $\sigma$ unknown

Step 2: Specify level of significance
How sure you want to be when you are rejecting H0 = Pr(Type I Error)

Rejection region = $\alpha$ = 0.05

[Figure: distribution centred at $\mu_0 = 18750$ with $\alpha/2 = 0.025$ in each tail, for $H_1 : \mu \neq 18750$.]

(2) Two tailed Z test, $\sigma$ unknown

Step 3: Deciding which distribution to use
The population standard deviation is unknown, so the choice depends on n: Normal distribution (Z table) if n > 30, t distribution (t table) if $n \le 30$.
Standard deviation of the sample = s = 2610 and n = 100 > 30, so select the Normal distribution.

(2) Two tailed Z test, $\sigma$ unknown

Step 4: Find Z table values & critical values
Normal distribution with $\alpha/2 = 0.025$ in each tail, so $Z_{\alpha/2} = 1.96$.

Standard error of the sample mean: $S_{\bar{X}} = S / \sqrt{n} = 2610 / \sqrt{100} = 261$

Lower critical value: $\mu_0 - Z_{\alpha/2}\,S_{\bar{X}} = 18750 - 1.96 \times 261 = 18238.44$
Upper critical value: $\mu_0 + Z_{\alpha/2}\,S_{\bar{X}} = 18750 + 1.96 \times 261 = 19261.56$

(2) Two tailed Z test, $\sigma$ unknown

Step 5: Decision Rule
[Figure: acceptance region between 18238.44 and 19261.56 around $\mu_0 = 18750$; rejection regions of area $\alpha/2 = 0.025$ in each tail, for $H_1 : \mu \neq 18750$.]

If $\bar{X}$ is between 18238.44 and 19261.56, do not reject H0.

(2) Two tailed Z test, $\sigma$ unknown

Step 6: Conclusion
Since $\bar{X}$ = 19240 is between 18238.44 and 19261.56, we do not reject H0 at the 0.05 level of significance.

There is not enough evidence to support H1.
There is no significant difference between the average salary of the government employees in Washington and the national average.
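The same decision can be reached by standardizing the sample mean and comparing it with $Z_{\alpha/2}$, rather than converting the critical values to the salary scale as the slides do. The sketch below shows this equivalent route with the figures from this example; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu_0, s, n, x_bar, alpha = 18750, 2610, 100, 19240, 0.05

standard_error = s / sqrt(n)                  # 2610 / 10 = 261
z_stat = (x_bar - mu_0) / standard_error      # standardized sample mean
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96

# Two tailed rule on the Z scale: reject H0 if |z| exceeds the critical value.
reject_h0 = abs(z_stat) > z_crit

print(f"z = {z_stat:.3f}, critical = {z_crit:.3f}, reject H0: {reject_h0}")
```

Because the standardized value is below the critical value, the outcome matches the slide: H0 is not rejected.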

(3) One tailed Z test, $\sigma$ known

Step 1: State the H0 and H1
Population mean dosage of the drug = $\mu$ = 100
Standard deviation of the population = $\sigma$ = 2

Sample Data
n = 50, sample mean of the dosage = $\bar{X}$ = 99.75

$H_0 : \mu = 100$
Of the candidate alternatives $\mu \neq 100$ and $\mu < 100$, the one tested is
$H_1 : \mu < 100$ (one tailed test)

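(3) One tailed Z test, $\sigma$ known

Step 2: Specify level of significance
Rejection region = $\alpha$ = 0.1

[Figure: distribution centred at $\mu_0 = 100$ with $\alpha = 0.1$ in the lower tail, for $H_1 : \mu < 100$.]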

(3) One tailed Z test, $\sigma$ known

Step 3: Deciding which distribution to use
If the population standard deviation is known, use the Normal distribution (Z table). If it is unknown, use the Normal distribution if n > 30 and the t distribution (t table) if $n \le 30$.
The standard deviation of the population is known, $\sigma$ = 2, so select the Normal distribution.

(3) One tailed Z test, $\sigma$ known

Step 4: Find Z table values & critical values
Normal distribution with $\alpha = 0.1$ in the lower tail, so $Z_{\alpha} = 1.28$.

Standard error of the mean: $\sigma_{\bar{X}} = \sigma / \sqrt{n} = 2 / \sqrt{50} \approx 0.2828$

Critical value: $\mu_0 - Z_{\alpha}\,\sigma_{\bar{X}} = 100 - 1.28 \times 0.2828 \approx 99.638$

(3) One tailed Z test, $\sigma$ known

Step 5: Decision Rule
[Figure: rejection region of area $\alpha = 0.1$ below 99.638; acceptance region above it, around $\mu_0 = 100$, for $H_1 : \mu < 100$.]

If $\bar{X}$ is greater than 99.638, do not reject H0.

(3) One tailed Z test, $\sigma$ known

Step 6: Conclusion
Since $\bar{X}$ = 99.75 is greater than 99.638, we do not reject H0 at the 0.1 level of significance.

There is not enough evidence to support H1.
There is not enough evidence to say that dosages in the shipment are too small.
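For a left tailed test there is a single critical value below $\mu_0$. A brief sketch with the figures of this example follows. It uses the exact lower-tail Normal quantile (about -1.2816) rather than the table value 1.28, which gives the same critical value to three decimals.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n, x_bar, alpha = 100, 2, 50, 99.75, 0.10

standard_error = sigma / sqrt(n)       # about 0.2828
z_alpha = NormalDist().inv_cdf(alpha)  # negative: lower-tail quantile

critical = mu_0 + z_alpha * standard_error

# Left tailed rule: reject H0 only if the sample mean is below the critical value.
reject_h0 = x_bar < critical

print(f"critical = {critical:.3f}, reject H0: {reject_h0}")
```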

(4) One tailed Z test, $\sigma$ unknown

Step 1: State the H0 and H1
Average bulb life (population mean) = $\mu$ = 1600
Standard deviation of the population = $\sigma$ (unknown)

Sample Data
n = 100, average bulb life of the sample = $\bar{X}$ = 1570
Standard deviation of the bulb life of the sample = s = 120

$H_0 : \mu = 1600$
Of the candidate alternatives $\mu \neq 1600$, $\mu < 1600$ and $\mu > 1600$, the one tested is
$H_1 : \mu < 1600$ (one tailed test)

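(4) One tailed Z test, $\sigma$ unknown

Step 2: Specify level of significance
Rejection region = $\alpha$ = 0.01

[Figure: distribution centred at $\mu_0 = 1600$ with $\alpha = 0.01$ in the lower tail, for $H_1 : \mu < 1600$.]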

(4) One tailed Z test, $\sigma$ unknown

Step 3: Deciding which distribution to use
The population standard deviation is unknown, so the choice depends on n: Normal distribution (Z table) if n > 30, t distribution (t table) if $n \le 30$.
Sample size = 100 > 30, so select the Normal distribution.

(4) One tailed Z test, $\sigma$ unknown

Step 4: Find Z table values & critical values
Normal distribution with $\alpha = 0.01$ in the lower tail, so $Z_{\alpha} = 2.33$.

Standard error of the mean: $S_{\bar{X}} = S / \sqrt{n} = 120 / \sqrt{100} = 12$

Critical value: $\mu_0 - Z_{\alpha}\,S_{\bar{X}} = 1600 - 2.33 \times 12 = 1572.04$

(4) One tailed Z test, $\sigma$ unknown

Step 5: Decision Rule
[Figure: rejection region of area $\alpha = 0.01$ below 1572.04; acceptance region above it, around $\mu_0 = 1600$, for $H_1 : \mu < 1600$.]

If $\bar{X}$ is greater than 1572.04, do not reject H0.

(4) One tailed Z test, $\sigma$ unknown

Step 6: Conclusion
Since $\bar{X}$ = 1570 is less than 1572.04, we reject H0 at the 0.01 level of significance.

A light bulb lasts on average less than 1600 hours, so the manufacturer's claim is not valid.
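The slide's critical value of 1572.04 corresponds to the rounded table value Z = 2.33. The sketch below reproduces it and, for comparison, also computes the critical value from the exact Normal quantile. The comparison is an addition for illustration only; the conclusion is the same either way.

```python
from math import sqrt
from statistics import NormalDist

mu_0, s, n, x_bar, alpha = 1600, 120, 100, 1570, 0.01

standard_error = s / sqrt(n)  # 120 / 10 = 12

Z_TABLE = 2.33  # rounded Z-table value consistent with the slide's 1572.04
critical_table = mu_0 - Z_TABLE * standard_error

# Exact lower-tail quantile (about -2.3263) gives a slightly different value.
critical_exact = mu_0 + NormalDist().inv_cdf(alpha) * standard_error

# Left tailed rule: reject H0 if the sample mean is below the critical value.
reject_h0 = x_bar < critical_table

print(f"table: {critical_table:.2f}, exact: {critical_exact:.2f}, "
      f"reject H0: {reject_h0}")
```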

(5) Two tailed t test, $\sigma$ unknown

Step 1: State the H0 and H1
Average number of days lost each year per worker due to sickness = $\mu$ = 15
Standard deviation of the population = $\sigma$ (unknown)

Sample Data
n = 25, average number of days lost per worker in the sample = $\bar{X}$ = 18.32
Standard deviation of the sample = S = 15.845

$H_0 : \mu = 15$
$H_1 : \mu \neq 15$ (two tailed test)

(5) Two tailed t test, $\sigma$ unknown

Step 2: Specify level of significance
Rejection region = $\alpha$ = 0.05

(5) Two tailed t test, $\sigma$ unknown

Step 3: Deciding which distribution to use
The population standard deviation is unknown, so the choice depends on n: Normal distribution (Z table) if n > 30, t distribution (t table) if $n \le 30$.
Sample size = 25 < 30, so select the t distribution.

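(5) Two tailed t test, $\sigma$ unknown

On the t distribution, the rejection region $\alpha$ = 0.05 is split into $\alpha/2 = 0.025$ in each tail, for $H_1 : \mu \neq 15$.

[Figure: t distribution centred at 0 with $\alpha/2 = 0.025$ in each tail.]

Test statistic: $t = (\bar{X} - \mu_0) / S_{\bar{X}}$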

(5) Two tailed t test, $\sigma$ unknown

Step 4: Find t table values
t distribution with $\alpha/2 = 0.025$ in each tail and n - 1 = 24 degrees of freedom:
$t_{\alpha/2,\,n-1} = t_{0.025,\,24} = 2.064$

Critical values from the table: -2.064 and 2.064

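(5) Two tailed t test, $\sigma$ unknown

Step 4 (continued): Find the value of the test statistic (using sample data)
Standardize the sample mean:

$S_{\bar{X}} = S / \sqrt{n} = 15.845 / \sqrt{25} = 3.169$
$t = (\bar{X} - \mu_0) / S_{\bar{X}} = (18.32 - 15) / 3.169 \approx 1.05$

Critical values: $\pm t_{\alpha/2} = \pm 2.064$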

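(5) Two tailed t test, $\sigma$ unknown

Step 5: Decision Rule
[Figure: t distribution with acceptance region between -2.064 and 2.064; rejection regions of area $\alpha/2 = 0.025$ in each tail; t = 1.05 lies in the acceptance region.]

If t is between -2.064 and 2.064, do not reject H0.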

(5) Two tailed t test, $\sigma$ unknown

Step 6: Conclusion
Since t = 1.05 is between -2.064 and 2.064, we do not reject H0 at the 0.05 level of significance.

There is not enough evidence to reject H0.
There is not enough evidence to say that the average number of days lost each year through sickness differs from 15 days per worker.
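The t test differs from the Z tests above in that the sample mean is standardized and compared with a t table value. A short sketch with this example's figures follows. The Python standard library has no t distribution, so the critical value 2.064 is entered as a constant read from the slide's t table (24 degrees of freedom), not computed.

```python
from math import sqrt

mu_0, n, x_bar, s = 15, 25, 18.32, 15.845

T_TABLE = 2.064  # t(0.025, 24), taken from the t table on the slide

standard_error = s / sqrt(n)              # 15.845 / 5 = 3.169
t_stat = (x_bar - mu_0) / standard_error  # about 1.05

# Two tailed rule: reject H0 if |t| exceeds the table value.
reject_h0 = abs(t_stat) > T_TABLE

print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```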

(6) One tailed t test, $\sigma$ unknown

Step 1: State the H0 and H1
Average percentage of lint in seed cotton = $\mu$ = 40
Standard deviation of the population = $\sigma$ (unknown)

Sample Data
n = 18, sample mean = $\bar{X}$ = 37.206
Standard deviation of the sample = S = 0.796

$H_0 : \mu = 40$
Of the candidate alternatives $\mu \neq 40$ and $\mu < 40$, the one tested is
$H_1 : \mu < 40$ (one tailed test)

(6) One tailed t test, $\sigma$ unknown

Step 2: Specify level of significance
Rejection region = $\alpha$ = 0.01

(6) One tailed t test, $\sigma$ unknown

Step 3: Deciding which distribution to use
The population standard deviation is unknown, so the choice depends on n: Normal distribution (Z table) if n > 30, t distribution (t table) if $n \le 30$.
Sample size = 18 < 30, so select the t distribution.

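(6) One tailed t test, $\sigma$ unknown

On the t distribution, the rejection region $\alpha$ = 0.01 lies entirely in the lower tail, for $H_1 : \mu < 40$.

[Figure: t distribution centred at 0 with $\alpha = 0.01$ in the lower tail.]

Test statistic: $t = (\bar{X} - \mu_0) / S_{\bar{X}}$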

(6) One tailed t test, $\sigma$ unknown

Step 4: Find t table values
t distribution with $\alpha = 0.01$ in the lower tail and n - 1 = 17 degrees of freedom:
$t_{\alpha,\,n-1} = t_{0.01,\,17} = 2.567$

Critical value from the table: -2.567

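(6) One tailed t test, $\sigma$ unknown

Step 4 (continued): Find the value of the test statistic (using sample data)
Standardize the sample mean:

$S_{\bar{X}} = S / \sqrt{n} = 0.796 / \sqrt{18} \approx 0.1876$
$t = (\bar{X} - \mu_0) / S_{\bar{X}} = (37.206 - 40) / 0.1876 \approx -14.89$

Critical value: $-t_{\alpha} = -2.567$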

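(6) One tailed t test, $\sigma$ unknown

Step 5: Decision Rule
[Figure: t distribution with rejection region of area $\alpha = 0.01$ below -2.567 and acceptance region above it; t = -14.89 lies in the rejection region.]

If t is less than -2.567, reject H0.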

(6) One tailed t test, $\sigma$ unknown

Step 6: Conclusion
Since t = -14.89 is less than -2.567, we reject H0 at the 0.01 level of significance.

The average percentage of lint in this cotton variety is less than 40.
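The left tailed t test can be checked in the same way. In the sketch below, the critical value 2.567 is again a constant taken from the slide's t table (17 degrees of freedom), and the rejection rule compares the statistic with its negative because H1 points to the lower tail.

```python
from math import sqrt

mu_0, n, x_bar, s = 40, 18, 37.206, 0.796

T_TABLE = 2.567  # t(0.01, 17), taken from the t table on the slide

standard_error = s / sqrt(n)              # about 0.1876
t_stat = (x_bar - mu_0) / standard_error  # about -14.89

# Left tailed rule: reject H0 if t falls below the negative table value.
reject_h0 = t_stat < -T_TABLE

print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```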

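(1) Two tailed Z test, $\sigma$ known
$H_0 : \mu = \mu_0$
$H_1 : \mu \neq \mu_0$

(2) Two tailed Z test, $\sigma$ unknown
$H_0 : \mu = \mu_0$
$H_1 : \mu \neq \mu_0$

(3) One tailed Z test, $\sigma$ known
$H_0 : \mu = \mu_0$
$H_1 : \mu < \mu_0$ or $H_1 : \mu > \mu_0$

(4) One tailed Z test, $\sigma$ unknown
$H_0 : \mu = \mu_0$
$H_1 : \mu < \mu_0$ or $H_1 : \mu > \mu_0$

(5) Two tailed t test, $\sigma$ unknown
$H_0 : \mu = \mu_0$
$H_1 : \mu \neq \mu_0$

(6) One tailed t test, $\sigma$ unknown
$H_0 : \mu = \mu_0$
$H_1 : \mu < \mu_0$ or $H_1 : \mu > \mu_0$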
