
1/1/2007

Hypothesis Testing
Prof. Dhananjay M. Apte
Trainer-Six Sigma,
Faculty-Statistics, Q.T. Operation Mgt.
dhananjayapte@yahoo.com
Cell- 98231 90939
For private circulation only. All rights reserved
Example 1
Mango farms in Ratnagiri District produce an average of 500
mangoes (per farm) with a standard deviation of 96
(Information source: Food Report journal)
After the inclusion of a special fertilizer, output was measured
on 50 farms. The average output was 535 mangoes per farm.
Is the output more? .................................. Yes, algebraically.
Is it statistically MORE?
Is there a STATISTICAL DIFFERENCE between the output before and after the inclusion of
fertilizer?
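Test Statistic: $Z_{cal} = \frac{|\bar{x} - \mu|}{\sigma / \sqrt{n}}$, where $\bar{x}$ is x bar (sample mean), $\mu$ is Mu (population mean), $\sigma$ is the S.D. and $n$ is the sample size.
Always consider the absolute value.
We also need Z tab (the Critical value).
At the 0.05 significance level, Z tab = 1.645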
In the said problem:
Since Z cal > Z tab, there IS a Statistical DIFFERENCE between x bar and Mu.
Hence the Inference (conclusion) is:
x bar is not only algebraically but also statistically more than Mu.
Output after the inclusion of fertilizer is statistically more
than output before the inclusion of fertilizer.
The fertilizer is effective.
Z cal = 2.57, Z tab 0.05 = 1.645
If Z cal > Z tab: There is a Statistical difference between x bar and Mu
If Z cal < Z tab: There is NO Statistical difference between x bar and Mu
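The comparison in Example 1 can be checked with a few lines of Python. This is only a minimal sketch: the helper name `z_cal` is made up for illustration, and the critical value 1.645 is taken from the slide rather than computed. The unrounded result is about 2.58, which the slide shows as 2.57.

```python
import math

def z_cal(x_bar, mu, sigma, n):
    """Absolute value of the calculated Z statistic (hypothetical helper)."""
    return abs(x_bar - mu) / (sigma / math.sqrt(n))

# Figures from Example 1 (mango farms)
z = z_cal(x_bar=535, mu=500, sigma=96, n=50)

Z_TAB_005 = 1.645  # one-tailed critical value at Alfa = 0.05, as given on the slide

is_different = z > Z_TAB_005
print(f"Z cal = {z:.2f}, Z tab = {Z_TAB_005}")
print("Statistical difference" if is_different else "No statistical difference")
```

The same helper can be reused for any one-sample Z test by changing the four inputs.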
Example 2
A Sensex Office claims that the average annual family income in Metropolis
is $48,432. But in a report prepared by the Economic Research Department
of a major bank, the Department Manager states an average income of
$48,574 from a random sample of 400 families. Is the Sensex Office's claim
LESS than the Department Manager's figure? Take the standard deviation of the
sample as 2000.
Z tab 0.05 = 1.645
Since Z cal < Z tab, there is NO DIFFERENCE.
Hence the Inference (conclusion) is:
Both claims are (statistically) the same.
The Sensex Office's claim is algebraically less, but statistically NOT.
Whatever algebraic difference is seen is just chance.
No significant difference.
S.D. plays a vital role.
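The Z cal value for Example 2 is not printed on the slide, so the arithmetic below is a reconstruction from the figures given in the problem, using the same test statistic as Example 1:

```latex
Z_{cal} = \frac{|\bar{x} - \mu|}{s / \sqrt{n}}
        = \frac{|48574 - 48432|}{2000 / \sqrt{400}}
        = \frac{142}{100}
        = 1.42 < 1.645 = Z_{tab}
```

With an S.D. of 2000 and n = 400, the standard error is 100, so a gap of 142 is too small to be significant.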
Test Statistic
Area = Significance level = called Alfa
Confidence level is the level of trueness / confidence
= 1 - Significance level
If not mentioned in the problem, consider Alfa = 0.05
Alfa is expressed as a fraction (0.05) or as a
percentage (5 %)
Example 3
The lifetime of a certain brand of heat pump is known to be normally
distributed with a standard deviation of 2. A sample of 6 heat pumps yielded
the observations: 2.0 1.3 6.0 1.9 5.1 4.0
At Alfa = 0.05, is there reason to believe that the mean life of the heat pumps is
GREATER THAN 2?
x bar = 3.383, Mu = 2, S.D. = 2
Z cal = 1.69
Z tab 0.05 = 1.645
Since Z cal > Z tab, there IS a DIFFERENCE.
Hence the Inference (conclusion) is:
The mean life of the heat pumps is greater than 2.
There is reason to believe that the mean life of the heat pumps is
significantly greater than 2.
What if the S.D. is not given? Calculate it from the data.
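The figures for Example 3 can be reproduced from the six observations. This is a sketch using only the Python standard library; the variable names are illustrative. The last line shows the "S.D. not given" case, where the sample S.D. would be calculated from the data instead.

```python
import math
from statistics import mean, stdev

lifetimes = [2.0, 1.3, 6.0, 1.9, 5.1, 4.0]  # observations from Example 3
sigma = 2.0   # known population S.D.
mu_0 = 2.0    # hypothesised mean life

x_bar = mean(lifetimes)
z = abs(x_bar - mu_0) / (sigma / math.sqrt(len(lifetimes)))

print(f"x bar = {x_bar:.3f}, Z cal = {z:.2f}")

# If the S.D. were not given, the sample S.D. (n - 1 divisor) could be used instead
s = stdev(lifetimes)
```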
Exercise
The maker of a certain model car claimed that his car
averaged at least 31 miles per gallon of gasoline. A
sample of 36 cars was selected and each car was driven
with one gallon of regular gasoline. The sample showed
a mean of 29.43 miles with a standard deviation of 3
miles. With a 95 % Confidence level, what do you
conclude about the manufacturer's claim?
(Is he making a TALLER claim?)
Ans.
Z cal = 3.14
Z tab = 1.645
Since Z cal > Z tab, it is a taller claim.
t test
Example 4
The maker of a certain model car claimed that his car averaged at least 31
miles per gallon of gasoline. A sample of 9 cars was selected and each car
was driven with one gallon of regular gasoline. The sample showed a mean
of 29.43 miles with a standard deviation of 3 miles. With 95 % Confidence
level (Alfa = 0.05), what do you conclude about the manufacturer's claim?
Is the claim more?
If the S.D. of the Population is NOT known and the sample size is less than or
equal to 30, use another table, the t distribution table, to find the
tab value, t tab. This is the t Test.
However, the calculated value, t cal, is calculated the same way as Z cal:
$t_{cal} = \frac{|\bar{x} - \mu|}{s / \sqrt{n}} = \frac{|29.43 - 31|}{3 / \sqrt{9}} = 1.57$ (consider the absolute value)
To find t tab, we must know the Degrees of Freedom (DOF) = n - 1 = 9 - 1 = 8
t cal = 1.57, t tab 0.05,8 = 1.860
Since t cal < t tab, there is NO DIFFERENCE.
Hence the Inference (conclusion) is:
The manufacturer's claim and the test result are the same.
There is insufficient evidence to doubt the
manufacturer's claim concerning the gas mileage.
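Example 4 and the earlier Exercise use the same mean and S.D. but different sample sizes, which a short Python sketch makes visible. The helper name `t_cal` is hypothetical, and both critical values are copied from the slides rather than computed.

```python
import math

def t_cal(x_bar, mu, s, n):
    """Absolute calculated statistic; same formula as Z cal (hypothetical helper)."""
    return abs(x_bar - mu) / (s / math.sqrt(n))

# Example 4: n = 9, so the t table is used
t_small = t_cal(x_bar=29.43, mu=31, s=3, n=9)
dof = 9 - 1
T_TAB_005_8 = 1.860   # t tab at Alfa = 0.05, DOF = 8, from the slide
reject_small = t_small > T_TAB_005_8

# Exercise: same figures with n = 36, so the Z table is used
z_large = t_cal(x_bar=29.43, mu=31, s=3, n=36)
Z_TAB_005 = 1.645     # Z tab at Alfa = 0.05, from the slide
reject_large = z_large > Z_TAB_005

print(f"n = 9:  cal = {t_small:.2f}, tab = {T_TAB_005_8}, difference = {reject_small}")
print(f"n = 36: cal = {z_large:.2f}, tab = {Z_TAB_005}, difference = {reject_large}")
```

With 9 cars the shortfall is not significant; with 36 cars the same shortfall is.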
2 Tailed Test
Tests with > or < signs (i.e. Greater than, Less than) are 1 tailed Tests
Tests with = or not equal to signs are 2 tailed Tests
Cal values remain unchanged in a 2 tailed Test; only Tab values change.
Tab values are determined from Alfa / 2
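The effect of using Alfa / 2 can be seen by computing the Z tab values instead of reading them from a table. This sketch assumes Python 3.8 or later for `statistics.NormalDist`; the function name `z_tab` is made up for illustration.

```python
from statistics import NormalDist

def z_tab(alfa, tails=1):
    """Critical Z value for a 1 tailed or 2 tailed test (hypothetical helper)."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # A 2 tailed test splits Alfa equally between the two tails
    return NormalDist().inv_cdf(1 - alfa / tails)

one_tailed = z_tab(0.05, tails=1)  # about 1.645
two_tailed = z_tab(0.05, tails=2)  # about 1.96

print(f"1 tailed: {one_tailed:.3f}, 2 tailed: {two_tailed:.3f}")
```

The cal value is untouched by this choice; only the value it is compared against moves.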
2 Tailed test
Example 5
Suppose we have a random sample of n = 25 measurements of chest
circumference from a population of newborns with S.D. = 0.7 inches, and the
sample mean x bar = 12.6 inches. Is it likely that the population mean has the
value Mu = 13.0 inches? Consider Alfa = 0.05 (Is x bar = Mu?)
1. It's a 2 Tailed Test, because equality is the concern.
2. And it's a Z Test, because the S.D. of the population is known.
Test Statistic
Z tab 0.025 = 1.96 (refer to the table)
Since Z cal > Z tab, there IS a DIFFERENCE.
Hence the Inference (conclusion) is:
Our observed value of x bar = 12.6 for the sample mean
is too rare for us to believe that Mu = 13.0
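The Z cal value for Example 5 is not shown in the text, so the working below is a reconstruction from the figures in the problem:

```latex
Z_{cal} = \frac{|\bar{x} - \mu|}{\sigma / \sqrt{n}}
        = \frac{|12.6 - 13.0|}{0.7 / \sqrt{25}}
        = \frac{0.4}{0.14}
        \approx 2.86 > 1.96 = Z_{tab\,0.025}
```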
Hypotheses terms and wordings
Null Hypothesis H0
Statement of innocence made before Test calculations
There is NO difference
H0: x bar = Mu
Alternative Hypothesis HA
There is a difference
HA: x bar ≠ Mu
A Hypothesis is Accepted, Rejected, Not accepted, Not rejected
Acceptance Region, Rejection Region, Critical Value
Errors in Hypothesis Testing
A type I error consists of rejecting the null hypothesis H0 when it was true.
A type II error consists of not rejecting H0 when H0 is false.
Alfa and Beta are the probabilities of type I and type II error, respectively (the so-called Alfa and Beta Errors).
Example 6 (2 Samples Test)
A random sample of 15 students is taken from each class. State whether there is a difference
between the (mean) heights of students in class C and class D (consider a 5 % significance level).

Student height (cm)
Class C: 145 149 152 153 154 154 158 160 166 166 166 167 175 177 182
Class D: 148 153 157 161 162 162 163 167 172 172 175 177 183 185 187

It's a 2 tailed t test.
x1 bar = 161.6
x2 bar = 168.27
s1 = 10.86
s2 = 11.74
n1 = n2 = 15
t cal = 1.62
t tab 0.025,28 = 2.048
Since t cal < t tab, there is NO DIFFERENCE.
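The summary figures for Example 6 can be recomputed from the height data. The slides do not show the two-sample formula, so the standard error used here, sqrt(s1²/n1 + s2²/n2), is an assumption; with equal sample sizes it gives the same value as the pooled-variance form. The unrounded t cal is about 1.61, which the slide reports as 1.62.

```python
import math
from statistics import mean, stdev

class_c = [145, 149, 152, 153, 154, 154, 158, 160, 166, 166, 166, 167, 175, 177, 182]
class_d = [148, 153, 157, 161, 162, 162, 163, 167, 172, 172, 175, 177, 183, 185, 187]

x1, x2 = mean(class_c), mean(class_d)
s1, s2 = stdev(class_c), stdev(class_d)   # sample S.D. (n - 1 divisor)
n1, n2 = len(class_c), len(class_d)

# Assumed standard error of the difference between the two sample means
se = math.sqrt(s1**2 / n1 + s2**2 / n2)
t = abs(x1 - x2) / se
dof = n1 + n2 - 2

T_TAB_0025_28 = 2.048  # 2 tailed critical value at Alfa = 0.05, from the slide
is_different = t > T_TAB_0025_28

print(f"x1 = {x1:.2f}, x2 = {x2:.2f}, s1 = {s1:.2f}, s2 = {s2:.2f}")
print(f"t cal = {t:.2f}, DOF = {dof}, difference = {is_different}")
```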
Paired Tests
Does the additive improve performance? Test it at the 5 % significance level
(Is the mileage more? 1 tailed t test, since the sample size is less than 30)
Determining t tab: DOF = 9
Solution
Thus Paired Test considers.
Confidence Interval (C.I.) of Mean
Point Estimate: We take a sample from the population and find its
Mean (x bar). With it we estimate the mean of the Population.
In some cases, the estimate of the mean is stated as an
interval, called the Interval Estimate or Confidence Interval
(C.I.)
Formula of C.I.: $\bar{X} \pm Z \frac{\sigma}{\sqrt{n}}$
Z needs to be replaced by t if n <= 30 and the S.D. of the population is not known.
Z and t are Tabulated values and need two tailed consideration.
If the S.D. of the Population is not known, use that of the Sample.
Alfa is generally taken as 0.05
Example: A sample of 20 sales invoices (bills) has a mean amount of $110.27 and a
sample std dev of $28.95. Find the range in which the Mean of all the invoices
(in the population) lies. (To determine the C.I.)
$\bar{X} \pm t_{n-1} \frac{s}{\sqrt{n}} = 110.27 \pm 2.093 \times \frac{28.95}{\sqrt{20}} = 110.27 \pm 13.56$
or $\$96.71 \le \mu \le \$123.83$
Hence the Inference (conclusion) is:
The mean amount of all the invoices in the population will lie between
$96.71 and $123.83
The surety level (Confidence Level) for the above statement is 95 %
Or: 95 % of the intervals built this way from repeated samples will contain
the population mean
As the Confidence level increases, the range (C.I.) increases
Here n <= 30 and the S.D. of the population is not known, so t is used:
t tab 0.025,19 = 2.093
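The invoice interval can be checked with a short Python sketch. The function name `mean_ci` is hypothetical, and the t tab value is taken from the slide rather than computed. The unrounded limits differ from the slide's figures by about one cent, which is a rounding effect.

```python
import math

def mean_ci(x_bar, s, n, t_tab):
    """Confidence interval for the mean: x bar +/- t * s / sqrt(n) (hypothetical helper)."""
    margin = t_tab * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Invoice example: n = 20, DOF = 19, t tab 0.025,19 = 2.093 from the slide
low, high = mean_ci(x_bar=110.27, s=28.95, n=20, t_tab=2.093)

print(f"95 % C.I.: {low:.2f} to {high:.2f}")
```

Passing a larger t tab (a higher Confidence level) widens the interval, as the slide states.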
Confidence Interval (C.I.) for Proportion
The C.I. for P is given below, with $p_s$ = sample proportion
When n is large use the Z value
When n is small use the t value
$P = p_s \pm z \sqrt{\frac{p_s (1 - p_s)}{n}}$
Example
In a sample of 100 sales invoices, 10 have
errors. What is the 95 % C.I.?
$p_s = 10/100 = 0.1$
95 % C.I. for P is
$0.1 \pm 1.96 \sqrt{\frac{(0.1)(0.9)}{100}} = 0.1 \pm 0.0588$
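The proportion interval follows the same pattern as the interval for the mean. A minimal sketch, with the hypothetical helper `proportion_ci` and the Z value 1.96 taken from the example:

```python
import math

def proportion_ci(errors, n, z=1.96):
    """Sample proportion and margin: p_s +/- z * sqrt(p_s * (1 - p_s) / n) (hypothetical helper)."""
    p_s = errors / n
    margin = z * math.sqrt(p_s * (1 - p_s) / n)
    return p_s, margin

# Invoice example: 10 invoices with errors out of 100
p_s, margin = proportion_ci(errors=10, n=100)
low, high = p_s - margin, p_s + margin

print(f"p_s = {p_s}, margin = {margin:.4f}")
print(f"95 % C.I.: {low:.4f} to {high:.4f}")
```

So the population proportion of invoices with errors is estimated to lie between about 4.1 % and 15.9 %.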
P - Value
The P-value is the smallest level of significance at which H0 would be
rejected when a specified test procedure is used on a given data set.
1. P-value ≤ Alfa: reject H0 at a level of Alfa
2. P-value > Alfa: do not reject H0 at a level of Alfa
P - Value
The P-value is the probability, calculated assuming H0 is true, of
obtaining a test statistic value at least as contradictory to H0 as the value that
actually resulted. The smaller the P-value, the more contradictory the data are
to H0.
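As an illustration, the P-value for the 1 tailed Z test of Example 1 (mango farms) can be computed with the standard library. The slides do not work this out, so the sketch below is an added example, and it assumes Python 3.8 or later for `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

# Example 1 figures: x bar = 535, Mu = 500, S.D. = 96, n = 50
z = abs(535 - 500) / (96 / math.sqrt(50))

# 1 tailed P-value: probability of a Z at least this large when H0 is true
p_value = 1 - NormalDist().cdf(z)

alfa = 0.05
reject_h0 = p_value <= alfa

print(f"Z cal = {z:.2f}, P-value = {p_value:.4f}, reject H0 = {reject_h0}")
```

The P-value rule and the Z cal versus Z tab rule always lead to the same decision; the P-value also shows how far the data are from H0.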
