
Pan-African e-Network Project

DBM
Quantitative Techniques in Management

Semester - 1
Session - 5

Dr. Sarika Jain

HYPOTHESIS TESTING

Hypothesis Testing
1 Overview
2 Fundamentals of Hypothesis Testing
3 Testing a Claim about a Mean: Large Samples
4 Testing a Claim about a Mean: Small Samples
5 Testing a Claim about a Proportion
6 Testing a Claim about a Standard Deviation or Variance

Hypothesis
in statistics, is a claim or statement about
a property of a population

Hypothesis Testing
is to test the claim or statement
Example: A conjecture is made that the
average starting salary for computer
science graduates is $30,000 per year.

Question:
How can we justify/test this conjecture?

A. What do we need to know to justify this conjecture?
B. Based on what we know, how should we justify this conjecture?

Answer to A:
Randomly select, say, 100 computer
science graduates and find out their
annual salaries.
---- We need to have some sample
observations, i.e., a sample set!

Answer to B:
That is what we will learn in this
chapter.
---- Make conclusions based on the
sample observations

Statistical Reasoning
Analyze the sample set in an attempt to
distinguish between results that can
easily occur and results that are highly
unlikely.

Central Limit Theorem: Distribution of Sample Means
Assume the conjecture is true!
[Figure: normal curve of sample means centered at x̄ = 30k. Likely sample means lie between z = -1.96 (x̄ = 20.2k) and z = 1.96 (x̄ = 39.8k). The sample data fall at z = 2.62 (x̄ = 43.1k), outside the range of likely sample means.]
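
The numbers in this figure can be reproduced with a few lines of Python. This is only a minimal sketch: the standard error of 5 (in $1000s) is not stated on the slide and is an assumption inferred from 30 + 1.96 × SE = 39.8, and the function names are illustrative.

```python
from statistics import NormalDist

MU_0 = 30.0        # conjectured mean salary in $1000s (from the slide)
STD_ERROR = 5.0    # ASSUMED standard error, inferred from 39.8 = 30 + 1.96 * SE
Z_CRIT = 1.96      # z score that bounds the "likely" sample means


def likely_range(mu, se, z):
    """Interval of sample means within z standard errors of mu."""
    return (mu - z * se, mu + z * se)


def z_score(sample_mean, mu, se):
    """Standardize a sample mean, assuming the conjecture (mean = mu) is true."""
    return (sample_mean - mu) / se


low, high = likely_range(MU_0, STD_ERROR, Z_CRIT)
z = z_score(43.1, MU_0, STD_ERROR)
# Two-sided probability of a sample mean at least this far from 30k.
tail_prob = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"likely sample means: {low:.1f}k to {high:.1f}k")
print(f"observed z = {z:.2f}, two-sided tail probability = {tail_prob:.4f}")
```

Under these assumptions a sample mean of 43.1k lies outside the range of likely sample means, which is the kind of "highly unlikely" result that casts doubt on the conjecture.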

Components of a Formal Hypothesis Test

Definitions
Null Hypothesis (denoted H0): the statement being tested in a test of hypothesis.
Alternative Hypothesis (H1): what is believed to be true if the null hypothesis is false.

Null Hypothesis: H0
Must contain condition of equality: =, ≤, or ≥
Test the Null Hypothesis directly
Reject H0 or fail to reject H0

Alternative Hypothesis: H1
Must be true if H0 is false
≠, <, >
'opposite' of Null
Example:
H0: μ = 30 versus H1: μ > 30

Stating Your Own Hypothesis
If you wish to support your claim, the
claim must be stated so that it becomes
the alternative hypothesis.

Important Notes:
H0 must always contain equality; however, some
claims are not stated using equality. Therefore,
sometimes the claim and H0 will not be the
same.
Ideally, all claims should be stated so that they are
the Null Hypothesis, so that the most serious error
would be a Type I error.

Type I Error
The mistake of rejecting the null hypothesis when it
is true.
The probability of doing this is called the
significance level, denoted by α (alpha).
Common choices for α: 0.05 and 0.01
Example: rejecting a perfectly good parachute
and refusing to jump

Type II Error
The mistake of failing to reject the null
hypothesis when it is false.
The probability of doing this is denoted by β (beta).
Example: failing to reject a defective
parachute and jumping out of a
plane with it.

Type I and Type II Errors
(Decision versus True State of Nature)

Decision: We decide to reject the null hypothesis
  The null hypothesis is true: Type I error (rejecting a true null hypothesis)
  The null hypothesis is false: Correct decision

Decision: We fail to reject the null hypothesis
  The null hypothesis is true: Correct decision
  The null hypothesis is false: Type II error (failing to reject a false null hypothesis)

Definition
Test Statistic: a sample statistic or value based on sample data
Example:

$$ z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma / \sqrt{n}} $$

Definition
Critical Region: the set of all values of the test statistic
that would cause a rejection of the null hypothesis

[Figures: three curves marking the critical region(s) of the test statistic]

Definition
Critical Value: the value(s) that separate the critical
region from the values that would not lead
to a rejection of H0

[Figure: curve with the critical value (z score) separating the "Reject H0" region from the "Fail to reject H0" region]
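
Critical z scores are usually read from a table, but they can also be computed. The sketch below uses `NormalDist.inv_cdf` from Python's standard library; the helper name `critical_z` and its `tails` parameter are illustrative choices, not part of the slides.

```python
from statistics import NormalDist


def critical_z(alpha, tails=2):
    """Positive critical z score for significance level alpha.

    tails=2 splits alpha equally between both tails;
    tails=1 places all of alpha in a single tail.
    """
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return NormalDist().inv_cdf(1 - alpha / tails)


for alpha in (0.05, 0.01):
    two = critical_z(alpha, tails=2)
    one = critical_z(alpha, tails=1)
    print(f"alpha = {alpha}: two-tailed +/-{two:.3f}, one-tailed {one:.3f}")
```

For a left-tailed test the critical value is the negative of the one-tailed result.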

Controlling Type I and Type II Errors

α, β, and n are related:
when two of the three are chosen, the third is determined
α and n are usually chosen
try to use the largest α you can tolerate
if Type I error is serious, select a smaller α value and a
larger n value
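
To see how α, β, and n move together, the sketch below computes β for a two-tailed z test of a mean. All numbers are hypothetical (null mean 30, true mean 40, population standard deviation 50), and the function name is an illustrative choice; it assumes a normal sampling distribution with a known σ.

```python
from math import sqrt
from statistics import NormalDist


def beta_two_tailed(mu_0, mu_true, sigma, n, alpha):
    """P(fail to reject H0: mu = mu_0) when the true mean is mu_true."""
    se = sigma / sqrt(n)                              # standard error of the mean
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    lower = mu_0 - z_crit * se                        # non-rejection region for
    upper = mu_0 + z_crit * se                        # the sample mean
    sampling = NormalDist(mu_true, se)                # actual distribution of the sample mean
    return sampling.cdf(upper) - sampling.cdf(lower)


# Hypothetical values, chosen only for illustration.
for n in (100, 400):
    for alpha in (0.05, 0.01):
        beta = beta_two_tailed(30, 40, 50, n, alpha)
        print(f"n = {n}, alpha = {alpha}: beta = {beta:.3f}")
```

With n fixed, lowering α raises β; raising n lowers β for the same α, which is why a smaller α is paired with a larger n when a Type I error is serious.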

Conclusions in Hypothesis Testing
Always test the null hypothesis:
1. Fail to reject the H0
2. Reject the H0
Need to formulate correct wording of final
conclusion

Wording of Conclusions in Hypothesis Tests

Original claim is H0. Do you reject H0?
Yes (Reject H0): "There is sufficient evidence to warrant rejection of the claim that . . . (original claim)." (This is the only case in which the original claim is rejected.)
No (Fail to reject H0): "There is not sufficient evidence to warrant rejection of the claim that . . . (original claim)."

Original claim is H1. Do you reject H0?
Yes (Reject H0): "The sample data supports the claim that . . . (original claim)." (This is the only case in which the original claim is supported.)
No (Fail to reject H0): "There is not sufficient evidence to support the claim that . . . (original claim)."
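
The four outcomes of this flowchart can be written as a small lookup. This is a sketch only; the function and parameter names are hypothetical, and the sentences are the ones given in the flowchart.

```python
def conclusion(original_claim, claim_is_null, reject_null):
    """Return the final wording for a hypothesis test.

    original_claim: the claim in words, e.g. "the mean is 30"
    claim_is_null:  True if the original claim is H0, False if it is H1
    reject_null:    True if the test rejected H0
    """
    if claim_is_null:
        if reject_null:
            return ("There is sufficient evidence to warrant rejection "
                    f"of the claim that {original_claim}.")
        return ("There is not sufficient evidence to warrant rejection "
                f"of the claim that {original_claim}.")
    if reject_null:
        return f"The sample data supports the claim that {original_claim}."
    return ("There is not sufficient evidence to support "
            f"the claim that {original_claim}.")


print(conclusion("the mean exceeds 30", claim_is_null=False, reject_null=True))
```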

Two-tailed, Left-tailed, Right-tailed Tests

Left-tailed Test
H0: μ ≥ 200
H1: μ < 200 (the inequality points left)
[Figure: curve centered at 200. The left tail is the "Reject H0" region, holding values that differ significantly from 200; the rest of the curve is "Fail to reject H0".]

Right-tailed Test
H0: μ ≤ 200
H1: μ > 200 (the inequality points right)
[Figure: curve centered at 200. The right tail is the "Reject H0" region, holding values that differ significantly from 200; the rest of the curve is "Fail to reject H0".]

Two-tailed Test
H0: μ = 200
H1: μ ≠ 200 (≠ means less than or greater than)
α is divided equally between the two tails of the critical region
[Figure: curve centered at 200. Both tails are "Reject H0" regions, holding values that differ significantly from 200; the middle of the curve is "Fail to reject H0".]
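
The three kinds of test differ only in where the critical region sits. A minimal sketch of the decision rule for a z test statistic follows; the function name `decide` and the `tail` labels are illustrative assumptions.

```python
from statistics import NormalDist


def decide(z, alpha=0.05, tail="two"):
    """Decide whether a z test statistic falls in the critical region.

    tail: "left"  (H1 uses <), "right" (H1 uses >), or "two" (H1 uses a not-equal sign).
    """
    std_normal = NormalDist()
    if tail == "left":
        reject = z < std_normal.inv_cdf(alpha)             # all of alpha in the left tail
    elif tail == "right":
        reject = z > std_normal.inv_cdf(1 - alpha)         # all of alpha in the right tail
    elif tail == "two":
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)  # alpha split between tails
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return "Reject H0" if reject else "Fail to reject H0"


for tail in ("left", "right", "two"):
    print(tail, decide(-1.8, alpha=0.05, tail=tail))
```

The same statistic can lead to different decisions: z = -1.8 falls in the critical region of a left-tailed test at α = .05 but not in that of a two-tailed test.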

Testing Hypotheses:
Using The Five Step Model
1. Make Assumptions and meet test
requirements.
2. State the null hypothesis.
3. Select the sampling distribution and
establish the critical region.
4. Compute the test statistic.
5. Make a decision and interpret results.

Step 1: Make Assumptions and Meet Test Requirements

Random sampling
Hypothesis testing assumes samples were selected using random sampling.
In this case, the sample of 117 cases was randomly selected
from all education majors.

Level of Measurement is Interval-Ratio
GPA is I-R so the mean is an appropriate statistic.

Sampling Distribution is normal in shape
This is a "large" sample (n ≥ 100).

Step 2: State the Null Hypothesis
H0: μ = 2.7 (in other words, H0: X̄ = μ)

You can also state H0: No difference between the sample
mean and the population parameter
(In other words, the sample mean of 3.0 is really the same as
the population mean of 2.7; the difference is not real but
is due to chance.)
The sample of 117 comes from a population that has a
GPA of 2.7.
The difference between 2.7 and 3.0 is trivial and caused by
random chance.

Step 2 (cont.): State the Alternate Hypothesis

H1: μ ≠ 2.7 (or, H1: X̄ ≠ μ)
Or H1: There is a difference between the sample mean and
the population parameter
The sample of 117 comes from a population that does not have
a GPA of 2.7. In reality, it comes from a different
population.
The difference between 2.7 and 3.0 reflects an actual
difference between education majors and other students.
Note that we are testing whether the sample comes from
a different population or from the same population as
the general student body.

Step 3: Select Sampling Distribution and Establish the Critical Region
Sampling Distribution = Z
Alpha (α) = .05
α is the indicator of "rare" events.
Any difference with a probability less than α
is rare and will cause us to reject the H0.

Step 3 (cont.): Select Sampling Distribution and Establish the Critical Region
Critical Region begins at Z = ±1.96
This is the critical Z score associated
with α = .05, two-tailed test.
If the obtained Z score falls in the Critical
Region, or "the region of rejection," then
we would reject the H0.

Step 4: Use Formula to Compute the Test Statistic (Z for large samples (≥ 100))

$$ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{N}} $$

When the Population σ is not known,
use the following formula:

$$ Z = \frac{\bar{X} - \mu}{s / \sqrt{n - 1}} $$

Test the Hypotheses

$$ Z = \frac{3.0 - 2.7}{.7 / \sqrt{117 - 1}} = 4.62 $$

We can substitute the sample standard deviation
(S) for the pop. s.d. (σ) and correct for bias by
substituting N-1 in the denominator.
Substituting the values into the formula, we
calculate a Z score of 4.62.
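
The substitution can be checked numerically. The sketch below follows the slide's formula, with the sample standard deviation and n - 1 under the square root; the function name is an illustrative choice.

```python
from math import sqrt


def z_from_sample(sample_mean, pop_mean, sample_sd, n):
    """Z statistic when the population s.d. is unknown (slide's n - 1 form)."""
    return (sample_mean - pop_mean) / (sample_sd / sqrt(n - 1))


# Values from the GPA example: sample mean 3.0, population mean 2.7,
# sample s.d. 0.7, sample size 117.
z = z_from_sample(3.0, 2.7, 0.7, 117)
print(f"Z = {z:.2f}")
print("in critical region" if abs(z) > 1.96 else "not in critical region")
```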

Step 5: Make a Decision and Interpret Results

The obtained Z score fell in the Critical Region, so we reject
the H0.
If the H0 were true, a sample outcome of 3.00 would be
unlikely.
Therefore, the H0 is false and must be rejected.
Education majors have a GPA that is significantly different
from the general student body (Z = 4.62, α = .05).*
*Note: Always report significant statistics.

Looking at the curve:
[Figure: Z curve; Area C = Critical Region when α = .05]

Rule of Thumb:
If the test statistic is in the Critical Region
(α = .05, beyond ±1.96):
Reject the H0. The difference is significant.
If the test statistic is not in the Critical Region
(at α = .05, between +1.96 and -1.96):
Fail to reject the H0. The difference is not
significant.

Using the Student's t Distribution for Small Samples (One Sample T-Test)
When the sample size is small
(approximately < 100), the Student's t
distribution should be used.
The test statistic is known as "t".
The curve of the t distribution is flatter than
that of the Z distribution, but as the sample
size increases, the t-curve starts to
resemble the Z-curve.

Degrees of Freedom
The curve of the t distribution varies with
sample size (the smaller the size, the flatter
the curve).
In using the t-table, we use "degrees of
freedom" based on the sample size.
For a one-sample test, df = n - 1.
When looking at the table, find the t-value for
the appropriate df = n - 1. This will be the
cutoff point for your critical region.
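
The table lookup can be sketched with a short excerpt of a standard t-table (two-tailed, α = .05). The dictionary below holds only a few rows, so this is an illustration rather than a full table; the names are hypothetical.

```python
# Excerpt of a standard t-table: two-tailed critical values at alpha = .05.
T_CRITICAL_05_TWO_TAILED = {
    5: 2.571,
    10: 2.228,
    15: 2.131,
    20: 2.086,
    25: 2.060,
    30: 2.042,
    60: 2.000,
    120: 1.980,
}
Z_CRITICAL_05_TWO_TAILED = 1.960  # the value the t column approaches


def critical_t(n):
    """Look up the critical t for a one-sample test with df = n - 1."""
    df = n - 1
    if df not in T_CRITICAL_05_TWO_TAILED:
        raise KeyError(f"df = {df} is not in this table excerpt")
    return T_CRITICAL_05_TWO_TAILED[df]


for df, t_crit in T_CRITICAL_05_TWO_TAILED.items():
    print(f"df = {df:>3}: critical t = {t_crit:.3f}")
print(f"Z (large samples): {Z_CRITICAL_05_TWO_TAILED:.3f}")
```

The critical t shrinks toward 1.96 as df grows, which is the numerical counterpart of the t-curve coming to resemble the Z-curve.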

Formula for one sample t-test:

$$ t = \frac{\bar{X} - \mu}{S / \sqrt{n - 1}} $$

Example
A random sample of 26 sociology
graduates had a mean score of 458 on the GRE
advanced sociology test with a standard
deviation of 20. Is this significantly
different from the population average
(μ = 440)?

Solution (using five step model)


Step 1: Make Assumptions and Meet Test
Requirements:
1. Random sample
2. Level of measurement is interval-ratio
3. The sample is small (<100)

Solution (cont.)
Step 2: State the null and alternate
hypotheses.
H0: μ = 440 (or H0: X̄ = μ)
H1: μ ≠ 440

Solution (cont.)

Step 3: Select Sampling Distribution and Establish the Critical Region

1. Small sample, I-R level, so use t distribution.
2. Alpha (α) = .05
3. Degrees of Freedom = n-1 = 26-1 = 25
4. Critical t = ±2.060

Solution (cont.)
Step 4: Use Formula to Compute the Test Statistic

$$ t = \frac{\bar{X} - \mu}{S / \sqrt{n - 1}} = \frac{458 - 440}{20 / \sqrt{26 - 1}} = 4.5 $$
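
The same computation in code, as a sketch that follows the slide's one-sample t formula. The critical value 2.060 is the table value for df = 25 at α = .05 (two-tailed) from Step 3; the function name is an illustrative choice.

```python
from math import sqrt


def one_sample_t(sample_mean, pop_mean, sample_sd, n):
    """One-sample t statistic in the slide's form: S / sqrt(n - 1)."""
    return (sample_mean - pop_mean) / (sample_sd / sqrt(n - 1))


CRITICAL_T = 2.060  # df = 25, alpha = .05, two-tailed (from Step 3)

t = one_sample_t(458, 440, 20, 26)
decision = "Reject H0" if abs(t) > CRITICAL_T else "Fail to reject H0"
print(f"t = {t}, df = {26 - 1}: {decision}")
```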

Looking at the curve for the t distribution
[Figure: t curve with the critical region for Alpha (α) = .05]

Step 5: Make a Decision and Interpret Results

The obtained t score fell in the Critical Region, so
we reject the H0 (t (obtained) > t (critical)).
If the H0 were true, a sample outcome of 458
would be unlikely.
Therefore, the H0 is false and must be rejected.
Sociology graduates have a GRE score that is
significantly different from the general student body
(t = 4.5, df = 25, α = .05).

Testing Sample Proportions:


When your variable is at the nominal (or
ordinal) level the one sample z-test for
proportions should be used.
If the data are in % format, convert to a
proportion first.
The method is the same as the one
sample Z-test for means (see above)

Formula for Proportions:

$$ Z = \frac{P_s - P_u}{\sqrt{P_u (1 - P_u) / n}} $$

Note: Ps is the sample proportion
Pu is the population proportion

Example
In a recent statewide (or provincial)
election, 55% of voters rejected lotteries. A
random sample of 150 urban (or rural)
precincts showed that 49%
of voters rejected lotteries. Is the
difference significant?
Use the formula for proportions and the 5 step
method to solve.

Solution:
Step 1:
Random sample
L.O.M. is nominal
The sample is large
Step 2:
H0: Pu = .55 (convert % to proportion)
(Note you can also say H0: Ps = Pu)
H1: Pu ≠ .55
Step 3:
The sample is large, use Z distribution
Alpha (α) = .05
Critical Z = ±1.96

Solution (cont.)
Step 4

$$ Z = \frac{P_s - P_u}{\sqrt{P_u (1 - P_u) / n}} = \frac{.49 - .55}{\sqrt{.55 (1 - .55) / 150}} = -1.48 $$

Step 5
|Z (obtained)| = 1.48 < Z (critical) = 1.96
Fail to reject H0. There is no significant difference
between the state population and the precincts.
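
The proportion test can be checked the same way. This is a minimal sketch of the one-sample z-test for proportions using the values from the example; the function name is an illustrative choice.

```python
from math import sqrt


def z_for_proportion(p_sample, p_pop, n):
    """One-sample Z statistic for a proportion (Ps vs. Pu)."""
    return (p_sample - p_pop) / sqrt(p_pop * (1 - p_pop) / n)


CRITICAL_Z = 1.96  # alpha = .05, two-tailed

z = z_for_proportion(0.49, 0.55, 150)
decision = "Reject H0" if abs(z) > CRITICAL_Z else "Fail to reject H0"
print(f"Z = {z:.2f}: {decision}")
```

The statistic is negative because the sample proportion is below the population proportion; for a two-tailed test only its absolute value is compared with the critical Z.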

Main Considerations in Hypothesis Testing:
Sample size
Use Z for large samples, t for small (<100)

There are two other choices to be made:
One-tailed or two-tailed test
"Is there a difference?" = 2-tailed test
"Is the difference less than or greater than?" = 1-tailed test

Alpha (α) level
.05, .01, or .001? (α = .05 is most common)

Two-tailed vs. One-tailed Tests


In a two-tailed test, the direction of the
difference is not predicted.
A two-tailed test splits the critical region
equally on both sides of the curve.
In a one-tailed test, the researcher predicts
the direction (i.e. greater or less than) of the
difference.
All of the critical region is placed on the side
of the curve in the direction of the prediction.

The Curve for Two- vs. One-tailed Tests at α = .05:

Two-tailed test:
"is there a significant difference?"
One-tailed tests:
"is the sample mean greater than μ or Pu?"
"is the sample mean less than μ or Pu?"

Type I and Type II Errors


Type I, or alpha error:
Rejecting a true null hypothesis

Type II, or beta error:


Failing to reject a false null hypothesis

Please forward your query to: sjain@amity.edu
