
PAN African e Network Project

DBM
Quantitative Techniques in Management

Semester - 1
Session - 6

Dr. Sarika Jain

HYPOTHESIS TESTING

Example
A random sample of 26 sociology
graduates had a mean score of 458 on the GRE
advanced sociology test, with a standard
deviation of 20. Is this significantly
different from the population average
(μ = 440)?

Solution (using five step model)


Step 1: Make Assumptions and Meet Test
Requirements:
1. Random sample
2. Level of measurement is interval-ratio
3. The sample is small (<100)

Solution (cont.)
Step 2: State the null and alternate
hypotheses.
H0: μ = 440 (or H0: X̄ = μ)
H1: μ ≠ 440

Solution (cont.)

Step 3: Select Sampling Distribution and Establish the Critical Region

1. Small sample, I-R level, so use t distribution.
2. Alpha (α) = .05
3. Degrees of Freedom = n-1 = 26-1 = 25
4. Critical t = ±2.060

Solution (cont.)
Step 4: Use Formula to Compute the Test Statistic

\[ t = \frac{\bar{X} - \mu}{S / \sqrt{n - 1}} = \frac{458 - 440}{20 / \sqrt{26 - 1}} = 4.5 \]

Looking at the curve for the t distribution
Alpha (α) = .05

Step 5: Make a Decision and Interpret Results

The obtained t score fell in the Critical Region, so
we reject the H0 (t (obtained) > t (critical)).
If the H0 were true, a sample mean of 458
would be unlikely.
Therefore, we reject the H0 at the .05 level.
Sociology graduates have a GRE score that is
significantly different from the general student body
(t = 4.5, df = 25, α = .05).
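
The five steps above can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `one_sample_t` is hypothetical, the formula follows the slide's version with n - 1 under the square root, and the critical value 2.060 is copied from the slide rather than computed.

```python
import math


def one_sample_t(sample_mean, pop_mean, sample_sd, n):
    """t statistic as on the slide: (mean - mu) / (S / sqrt(n - 1))."""
    standard_error = sample_sd / math.sqrt(n - 1)
    return (sample_mean - pop_mean) / standard_error


# Values from the GRE example.
t_obtained = one_sample_t(sample_mean=458, pop_mean=440, sample_sd=20, n=26)

# Critical t for a two-tailed test, alpha = .05, df = 25 (taken from the slide).
t_critical = 2.060

# Reject H0 when the obtained t falls in either tail of the critical region.
reject_h0 = abs(t_obtained) > t_critical

print(f"t = {t_obtained:.2f}, reject H0: {reject_h0}")
```

Comparing the absolute value of t with the critical value covers both tails of the critical region at once.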

Chi-Square (χ²) and Frequency Data

Chi-square is used when the data that we analyze consist of frequencies; that
is, the number of individuals falling into categories. In
other words, the variables are measured on a nominal
scale.
The test statistic for frequency data is Pearson Chi-Square. The magnitude of Pearson Chi-Square reflects
the amount of discrepancy between observed
frequencies and expected frequencies.

Steps in Test of Hypothesis

1. Determine the appropriate test
2. Establish the level of significance: α
3. Formulate the statistical hypothesis
4. Calculate the test statistic
5. Determine the degree of freedom
6. Compare computed test statistic against a
tabled/critical value

1. Determine Appropriate Test


Chi Square is used when both variables are
measured on a nominal scale.
It can be applied to interval or ratio data that
have been categorized into a small number of
groups.
It assumes that the observations are randomly
sampled from the population.
All observations are independent (an individual
can appear only once in a table and there are
no overlapping categories).
It does not make any assumptions about the
shape of the distribution nor about the
homogeneity of variances.

2. Establish Level of
Significance

α is a predetermined value
The convention
α = .05
α = .01
α = .001

3. Determine The Hypothesis:
Whether There is an Association or Not
Ho : The two variables are independent
Ha : The two variables are associated

4. Calculating Test Statistics


Contrasts observed frequencies in each cell of a
contingency table with expected frequencies.
The expected frequencies represent the number
of cases that would be found in each cell if the
null hypothesis were true ( i.e. the nominal
variables are unrelated).
The expected frequency of a cell, when the two variables are unrelated, is the
product of the row and column frequencies divided
by the number of cases.
Fe = (Fr × Fc) / N

4. Calculating Test Statistics

\[ \chi^2 = \sum \frac{(F_o - F_e)^2}{F_e} \]

Fo: observed frequencies
Fe: expected frequency

5. Determine Degrees of
Freedom
df = (R-1)(C-1)

R: number of levels in row variable
C: number of levels in column variable

6. Compare computed test statistic against a tabled/critical value

The computed value of the Pearson chi-square statistic is compared with the critical
value to determine if the computed value is
improbable
The critical tabled values are based on
sampling distributions of the Pearson chi-square statistic
If calculated χ² is greater than χ² table value,
reject Ho

Example
Suppose a researcher is interested in
voting preferences on gun control issues.
A questionnaire was developed and sent
to a random sample of 90 voters.
The researcher also collects information
about the political party membership of the
sample of 90 respondents.

Bivariate Frequency Table or Contingency Table

              Favor    Neutral    Oppose    f row
Democrat      10       10         30        50
Republican    15       15         10        40
f column      25       25         40        n = 90

The cell entries are the observed frequencies. The f row column holds the
row frequencies and the f column row holds the column frequencies.

1. Determine Appropriate Test


1. Party Membership (2 levels), nominal
2. Voting Preference (3 levels), nominal

2. Establish Level of
Significance
Alpha of .05

3. Determine The Hypothesis


Ho : There is no difference between Democrats and Republicans
in their opinion on the gun control issue.
Ha : There is an association between
responses to the gun control survey and
the party membership in the population.

4. Calculating Test Statistics

              Favor        Neutral      Oppose       f row
Democrat      fo = 10      fo = 10      fo = 30      50
              fe = 13.9    fe = 13.9    fe = 22.2
Republican    fo = 15      fo = 15      fo = 10      40
              fe = 11.1    fe = 11.1    fe = 17.8
f column      25           25           40           n = 90

Each expected frequency is the row frequency times the column frequency divided by n:
fe (Democrat, Favor) = 50*25/90 = 13.9
fe (Republican, Favor) = 40*25/90 = 11.1

4. Calculating Test Statistics

\[ \chi^2 = \frac{(10 - 13.89)^2}{13.89} + \frac{(10 - 13.89)^2}{13.89} + \frac{(30 - 22.22)^2}{22.22} + \frac{(15 - 11.11)^2}{11.11} + \frac{(15 - 11.11)^2}{11.11} + \frac{(10 - 17.78)^2}{17.78} = 11.03 \]

5. Determine Degrees of
Freedom
df = (R-1)(C-1) =
(2-1)(3-1) = 2

6. Compare computed test statistic against a tabled/critical value

α = 0.05
df = 2
Critical tabled value = 5.991
Test statistic, 11.03, exceeds critical value
Null hypothesis is rejected
Democrats & Republicans differ
significantly in their opinions on gun
control issues
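
The six steps of the gun control example can be reproduced in plain Python. The sketch below is illustrative only: the function name `chi_square` is hypothetical, and the critical value 5.991 is copied from the slide rather than looked up programmatically.

```python
def chi_square(observed):
    """Pearson chi-square for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    statistic = 0.0
    expected = []
    for r, row in enumerate(observed):
        expected_row = []
        for c, f_o in enumerate(row):
            # Expected frequency: row total * column total / number of cases.
            f_e = row_totals[r] * col_totals[c] / n
            expected_row.append(f_e)
            statistic += (f_o - f_e) ** 2 / f_e
        expected.append(expected_row)

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df, expected


# Rows: Democrat, Republican. Columns: Favor, Neutral, Oppose.
table = [[10, 10, 30],
         [15, 15, 10]]

chi2, df, expected = chi_square(table)
critical_value = 5.991  # alpha = .05, df = 2 (from the slide)
reject_h0 = chi2 > critical_value

print(f"chi-square = {chi2:.3f}, df = {df}, reject H0: {reject_h0}")
```

Working from the unrounded expected frequencies gives 11.025, the value that the slides round to 11.03.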

SPSS Output for Gun Control Example

Chi-Square Tests
                                Value      df    Asymp. Sig. (2-sided)
Pearson Chi-Square              11.025a    2     .004
Likelihood Ratio                11.365     2     .003
Linear-by-Linear Association    8.722      1     .003
N of Valid Cases                90

a. 0 cells (.0%) have expected count less than 5. The
minimum expected count is 11.11.

Additional Information in SPSS Output

Exceptions that might distort χ²
Assumptions
Associations in some but not all categories
Low expected frequency per cell

Extent of association is not same as statistical significance

ANOVA

1. Discuss the general idea of analysis of variance.
2. List the characteristics of the F distribution.
3. Conduct a test of hypothesis to determine
whether the variances of two populations are
equal.
4. Organize data into a one-way and a
two-way ANOVA table.
5. Define the terms treatments and blocks.
6. Conduct a test of hypothesis to determine
whether three or more treatment means are
equal.
7. Develop multiple tests for difference between
each pair of treatment means.

Characteristics of the F-Distribution

There is a family of F-Distributions:
Each member of the family is determined by
two parameters: the numerator degrees of freedom, and the
denominator degrees of freedom

F cannot be negative, and it is a continuous distribution

The F distribution is positively skewed

Its values range from 0 to ∞. As F → ∞, the curve approaches
the X-axis

Test for equal Variances

For the two tailed test, the test
statistic is given by:

\[ F = \frac{s_1^2}{s_2^2} \]

s1² and s2² are the sample variances for
the two samples

The null hypothesis is rejected if the computed value of the test
statistic is greater than the critical value

Question
Colin, a stockbroker at Critical Securities, reported that the mean rate
of return on a sample of 10 internet stocks was 12.6 percent with a
standard deviation of 3.9 percent.

The mean rate of return on a sample of 8 utility stocks was 10.9 percent with a
standard deviation of 3.5 percent.

At the .05 significance level, can Colin conclude that there is
more variation in the internet stocks?

Recall: Hypothesis Testing

Step 1: State the null and alternate hypotheses
Step 2: Select the level of significance
Step 3: Identify the test statistic
Step 4: State the decision rule
Step 5: Compute the value of the test statistic and make a decision:
do not reject H0, or reject H0 and accept H1

Hypothesis Test

Step 1: State the null and alternate hypotheses
\[ H_0: \sigma_I^2 \le \sigma_U^2 \]
\[ H_1: \sigma_I^2 > \sigma_U^2 \]

Step 2: Select the level of significance
α = .05

Step 3: Identify the test statistic
The test statistic follows the F distribution

Step 4: State the decision rule
Reject H0 if F > 3.68. The df are 9 in the
numerator and 7 in the denominator.

Step 5: Compute the test statistic and make a decision
\[ F = \frac{s_1^2}{s_2^2} = \frac{(3.9)^2}{(3.5)^2} = 1.2416 \]

Conclusion: Do not reject the null hypothesis; there is insufficient
evidence to show more variation in the internet stocks.
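
The variance comparison for Colin's stocks can be sketched in Python as follows. The helper name `variance_ratio` is hypothetical, and the critical value 3.68 is the one quoted on the slide for 9 and 7 degrees of freedom.

```python
def variance_ratio(sd_numerator, sd_denominator):
    """F statistic for comparing two variances: s1^2 / s2^2."""
    return sd_numerator ** 2 / sd_denominator ** 2


# Internet stocks: n = 10, s = 3.9. Utility stocks: n = 8, s = 3.5.
f_computed = variance_ratio(3.9, 3.5)
df_numerator, df_denominator = 10 - 1, 8 - 1

# Critical F at alpha = .05 with (9, 7) df, taken from the slide.
f_critical = 3.68
reject_h0 = f_computed > f_critical

print(f"F = {f_computed:.4f}, df = ({df_numerator}, {df_denominator}), "
      f"reject H0: {reject_h0}")
```

Because the alternate hypothesis is that the internet stocks vary more, their variance goes in the numerator.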

Underlying assumptions for ANOVA

The F distribution is also used for testing whether
two or more sample means came from the
same or equal populations
This technique is called
analysis of variance or ANOVA

ANOVA requires the following conditions:
the sampled populations follow the normal distribution
the populations have equal standard deviations
the samples are randomly selected and are independent

ANOVA Procedure
The Null Hypothesis (H0) is that the population means are the same

The Alternative Hypothesis (H1) is that at least one of the
means is different

The Test Statistic follows the F distribution

The Decision rule is to reject H0 if F(computed) is greater than
F(table) with numerator and denominator df

Terminology
Total Variation
is the sum of the squared differences
between each observation and
the overall mean
Treatment Variation
is the sum of the squared differences
between each treatment mean and
the overall mean
Random Variation
is the sum of the squared
differences between each
observation and its treatment
mean
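
The three terms fit together as a single identity. The notation below is an assumption introduced for illustration and does not appear on the slides: X_{ci} is observation i in treatment c, \bar{X}_c is the mean of treatment c, \bar{X} is the overall mean, and n_c is the number of observations in treatment c. Note that each squared treatment deviation is counted once per observation in that treatment, which is where the weight n_c comes from.

```latex
\underbrace{\sum_{c}\sum_{i} \left( X_{ci} - \bar{X} \right)^2}_{\text{Total Variation}}
=
\underbrace{\sum_{c} n_c \left( \bar{X}_c - \bar{X} \right)^2}_{\text{Treatment Variation}}
+
\underbrace{\sum_{c}\sum_{i} \left( X_{ci} - \bar{X}_c \right)^2}_{\text{Random Variation}}
```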

Analysis of variance Procedure

If there are k populations being sampled, the numerator
degrees of freedom is k - 1

If there are a total of n observations, the denominator degrees
of freedom is n - k
The test statistic is computed by:

\[ F = \frac{SST / (k - 1)}{SSE / (n - k)} \]

Analysis of variance Procedure

SS Total is the total sum of squares

\[ SS\,Total = \sum X^2 - \frac{(\sum X)^2}{n} \]

Analysis of variance Procedure

SST is the treatment sum of squares

\[ SST = \sum \frac{T_c^2}{n_c} - \frac{(\sum X)^2}{n} \]

Tc is the column total, nc is the
number of observations in each
column, ΣX the sum of all the
observations, and n the total number
of observations

Analysis of variance Procedure

SSE is the sum of squares error

\[ SSE = SS\,Total - SST \]

Analysis of variance Procedure


Example
Easy Meals Restaurants specialize in meals for senior
citizens.
Katy Smith, President, recently developed a new
meat loaf dinner. Before making it a part of
the regular menu she decides to test it in several of her
restaurants.
She would like to know if there is a
difference in the mean number of dinners
sold per day at the Aynor, Loris, and Lander
restaurants.

Analysis of variance Procedure

        Aynor    Loris    Lander
        13       10       18
        12       12       16
        14       13       17
        12       11       17
                          17
Tc      51       46       85
nc      4        4        5

Analysis of variance Procedure

SS Total (is the total sum of squares)

\[ SS\,Total = \sum X^2 - \frac{(\sum X)^2}{n} = 2634 - \frac{(182)^2}{13} = 86 \]

Analysis of variance Procedure

SST is the treatment sum of squares

\[ SST = \sum \frac{T_c^2}{n_c} - \frac{(\sum X)^2}{n} = \frac{51^2}{4} + \frac{46^2}{4} + \frac{85^2}{5} - \frac{(182)^2}{13} = 76.25 \]

Analysis of variance Procedure

SSE is the sum of squares error

\[ SSE = SS\,Total - SST = 86 - 76.25 = 9.75 \]

Hypothesis Test

Step 1: State the null and alternate hypotheses
H0: μ1 = μ2 = μ3
H1: Treatment means are not all equal

Step 2: Select the level of significance
α = .05

Step 3: Identify the test statistic
The test statistic follows the F distribution

Step 4: State the decision rule
Reject H0 if F > 4.10. The df are 2 in the
numerator and 10 in the denominator.

Step 5: Compute the test statistic and make a decision
\[ F = \frac{SST / (k - 1)}{SSE / (n - k)} = \frac{76.25 / 2}{9.75 / 10} = 39.10 \]

Analysis of variance Procedure

Conclusion:

The decision is to reject the null

hypothesis
The treatment means are not the same
The mean number of meals sold at the
three locations is not the same
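
The one-way ANOVA for the Easy Meals data can be recomputed from the raw observations with the same shortcut formulas. This is a minimal sketch under stated assumptions: the function name `one_way_anova` and the dictionary keys are hypothetical, and the critical value 4.10 is copied from the slide.

```python
def one_way_anova(groups):
    """Sums of squares and F for a one-way ANOVA, using the slide formulas."""
    values = [x for group in groups for x in group]
    n = len(values)   # total number of observations
    k = len(groups)   # number of treatments

    correction = sum(values) ** 2 / n                       # (sum X)^2 / n
    ss_total = sum(x * x for x in values) - correction
    sst = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    sse = ss_total - sst

    f = (sst / (k - 1)) / (sse / (n - k))
    return {"ss_total": ss_total, "sst": sst, "sse": sse,
            "df": (k - 1, n - k), "f": f}


aynor = [13, 12, 14, 12]
loris = [10, 12, 13, 11]
lander = [18, 16, 17, 17, 17]

result = one_way_anova([aynor, loris, lander])
f_critical = 4.10  # alpha = .05 with (2, 10) df, from the slide
reject_h0 = result["f"] > f_critical

print(result, reject_h0)
```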

ANOVA Table
from the Minitab system

Analysis of Variance
Source    DF    SS        MS        F        P
Factor    2     76.250    38.125    39.10    0.000
Error     10    9.750     0.975
Total     12    86.000

Individual 95% CIs For Mean Based on Pooled St.Dev
Level     N    Mean      St.Dev
Aynor     4    12.750    0.957
Loris     4    11.500    1.291
Lander    5    17.000    0.707

Pooled St.Dev = 0.987
[Plot: individual 95% confidence interval for each mean, axis marked at 12.5, 15.0, 17.5]

Inferences About Treatment Means

When we reject the null hypothesis that
the means are equal,
we may want to know
which treatment means differ.
One of the simplest procedures is
through the use of confidence
intervals.

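Confidence interval for the difference between two means

\[ (\bar{X}_1 - \bar{X}_2) \pm t \sqrt{MSE \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} \]

MSE = SSE/(n-k), and t has n-k degrees of freedom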

Confidence interval for the difference between two means

Example
Develop a 95% confidence interval for the
difference in the mean number of meat loaf
dinners sold in Lander and Aynor.
Can Katy conclude that there is a
difference between the two restaurants?

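Confidence interval for the difference between two means

\[ (\bar{X}_1 - \bar{X}_2) \pm t \sqrt{MSE \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} \]

\[ (17 - 12.75) \pm 2.228 \sqrt{0.975 \left( \frac{1}{5} + \frac{1}{4} \right)} = 4.25 \pm 1.48 \Rightarrow (2.77, 5.73) \]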

Confidence interval for the


difference between two means

Because zero is not in the interval, we


conclude that this pair of means differs
The mean number of meals sold in
Aynor is different from Lander
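
The interval for Lander versus Aynor can be checked numerically. In this sketch the function name `mean_difference_ci` is hypothetical, MSE = 0.975 comes from the ANOVA table, and t = 2.228 is the tabled value used on the slide for 10 degrees of freedom.

```python
import math


def mean_difference_ci(mean_1, mean_2, n_1, n_2, mse, t_value):
    """Confidence interval for the difference between two treatment means."""
    margin = t_value * math.sqrt(mse * (1 / n_1 + 1 / n_2))
    difference = mean_1 - mean_2
    return difference - margin, difference + margin


# Lander: mean 17.0, n = 5. Aynor: mean 12.75, n = 4.
lower, upper = mean_difference_ci(17.0, 12.75, 5, 4, mse=0.975, t_value=2.228)

# The pair of means differs when zero lies outside the interval.
means_differ = not (lower <= 0 <= upper)

print(f"({lower:.2f}, {upper:.2f}), means differ: {means_differ}")
```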

Two Factor Anova

For the two-factor ANOVA we test whether there is a significant difference
among the treatment means and whether
there is a difference among the block means.

\[ SSB = \sum \frac{B_r^2}{k} - \frac{(\sum X)^2}{n} \]

Let Br be the block totals (for rows)
Let SSB represent the sum of squares for the blocks
Here k is the number of treatments and n the total number of observations

Two Factor Anova


The Bieber Manufacturing Co. operates 24 hours a day, five days a week.
The workers rotate shifts each
week.
Todd Bieber, the owner, is interested in whether there is a difference in
the number of units produced when the employees
work on various shifts.
A sample of five workers is selected and their output recorded on each
shift. At the .05 significance level, can we conclude there is a
difference in the mean production by shift and in the mean production
by employee?

Two Factor Anova

Example
Employee     Day Output    Evening Output    Night Output
McCartney    31            25                35
Neary        33            26                33
Schoen       28            24                30
Thompson     30            29                28
Wagner       28            26                27

Hypothesis Test
Difference between various shifts?

Step 1: State the null and alternate hypotheses
H0: μ1 = μ2 = μ3
H1: Not all means are equal

Step 2: Select the level of significance
α = .05

Step 3: Identify the test statistic
The test statistic follows the F distribution

Step 4: State the decision rule
Reject H0 if F > 4.46.
The df are 2 and 8

Step 5: Compute the test statistic and make a decision
\[ F = \frac{SST / (k - 1)}{SSE / ((k - 1)(b - 1))} \]

Two Factor Anova

Example
Compute the various sum of squares:
SS(total) = 139.73
SST = 62.53
SSB = 33.73
SSE = 43.47
df(block) = 4, df(treatment) = 2, df(error) = 8

Two Factor Anova

Example
Step 5

\[ F = \frac{SST / (k - 1)}{SSE / ((k - 1)(b - 1))} = \frac{62.53 / (3 - 1)}{43.47 / ((3 - 1)(5 - 1))} = 5.754 \]

Since 5.754 > 4.46, H0 is rejected.

Conclusion: There is a difference in the mean number
of units produced on the different shifts.

Hypothesis Test
Difference between various employees?

Step 1: State the null and alternate hypotheses
H0: μ1 = μ2 = μ3 = μ4 = μ5
H1: Not all means are equal

Step 2: Select the level of significance
α = 0.05

Step 3: Identify the test statistic
The test statistic follows the F distribution

Step 4: State the decision rule
Reject H0 if F > 3.84.
The df are 4 and 8

Step 5: Compute the test statistic and make a decision
\[ F = \frac{SSB / (b - 1)}{SSE / ((k - 1)(b - 1))} \]

Two Factor Anova

Step 5

\[ F = \frac{SSB / (b - 1)}{SSE / ((k - 1)(b - 1))} = \frac{33.73 / 4}{43.47 / ((2)(4))} = 1.55 \]

Since 1.55 < 3.84, H0 is not rejected.

Conclusion: There is no significant difference in the mean
number of units produced by the various employees.
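
Both F tests for the Bieber Manufacturing data can be recomputed from the raw table. The sketch below is illustrative: the function name `two_factor_anova` and its dictionary keys are hypothetical, rows are treated as blocks (employees) and columns as treatments (shifts), and the critical values 4.46 and 3.84 are taken from the slides.

```python
def two_factor_anova(table):
    """Two-factor ANOVA without replication: rows are blocks, columns are treatments."""
    b = len(table)       # number of blocks (rows)
    k = len(table[0])    # number of treatments (columns)
    n = b * k

    values = [x for row in table for x in row]
    correction = sum(values) ** 2 / n                            # (sum X)^2 / n

    ss_total = sum(x * x for x in values) - correction
    sst = sum(sum(col) ** 2 for col in zip(*table)) / b - correction  # treatments
    ssb = sum(sum(row) ** 2 for row in table) / k - correction        # blocks
    sse = ss_total - sst - ssb

    mse = sse / ((k - 1) * (b - 1))
    return {"ss_total": ss_total, "sst": sst, "ssb": ssb, "sse": sse,
            "f_treatment": (sst / (k - 1)) / mse,
            "f_block": (ssb / (b - 1)) / mse}


# Columns: Day, Evening, Night. Rows: McCartney, Neary, Schoen, Thompson, Wagner.
output = [[31, 25, 35],
          [33, 26, 33],
          [28, 24, 30],
          [30, 29, 28],
          [28, 26, 27]]

anova = two_factor_anova(output)
shifts_differ = anova["f_treatment"] > 4.46     # critical F, (2, 8) df
employees_differ = anova["f_block"] > 3.84      # critical F, (4, 8) df

print(anova, shifts_differ, employees_differ)
```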

Two Factor Anova
from the Minitab system

Units versus Worker, Shift
Analysis of Variance for Units

Source    DF    SS        MS       F       P
Worker    4     33.73     8.43     1.55    0.276
Shift     2     62.53     31.27    5.75    0.028
Error     8     43.47     5.43
Total     14    139.73

Please forward your query


To: sjain@amity.edu
