Day 10

Session-I

Learning Objectives

Using one-way analysis of variance

To use the F distribution to test hypotheses about two population variances

To learn about randomized block design

To learn the technique of two-way analysis of variance and the concept of interaction

Chapter Overview

[Diagram: topics covered: one-way ANOVA with the Tukey-Kramer test, randomized block design with multiple comparisons, and two-way ANOVA with interaction effects]

Investigator controls one or more independent variables

Called factors (or treatment variables)

Each factor contains two or more levels (or groups or categories/classifications)

Observe effects on the dependent variable

Response to levels of independent variable

Experimental design: the plan used to collect the data

Completely Randomized Design

Experimental units (subjects) are assigned randomly to treatments

Subjects are assumed homogeneous

Only one factor or independent variable

With two or more treatment levels

Analyzed by one-way analysis of variance (ANOVA)

or more groups

Examples: Accident rates for 1st, 2nd, and 3rd shift

Expected mileage for five brands of tires

Assumptions

Populations are normally distributed

Hypotheses of One-Way ANOVA

H0 : µ1 = µ2 = µ3 = … = µc

All population means are equal

i.e., no treatment effect (no variation in means among groups)

H1 : Not all of the population means are the same

At least one population mean is different

i.e., there is a treatment effect

Does not mean that all population means are different (some pairs may be the same)

One-Factor ANOVA

H0 : µ1 = µ2 = µ3 = … = µc

H1 : Not all µj are the same

When the null hypothesis is true (no treatment effect): µ1 = µ2 = µ3

When the null hypothesis is NOT true (a treatment effect is present), at least one mean is different: µ1 = µ2 ≠ µ3 or µ1 ≠ µ2 ≠ µ3

Partitioning the Variation

SST = Total Sum of Squares (Total variation)

SSA = Sum of Squares Among Groups (Among-group variation)

SSW = Sum of Squares Within Groups (Within-group variation)

Total variation: dispersion of the data values across the various factor levels (SST)

Among-group variation: dispersion among the factor sample means (SSA)

Within-group variation: dispersion of the data values within a particular factor level (SSW)

Total variation (SST), d.f. = n – 1

= Variation due to factor (SSA), d.f. = c – 1

+ Variation due to random sampling (SSW), d.f. = n – c

SSA is also referred to as: Sum of Squares Between, Sum of Squares Among, Sum of Squares Explained, Among-Groups Variation

SSW is also referred to as: Sum of Squares Within, Sum of Squares Error, Sum of Squares Unexplained, Within-Group Variation

Total Sum of Squares

SST = SSA + SSW

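$$SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{\bar{X}})^2$$

Where:

c = number of groups (levels or treatments)

$n_j$ = number of observations in group j

$X_{ij}$ = ith observation from group j

$\bar{\bar{X}}$ = grand mean (mean of all data values)

[Figure: total variation of the data values about the grand mean; vertical axis: Response, X]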

Among-Group Variation

SST = SSA + SSW

$$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2$$

Where:

SSA = Sum of squares among groups

c = number of groups

$n_j$ = sample size from group j

$\bar{X}_j$ = sample mean from group j

$\bar{\bar{X}}$ = grand mean (mean of all data values)

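Among-Group Variation

(continued)

SSA measures the variation due to differences among groups

$$MSA = \frac{SSA}{c - 1}$$

Mean Square Among = SSA / degrees of freedom

$$SSA = n_1(\bar{X}_1 - \bar{\bar{X}})^2 + n_2(\bar{X}_2 - \bar{\bar{X}})^2 + \dots + n_c(\bar{X}_c - \bar{\bar{X}})^2$$

[Figure: group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ compared with the grand mean $\bar{\bar{X}}$; vertical axis: Response, X]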

Within-Group Variation

SST = SSA + SSW

$$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$$

Where:

c = number of groups

$n_j$ = sample size from group j

$\bar{X}_j$ = sample mean from group j

$X_{ij}$ = ith observation in group j
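
The identity SST = SSA + SSW follows directly from the definitions of the three sums of squares. A short derivation sketch, writing $\bar{X}_j$ for the sample mean of group j and $\bar{\bar{X}}$ for the grand mean, shows that the cross term vanishes because deviations from a group's own mean sum to zero:

```latex
\begin{aligned}
SST &= \sum_{j=1}^{c}\sum_{i=1}^{n_j} (X_{ij} - \bar{\bar{X}})^2
     = \sum_{j=1}^{c}\sum_{i=1}^{n_j} \big[(X_{ij} - \bar{X}_j) + (\bar{X}_j - \bar{\bar{X}})\big]^2 \\
    &= \underbrace{\sum_{j=1}^{c}\sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2}_{SSW}
     + \underbrace{\sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2}_{SSA}
     + 2\sum_{j=1}^{c} (\bar{X}_j - \bar{\bar{X}})
       \underbrace{\sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)}_{=\,0}
\end{aligned}
```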

Within-Group Variation

(continued)

Summing the variation within each group and then adding over all groups

$$MSW = \frac{SSW}{n - c}$$

Mean Square Within = SSW / degrees of freedom

[Figure: data values compared with their own group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$; vertical axis: Response, X]

Obtaining the Mean Squares

$$MSA = \frac{SSA}{c - 1}$$

$$MSW = \frac{SSW}{n - c}$$

$$MST = \frac{SST}{n - 1}$$

| Source of Variation | SS | df | MS (Variance) | F ratio |
|---|---|---|---|---|
| Among Groups | SSA | c - 1 | MSA = SSA / (c - 1) | F = MSA / MSW |
| Within Groups | SSW | n - c | MSW = SSW / (n - c) | |
| Total | SST = SSA + SSW | n - 1 | | |

c = number of groups

n = sum of the sample sizes from all groups

df = degrees of freedom

One-Way ANOVA

F Test Statistic

H0: µ1= µ2 = … = µc

H1: At least two population means are different

$$F = \frac{MSA}{MSW}$$

MSA is mean squares among groups

MSW is mean squares within groups

Degrees of freedom

df1 = c – 1 (c = number of groups)

df2 = n – c (n = sum of sample sizes from all populations)

F Statistic

The F statistic is the ratio of the among estimate of variance and the within estimate of variance

The ratio must always be positive

df1 = c - 1 will typically be small

df2 = n - c will typically be large

Decision Rule:

Reject H0 if F > FU, otherwise do not reject H0

[Figure: F distribution starting at 0, with the rejection region of area α = .05 to the right of the critical value FU]

One-Way ANOVA

F Test Example

You want to see whether three different golf clubs yield different distances. You randomly select five measurements from trials on an automated driving machine for each club. At the 0.05 significance level, is there a difference in mean distance?

| Club 1 | Club 2 | Club 3 |
|---|---|---|
| 254 | 234 | 200 |
| 263 | 218 | 222 |
| 241 | 235 | 197 |
| 237 | 227 | 206 |
| 251 | 216 | 204 |

Scatter Diagram

[Figure: Distance (190 to 270) plotted against Club (1, 2, 3), with the group means and the grand mean marked]

$\bar{x}_1 = 249.2$, $\bar{x}_2 = 226.0$, $\bar{x}_3 = 205.8$

$\bar{\bar{x}} = 227.0$

One-Way ANOVA Example

Computations

$\bar{X}_1 = 249.2$, $n_1 = 5$

$\bar{X}_2 = 226.0$, $n_2 = 5$

$\bar{X}_3 = 205.8$, $n_3 = 5$

$\bar{\bar{X}} = 227.0$, n = 15, c = 3

SSA = 5 (249.2 – 227)² + 5 (226 – 227)² + 5 (205.8 – 227)² = 4716.4

SSW = (254 – 249.2)² + (263 – 249.2)² + … + (204 – 205.8)² = 1119.6

MSA = 4716.4 / (3 – 1) = 2358.2

MSW = 1119.6 / (15 – 3) = 93.3

F = 2358.2 / 93.3 = 25.275
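
The computations for the golf club example can be reproduced from first principles. The following is a minimal sketch using only the Python standard library; the function name `one_way_anova` and the dictionary keys are illustrative choices, not part of the original material.

```python
# One-way ANOVA from first principles (standard library only).

def one_way_anova(groups):
    """Return the sums of squares, mean squares and F ratio for a list of samples."""
    c = len(groups)                                # number of groups
    n = sum(len(g) for g in groups)                # sum of the sample sizes
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Among-group variation: n_j * (group mean - grand mean)^2 over all groups
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group variation: squared deviations from each group's own mean
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    msa = ssa / (c - 1)
    msw = ssw / (n - c)
    return {"SSA": ssa, "SSW": ssw, "MSA": msa, "MSW": msw,
            "F": msa / msw, "df1": c - 1, "df2": n - c}


clubs = [
    [254, 263, 241, 237, 251],   # Club 1
    [234, 218, 235, 227, 216],   # Club 2
    [200, 222, 197, 206, 204],   # Club 3
]
result = one_way_anova(clubs)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The F ratio still has to be compared with a critical value FU taken from an F table (or from statistical software) for df1 and df2 degrees of freedom.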

Solution

H0: µ1 = µ2 = µ3

H1: µj not all equal

α = 0.05

df1 = 2, df2 = 12

Critical Value: FU = 3.89

Test Statistic: F = MSA / MSW = 2358.2 / 93.3 = 25.275

Decision: Reject H0 at α = 0.05, since F = 25.275 > FU = 3.89

Conclusion: There is evidence that at least one µj differs from the rest

[Figure: F distribution with the rejection region of area α = .05 to the right of FU = 3.89; the test statistic F = 25.275 falls in the rejection region]

One-Way ANOVA

Excel Output

EXCEL: tools | data analysis | ANOVA: single factor

SUMMARY

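| Groups | Count | Sum | Average | Variance |
|---|---|---|---|---|
| Club 1 | 5 | 1246 | 249.2 | 108.2 |
| Club 2 | 5 | 1130 | 226 | 77.5 |
| Club 3 | 5 | 1029 | 205.8 | 94.2 |

ANOVA

| Source of Variation | SS | df | MS | F | P-value | F crit |
|---|---|---|---|---|---|---|
| Between Groups | 4716.4 | 2 | 2358.2 | 25.275 | 4.99E-05 | 3.89 |
| Within Groups | 1119.6 | 12 | 93.3 | | | |
| Total | 5836.0 | 14 | | | | |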

Numerical Problems

Ref: 11-28. Page no. 604

The following data show the number of claims processed per day for a group of four insurance company employees observed for a number of days. Test the hypothesis that the employees’ mean claims per day are all the same. Use the 0.05 level of significance.

Employee 1: 15, 17, 14, 12

Employee 2: 12, 10, 13, 17

Employee 3: 11, 14, 13, 15, 12

Employee 4: 13, 12, 12, 14, 10, 9

One-Way ANOVA Excel Output

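EXCEL: tools | data analysis | ANOVA: single factor

SUMMARY

| Groups | Count | Sum | Average | Variance |
|---|---|---|---|---|
| Employee 1 | 4 | 58 | 14.5 | 4.333333 |
| Employee 2 | 4 | 52 | 13 | 8.666667 |
| Employee 3 | 5 | 65 | 13 | 2.5 |
| Employee 4 | 6 | 70 | 11.67 | 3.466667 |

ANOVA

| Source of Variation | SS | df | MS | F | P-value | F crit |
|---|---|---|---|---|---|---|
| Between groups | 19.45 | 3 | 6.48 | 1.46 | 0.26 | 3.28 |
| Within groups | 66.33 | 15 | 4.42 | | | |
| Total | 85.78 | 18 | | | | |

Since F = 1.46 < F crit = 3.28, do not reject H0. The employees' productivities are not significantly different.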
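
The Excel output for this problem can be cross-checked by hand. The sketch below rebuilds the SUMMARY rows with Python's `statistics` module and obtains SSW from the group variances, SSW = Σ (nj − 1) s²j, which is an equivalent way of writing the within-group sum of squares. The critical value 3.28 is copied from the output above rather than computed, and the variable names are illustrative.

```python
from statistics import mean, variance

claims = {
    "Employee 1": [15, 17, 14, 12],
    "Employee 2": [12, 10, 13, 17],
    "Employee 3": [11, 14, 13, 15, 12],
    "Employee 4": [13, 12, 12, 14, 10, 9],
}

c = len(claims)                                   # number of groups
n = sum(len(v) for v in claims.values())          # total number of observations
grand_mean = sum(sum(v) for v in claims.values()) / n

# SUMMARY rows: count, sum, average, sample variance
for name, v in claims.items():
    print(name, len(v), sum(v), round(mean(v), 2), round(variance(v), 6))

# Between-group and within-group sums of squares
ssa = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in claims.values())
ssw = sum((len(v) - 1) * variance(v) for v in claims.values())

f_stat = (ssa / (c - 1)) / (ssw / (n - c))
F_CRIT = 3.28                                     # taken from the Excel output, not computed here
reject_h0 = f_stat > F_CRIT

print(f"SSA = {ssa:.2f}, SSW = {ssw:.2f}, F = {f_stat:.2f}, reject H0: {reject_h0}")
```

Because the groups have different sizes, the grand mean is the mean of all 19 observations, not the average of the four group means.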

Day 10

Session-II

The Tukey-Kramer Procedure

Tells which population means are significantly different

e.g.: µ1 = µ2 ≠ µ3

Done after rejection of equal means in ANOVA

Allows pair-wise comparisons

Compare absolute mean differences with critical range

$$\text{Critical Range} = Q_U \sqrt{\frac{MSW}{2}\left(\frac{1}{n_j} + \frac{1}{n_{j'}}\right)}$$

where:

$Q_U$ = Value from Studentized Range Distribution with c and n - c degrees of freedom for the desired level of α (see appendix E.9 table)

MSW = Mean Square Within

$n_j$ and $n_{j'}$ = Sample sizes from groups j and j'

The Tukey-Kramer Procedure:

Example

1. Compute absolute mean differences:

$|\bar{x}_1 - \bar{x}_2| = |249.2 - 226.0| = 23.2$

$|\bar{x}_1 - \bar{x}_3| = |249.2 - 205.8| = 43.4$

$|\bar{x}_2 - \bar{x}_3| = |226.0 - 205.8| = 20.2$

2. Find the $Q_U$ value from the Studentized Range Distribution table with c = 3 and (n – c) = (15 – 3) = 12 degrees of freedom for the desired level of α (α = 0.05 used here):

$Q_U = 3.77$

Example

(continued)

3. Compute Critical Range:

$$\text{Critical Range} = Q_U \sqrt{\frac{MSW}{2}\left(\frac{1}{n_j} + \frac{1}{n_{j'}}\right)} = 3.77 \sqrt{\frac{93.3}{2}\left(\frac{1}{5} + \frac{1}{5}\right)} = 16.285$$

4. Compare:

$|\bar{x}_1 - \bar{x}_2| = 23.2$

$|\bar{x}_1 - \bar{x}_3| = 43.4$

$|\bar{x}_2 - \bar{x}_3| = 20.2$

5. All of the absolute mean differences are greater than the critical range. Therefore there is a significant difference between each pair of means at the 5% level of significance.

Thus, with 95% confidence we can conclude that the mean distance for club 1 is greater than for clubs 2 and 3, and that the mean distance for club 2 is greater than for club 3.
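
The pairwise comparison in this example is easy to script. The sketch below takes QU = 3.77 and MSW = 93.3 as given in the example (QU comes from the Studentized Range table and is not computed here); the helper name `critical_range` is an illustrative choice.

```python
from itertools import combinations
from math import sqrt

Q_U = 3.77      # Studentized range value from the example (c = 3, n - c = 12, alpha = 0.05)
MSW = 93.3      # mean square within from the one-way ANOVA table
means = {"Club 1": 249.2, "Club 2": 226.0, "Club 3": 205.8}
sizes = {"Club 1": 5, "Club 2": 5, "Club 3": 5}


def critical_range(q_u, msw, n_j, n_j_prime):
    """Tukey-Kramer critical range for one pair of groups."""
    return q_u * sqrt((msw / 2) * (1 / n_j + 1 / n_j_prime))


comparisons = {}
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    cr = critical_range(Q_U, MSW, sizes[a], sizes[b])
    comparisons[(a, b)] = (diff, cr, diff > cr)
    print(f"{a} vs {b}: |difference| = {diff:.1f}, critical range = {cr:.3f}, "
          f"significant = {diff > cr}")
```

With equal sample sizes the critical range is the same for every pair; with unequal sizes it has to be recomputed for each pair, which is why the function takes both sample sizes.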

The Randomized Block Design

Like one-way ANOVA, we test for equal population means (for different factor levels, for example)...

...but we want to control for possible variation from a second factor (with two or more levels)

Total variation can now be split into three parts:

SST = SSA + SSBL + SSE

SSA = Among-Group variation

SSBL = Among-Block variation

SSE = Random variation

Sum of Squares for Blocking

$$SSBL = c \sum_{i=1}^{r} (\bar{X}_{i.} - \bar{\bar{X}})^2$$

Where:

c = number of groups

r = number of blocks

$\bar{X}_{i.}$ = mean of all values in block i

$\bar{\bar{X}}$ = grand mean (mean of all data values)

SST and SSA are computed as they were in One-Way ANOVA

SSE = SST – (SSA + SSBL)

Mean Squares

$$MSBL = \text{Mean square blocking} = \frac{SSBL}{r - 1}$$

$$MSA = \text{Mean square among groups} = \frac{SSA}{c - 1}$$

$$MSE = \text{Mean square error} = \frac{SSE}{(r - 1)(c - 1)}$$

| Source of Variation | SS | df | MS | F ratio |
|---|---|---|---|---|
| Among Treatments | SSA | c - 1 | MSA | MSA / MSE |
| Among Blocks | SSBL | r - 1 | MSBL | MSBL / MSE |
| Error | SSE | (r – 1)(c - 1) | MSE | |
| Total | SST | rc - 1 | | |

c = number of populations

r = number of blocks

rc = sum of the sample sizes from all populations

df = degrees of freedom

Blocking Test

H0 : µ1. = µ2. = µ3. = ...

H1 : Not all block means are equal

$$F = \frac{MSBL}{MSE}$$

Blocking test: df1 = r – 1, df2 = (r – 1)(c – 1)

Reject H0 if F > FU

Main Factor Test

H0 : µ.1 = µ.2 = µ.3 = ... = µ.c

H1 : Not all population means are equal

$$F = \frac{MSA}{MSE}$$

Main Factor test: df1 = c – 1, df2 = (r – 1)(c – 1)

Reject H0 if F > FU
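
The two F ratios of the randomized block design can be computed as sketched below. The data set is hypothetical (4 blocks by 3 treatment groups, invented purely for illustration), and the function name and dictionary keys are assumptions of this sketch. SSE is obtained by subtraction from SST = SSA + SSBL + SSE.

```python
def randomized_block_anova(data):
    """data[i][j] = observation for block i (row) under treatment group j (column)."""
    r, c = len(data), len(data[0])                # r blocks, c groups
    grand = sum(map(sum, data)) / (r * c)
    block_means = [sum(row) / c for row in data]
    group_means = [sum(row[j] for row in data) / r for j in range(c)]

    sst = sum((x - grand) ** 2 for row in data for x in row)
    ssa = r * sum((m - grand) ** 2 for m in group_means)    # among-group variation
    ssbl = c * sum((m - grand) ** 2 for m in block_means)   # among-block variation
    sse = sst - ssa - ssbl                                  # random variation

    msa = ssa / (c - 1)
    msbl = ssbl / (r - 1)
    mse = sse / ((r - 1) * (c - 1))
    return {"SST": sst, "SSA": ssa, "SSBL": ssbl, "SSE": sse,
            "F_treatments": msa / mse, "F_blocks": msbl / mse}


# Hypothetical data, not taken from the slides: rows are blocks, columns are treatments.
hypothetical_data = [
    [10, 12, 14],
    [11, 14, 17],
    [9, 10, 14],
    [14, 16, 19],
]
block_result = randomized_block_anova(hypothetical_data)
for name, value in block_result.items():
    print(f"{name}: {value:.3f}")
```

Each F ratio is then compared with its own critical value FU: df1 = c – 1 for the main factor test and df1 = r – 1 for the blocking test, both with df2 = (r – 1)(c – 1).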

The Tukey Procedure

Tells which population means are significantly different

e.g.: µ1 = µ2 ≠ µ3

Done after rejection of equal means in randomized block ANOVA design

Allows pair-wise comparisons

Compare absolute mean differences with critical range

$$\text{Critical Range} = Q_U \sqrt{\frac{MSE}{r}}$$

Compare: Is $|\bar{x}_{.j} - \bar{x}_{.j'}|$ > Critical Range?

If the absolute mean difference is greater than the critical range then there is a significant difference between that pair of means at the chosen level of significance.

Pairs to compare: $|\bar{x}_{.1} - \bar{x}_{.2}|$, $|\bar{x}_{.1} - \bar{x}_{.3}|$, $|\bar{x}_{.2} - \bar{x}_{.3}|$, etc.

Factorial Design:

Two-Way ANOVA

Examines the effect of

Two factors of interest on the dependent variable

e.g., Percent carbonation and line speed on soft drink bottling process

Interaction between the different levels of these two factors

e.g., Does the effect of one particular carbonation level depend on which level the line speed is set?

Two-Way ANOVA

(continued)

Assumptions

Populations have equal variances

Independent random samples are drawn

Two-Way ANOVA

Sources of Variation

Two Factors of interest: A and B

r = number of levels of factor A

c = number of levels of factor B

n’ = number of replications for each cell

n = total number of observations in all cells (n = rcn’)

Xijk = value of the kth observation of level i of factor A and level j of factor B

Two-Way ANOVA

Sources of Variation (continued)

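SST = SSA + SSB + SSAB + SSE

| Source of Variation | Sum of Squares | Degrees of Freedom |
|---|---|---|
| Factor A Variation | SSA | r – 1 |
| Factor B Variation | SSB | c – 1 |
| Variation due to interaction between A and B | SSAB | (r – 1)(c – 1) |
| Random variation (Error) | SSE | rc(n’ – 1) |
| Total Variation | SST | n - 1 |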

Two Factor ANOVA Equations

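Total Variation:

$$SST = \sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} (X_{ijk} - \bar{\bar{X}})^2$$

Factor A Variation:

$$SSA = c\,n' \sum_{i=1}^{r} (\bar{X}_{i..} - \bar{\bar{X}})^2$$

Factor B Variation:

$$SSB = r\,n' \sum_{j=1}^{c} (\bar{X}_{.j.} - \bar{\bar{X}})^2$$

Interaction Variation:

$$SSAB = n' \sum_{i=1}^{r} \sum_{j=1}^{c} (\bar{X}_{ij.} - \bar{X}_{i..} - \bar{X}_{.j.} + \bar{\bar{X}})^2$$

Sum of Squares Error:

$$SSE = \sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} (X_{ijk} - \bar{X}_{ij.})^2$$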

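Two Factor ANOVA Equations

(continued)

where:

$$\bar{\bar{X}} = \frac{\sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} X_{ijk}}{rcn'} = \text{Grand Mean}$$

$$\bar{X}_{i..} = \frac{\sum_{j=1}^{c} \sum_{k=1}^{n'} X_{ijk}}{cn'} = \text{Mean of } i\text{th level of factor A } (i = 1, 2, \dots, r)$$

$$\bar{X}_{.j.} = \frac{\sum_{i=1}^{r} \sum_{k=1}^{n'} X_{ijk}}{rn'} = \text{Mean of } j\text{th level of factor B } (j = 1, 2, \dots, c)$$

$$\bar{X}_{ij.} = \sum_{k=1}^{n'} \frac{X_{ijk}}{n'} = \text{Mean of cell } ij$$

r = number of levels of factor A

c = number of levels of factor B

n’ = number of replications in each cell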

$$MSA = \text{Mean square factor A} = \frac{SSA}{r - 1}$$

$$MSB = \text{Mean square factor B} = \frac{SSB}{c - 1}$$

$$MSAB = \text{Mean square interaction} = \frac{SSAB}{(r - 1)(c - 1)}$$

$$MSE = \text{Mean square error} = \frac{SSE}{rc(n' - 1)}$$
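
The two-factor sums of squares and mean squares can be computed directly from the cell data. The sketch below assumes a balanced design (the same number of replications n' in every cell) and uses a small hypothetical 2 x 2 data set with two replications per cell, invented for illustration; the function name and keys are likewise assumptions.

```python
def two_way_anova(cells):
    """cells[i][j] = list of n' replicates for level i of factor A and level j of factor B."""
    r, c, n_rep = len(cells), len(cells[0]), len(cells[0][0])
    n = r * c * n_rep
    grand = sum(x for row in cells for cell in row for x in cell) / n
    cell_mean = [[sum(cell) / n_rep for cell in row] for row in cells]
    a_mean = [sum(sum(cell) for cell in row) / (c * n_rep) for row in cells]
    b_mean = [sum(sum(cells[i][j]) for i in range(r)) / (r * n_rep) for j in range(c)]

    ssa = c * n_rep * sum((m - grand) ** 2 for m in a_mean)
    ssb = r * n_rep * sum((m - grand) ** 2 for m in b_mean)
    ssab = n_rep * sum((cell_mean[i][j] - a_mean[i] - b_mean[j] + grand) ** 2
                       for i in range(r) for j in range(c))
    sse = sum((x - cell_mean[i][j]) ** 2
              for i in range(r) for j in range(c) for x in cells[i][j])
    sst = sum((x - grand) ** 2 for row in cells for cell in row for x in cell)

    mse = sse / (r * c * (n_rep - 1))
    return {
        "SST": sst, "SSA": ssa, "SSB": ssb, "SSAB": ssab, "SSE": sse,
        "F_A": (ssa / (r - 1)) / mse,
        "F_B": (ssb / (c - 1)) / mse,
        "F_AB": (ssab / ((r - 1) * (c - 1))) / mse,
    }


# Hypothetical data, not taken from the slides.
hypothetical_cells = [
    [[4, 6], [8, 10]],     # factor A level 1: B level 1, B level 2
    [[6, 8], [14, 16]],    # factor A level 2: B level 1, B level 2
]
tw = two_way_anova(hypothetical_cells)
for name, value in tw.items():
    print(f"{name}: {value:.3f}")
```

In this sketch SST is computed independently rather than by addition, so comparing it with SSA + SSB + SSAB + SSE gives a quick consistency check.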

Two-Way ANOVA:

The F Test Statistic

F Test for Factor A Effect

H0: µ1.. = µ2.. = µ3.. = ...

H1: Not all µi.. are equal

F = MSA / MSE; reject H0 if F > FU

F Test for Factor B Effect

H0: µ.1. = µ.2. = µ.3. = ...

H1: Not all µ.j. are equal

F = MSB / MSE; reject H0 if F > FU

F Test for Interaction Effect

H0: the interaction of A and B is equal to zero

H1: interaction of A and B is not zero

F = MSAB / MSE; reject H0 if F > FU

Two-Way ANOVA

Summary Table

| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Squares | F Statistic |
|---|---|---|---|---|
| Factor A | SSA | r – 1 | MSA = SSA / (r – 1) | MSA / MSE |
| Factor B | SSB | c – 1 | MSB = SSB / (c – 1) | MSB / MSE |
| AB (Interaction) | SSAB | (r – 1)(c – 1) | MSAB = SSAB / [(r – 1)(c – 1)] | MSAB / MSE |
| Error | SSE | rc(n’ – 1) | MSE = SSE / [rc(n’ – 1)] | |
| Total | SST | n – 1 | | |

Features of Two-Way ANOVA F Test

Degrees of freedom always add up

n-1 = rc(n’-1) + (r-1) + (c-1) + (r-1)(c-1)

Total = error + factor A + factor B + interaction

The denominator of the F test is always the same (MSE), but the numerator is different

The sums of squares always add up

SST = SSE + SSA + SSB + SSAB

Total = error + factor A + factor B + interaction

Examples: Interaction vs. No Interaction

[Figure: two panels, "No interaction" and "Interaction is present", each plotting Mean Response with one line for each of Factor B Level 1, Level 2 and Level 3]

Multiple Comparisons:

The Tukey Procedure

Unless there is a significant interaction, you can determine the levels that are significantly different using the Tukey procedure

Consider all absolute mean differences and compare to the calculated critical range

Example: Absolute differences for factor A, assuming three levels:

$|\bar{X}_{1..} - \bar{X}_{2..}|$

$|\bar{X}_{1..} - \bar{X}_{3..}|$

$|\bar{X}_{2..} - \bar{X}_{3..}|$

Multiple Comparisons:

The Tukey Procedure

(continued)

Critical Range for Factor A:

$$\text{Critical Range} = Q_U \sqrt{\frac{MSE}{c\,n'}}$$

(where QU is from Table E.10 with r and rc(n’–1) d.f.)

Critical Range for Factor B:

$$\text{Critical Range} = Q_U \sqrt{\frac{MSE}{r\,n'}}$$

(where QU is from Table E.10 with c and rc(n’–1) d.f.)

Summary

Described one-way analysis of variance

The logic of ANOVA

ANOVA assumptions

F test for difference in c means

The Tukey-Kramer procedure for multiple comparisons

Considered the Randomized Block Design

Treatment and Block Effects

Multiple Comparisons: Tukey Procedure

Described two-way analysis of variance

Examined effects of multiple factors

Examined interaction between factors

