
ANOVA: Analysis of Variance

1-way ANOVA

Anthony J Greene

ANOVA

I. What is Analysis of Variance
   1. The F-ratio
   2. Used for testing hypotheses among more than two means
   3. As with the t-test, effect is measured in the numerator, error variance in the denominator
   4. Partitioning the Variance
II. Different computational concerns for ANOVA
   1. Degrees of Freedom for Numerator and Denominator
   2. No such thing as a negative value
III. Using Table B.4
IV. The Source Table
V. Hypothesis testing

[Figure: three group means, M1, M2, and M3]

ANOVA
Analysis of Variance
Hypothesis testing for more than 2 groups
For only 2 groups, $t^2(n) = F(1, n)$:

$F = \frac{s^2_{effect}}{s^2_{error}}, \qquad t = \frac{s_{effect}}{s_{error}} = \frac{M_1 - M_2}{s_M}$
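The two-group relationship $t^2(n) = F(1, n)$ can be checked numerically. The sketch below is a minimal illustration, not part of the original slides: the function names are made up for this example and the two score lists are hypothetical. It computes an independent-samples t with a pooled variance and a one-way F on the same data.

```python
from statistics import mean

def sum_of_squares(scores):
    """SS = sum of squared deviations from the group's own mean."""
    m = mean(scores)
    return sum((x - m) ** 2 for x in scores)

def t_squared_and_f(group1, group2):
    n1, n2 = len(group1), len(group2)
    df_error = n1 + n2 - 2
    # Pooled variance: the error term shared by the t-test and the ANOVA
    pooled_variance = (sum_of_squares(group1) + sum_of_squares(group2)) / df_error

    # Independent-samples t: (M1 - M2) / s_M
    s_m = (pooled_variance * (1 / n1 + 1 / n2)) ** 0.5
    t = (mean(group1) - mean(group2)) / s_m

    # One-way ANOVA with k = 2 groups, so df_between = 1
    grand_mean = mean(group1 + group2)
    ss_between = (n1 * (mean(group1) - grand_mean) ** 2
                  + n2 * (mean(group2) - grand_mean) ** 2)
    f = (ss_between / 1) / pooled_variance
    return t ** 2, f

# Hypothetical scores, chosen only for illustration
t_sq, f_ratio = t_squared_and_f([0, 1, 3, 1, 0], [4, 3, 6, 3, 4])
print(round(t_sq, 4), round(f_ratio, 4))
```

Both printed values agree, which is the sense in which the t-test is the two-group special case of ANOVA.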

BASIC IDEA

[Figure: Grp 1, Grp 2, Grp 3; M1 = 1, M2 = 4, M3 = 1]

Is the effect variability large compared to the random variability?

$F = \frac{\text{Between-Treatment Variance}}{\text{Within-Treatment Variance}} = \frac{\text{Effect Variability}}{\text{Random Variability}}$

As with the t-test, the numerator expresses the differences on the dependent measure between experimental groups, and the denominator is the error.
If the effect is sufficiently larger than the random error, we reject the null hypothesis.

BASIC IDEA

$F = \frac{\text{Treatment Effect} + \text{Error}}{\text{Error}}$

If the differences accounted for by the manipulation are small (or zero), then $F \approx 1$.
If the effect is twice as large as the error, then F = (2 + 1)/1 = 3, which generally indicates an effect.

Sources of Variance


Why Is It Called Analysis of Variance?

Aren't we interested in means, not variance?
Most statisticians do not know the answer to this question.
If we're interested in differences among means, why do an analysis of variance?
The misconception is that it compares $\sigma_1^2$ to $\sigma_2^2$. It does not.
The comparison is between effect variance (differences in group means) and random variance.

Learning Under Three Temperature Conditions

[Figure: data for the three temperature conditions, with treatment means M1, M2, M3]

$T = \sum X$ is the treatment total, $G = \sum T$ is the grand total.

Computing the Sums of Squares


How Variance is Partitioned

$SS_{Total} = \sum X^2 - \frac{G^2}{N}$

This simply disregards group membership and computes an overall SS. Variability both between and within groups is included.

Keep in mind the general formula for SS:

$SS = \sum X^2 - \frac{(\sum X)^2}{N}$

[Figure: Grp 1, Grp 2, Grp 3; M1 = 1, M2 = 4, M3 = 1]
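As a minimal sketch of the SS-total computation, the code below pools all scores and ignores group membership. It uses the fifteen scores listed for Grp 1, Grp 2, and Grp 3 elsewhere in these slides; the function names are illustrative only. It also checks that the computational formula agrees with the definitional one, $\sum (X - M)^2$.

```python
def ss_computational(scores):
    """SS = sum(X^2) - (sum X)^2 / N."""
    n = len(scores)
    return sum(x * x for x in scores) - sum(scores) ** 2 / n

def ss_definitional(scores):
    """SS = sum of squared deviations from the mean."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores)

groups = [
    [0, 1, 3, 1, 0],  # Grp 1
    [4, 3, 6, 3, 4],  # Grp 2
    [1, 2, 2, 0, 0],  # Grp 3
]

# Disregard group membership: pool every score into one list
all_scores = [x for g in groups for x in g]
ss_total = ss_computational(all_scores)
print(ss_total, ss_definitional(all_scores))
```

Here $\sum X^2 = 106$, G = 30, and N = 15, so SS total is 106 - 900/15 = 46 by either formula.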

How Variance is Partitioned

$SS_{Between} = \sum \frac{T^2}{n} - \frac{G^2}{N}$

Imagine there were no individual differences at all. The SS for all scores would then measure only the fact that there were group differences.

Keep in mind the general formula for SS:

$SS = \sum X^2 - \frac{(\sum X)^2}{N}$

Grp 1    Grp 2    Grp 3
  1        4        1
  1        4        1
  1        4        1
  1        4        1
  1        4        1

T1 = 5   T2 = 20  T3 = 5
M1 = 1   M2 = 4   M3 = 1
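The "no individual differences" idea can be sketched directly: replace every score by its group mean and take the SS of the result. The snippet below is an illustration under that assumption, with hypothetical function names and the group scores listed elsewhere in these slides. It compares that SS with the computational formula based on T and G.

```python
def ss_between(groups):
    """SS_between = sum(T^2 / n) - G^2 / N (T = group total, G = grand total)."""
    grand_total = sum(sum(g) for g in groups)
    n_total = sum(len(g) for g in groups)
    return sum(sum(g) ** 2 / len(g) for g in groups) - grand_total ** 2 / n_total

def ss(scores):
    """Definitional SS: squared deviations from the mean of the list."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores)

groups = [
    [0, 1, 3, 1, 0],  # Grp 1
    [4, 3, 6, 3, 4],  # Grp 2
    [1, 2, 2, 0, 0],  # Grp 3
]

# Remove individual differences: every score becomes its group mean
flattened = [sum(g) / len(g) for g in groups for _ in g]

print(ss_between(groups), ss(flattened))
```

Both numbers come out to 30, so SS between is the variability that would remain if only the group differences existed.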

How Variance is Partitioned

$SS_{Within} = \sum SS_i$

Keep in mind the general formula for SS:

$SS = \sum (X - M)^2$

SS computed within a column removes the mean. Thus summing the SSs for each column computes the overall variability except for the mean differences between groups.

X - M for each score:

Grp 1    Grp 2    Grp 3
0 - 1    4 - 4    1 - 1
1 - 1    3 - 4    2 - 1
3 - 1    6 - 4    2 - 1
1 - 1    3 - 4    0 - 1
0 - 1    4 - 4    0 - 1

M1 = 1   M2 = 4   M3 = 1
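A short sketch of the within-groups computation follows, using the scores from the table on this slide; the helper name is an assumption made for this example. Each column's SS is taken around that column's own mean, and the column SSs are then summed.

```python
def column_ss(scores):
    """SS within one column: squared deviations from that column's mean."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores)

groups = [
    [0, 1, 3, 1, 0],  # Grp 1
    [4, 3, 6, 3, 4],  # Grp 2
    [1, 2, 2, 0, 0],  # Grp 3
]

per_group_ss = [column_ss(g) for g in groups]
ss_within = sum(per_group_ss)
print(per_group_ss, ss_within)
```

Because each deviation is measured from the group's own mean, differences between the group means contribute nothing to this total.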

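How Variance is Partitioned

$SS_{Total} = \sum X^2 - \frac{G^2}{N}$

$SS_{Between} = \sum \frac{T^2}{n} - \frac{G^2}{N}$

$SS_{Within} = \sum SS_i$, where $SS = \sum (X - M)^2$ inside each group

X - M for each score:

Grp 1    Grp 2    Grp 3
0 - 1    4 - 4    1 - 1
1 - 1    3 - 4    2 - 1
3 - 1    6 - 4    2 - 1
1 - 1    3 - 4    0 - 1
0 - 1    4 - 4    0 - 1

M1 = 1   M2 = 4   M3 = 1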

Computing Degrees of Freedom

df between is k - 1, where k is the number of treatment groups (for the prior example k = 3, since there were 3 temperature conditions, so df between = 2).
df within is N - k, where N is the total number of scores across groups (the sum of the ns). Recall that for a t-test with two independent groups, df was 2n - 2: 2n was all the subjects, N, and 2 was the number of groups, k.

Computing Degrees Freedom


How Degrees of Freedom Are Partitioned

$N - 1 = (N - k) + (k - 1)$
$N - 1 = N - k + k - 1$
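The degrees-of-freedom bookkeeping is small enough to sketch in a few lines. The function name and dictionary keys below are illustrative choices, not notation from the slides.

```python
def degrees_of_freedom(group_sizes):
    """Return df between, within, and total from the per-group sample sizes."""
    k = len(group_sizes)          # number of treatment groups
    n_total = sum(group_sizes)    # N, the total number of scores
    return {
        "between": k - 1,
        "within": n_total - k,
        "total": n_total - 1,
    }

# Three groups of five scores each, as in the temperature example
df = degrees_of_freedom([5, 5, 5])
print(df)
```

The between and within entries always add up to the total, which mirrors the identity on this slide.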

Partitioning The Sums of Squares


Computing An F-Ratio

$MS_{between} = \frac{SS_{between}}{df_{between}}$

$MS_{within} = \frac{SS_{within}}{df_{within}}$

$F = \frac{MS_{between}}{MS_{within}}$
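As a hedged sketch, the three formulas can be wrapped in one small function. The function name is hypothetical; the inputs in the usage line are the temperature-example values reported in these slides (SS between = 30, SS within = 16, df = 2 and 12).

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Return (MS_between, MS_within, F) from sums of squares and df."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within

ms_b, ms_w, f = f_ratio(ss_between=30, df_between=2, ss_within=16, df_within=12)
print(ms_b, round(ms_w, 2), round(f, 2))
```

Carrying the unrounded MS within (1.333...) through the division gives F = 11.25; dividing by the rounded 1.33 gives a slightly larger value.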

Consult Table B-4

Take a standard normal distribution, square each value, and it looks like this:

[Figure: distribution of squared standard normal values]

Table B-4


Two different F-curves


ANOVA: Hypothesis Testing


Basic Properties of F-Curves


Property 1: The total area under an F-curve is equal to 1.
Property 2: An F-curve starts at 0 on the horizontal axis
and extends indefinitely to the right, approaching, but
never touching, the horizontal axis as it does so.
Property 3: An F-curve is right skewed.


Finding the F-value having area 0.05 to its right
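Critical values normally come from Table B-4, but one special case has a simple closed form. When the numerator df is exactly 2, the area to the right of a value f is $(1 + 2f/df_{within})^{-df_{within}/2}$. The sketch below relies on that special case only, so it is not a general replacement for the table, and the function names are illustrative.

```python
def f_upper_tail_area(f_value, df_within):
    """P(F > f_value) for an F-curve with df = (2, df_within).

    Valid only when the numerator df is 2.
    """
    return (1 + 2 * f_value / df_within) ** (-df_within / 2)

def f_critical(alpha, df_within):
    """F-value with area alpha to its right, for df = (2, df_within)."""
    return (df_within / 2) * (alpha ** (-2 / df_within) - 1)

crit_05 = f_critical(0.05, 12)
crit_01 = f_critical(0.01, 12)
print(round(crit_05, 2), round(crit_01, 2))
```

For df = (2, 12) and an area of 0.01, the result rounds to the 6.93 quoted in the worked example later in these slides.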

Assumptions for One-Way ANOVA

1. Independent samples: The samples taken from the populations under consideration are independent of one another.
2. Normal populations: For each population, the variable under consideration is normally distributed.
3. Equal standard deviations: The standard deviations of the variable under consideration are the same for all the populations.

Learning Under Three Temperature Conditions

[Figure: scores for the three temperature conditions; M1 = 1, M2 = 4, M3 = 1]

Is the effect variability large compared to the random variability?

Learning Under Three Temperature Conditions

Treatment totals: $T = \sum X$

[Figure: data for the three temperature conditions, with treatment means M1, M2, M3]

Learning Under Three Temperature Conditions

Squared scores, $X^2$:

Grp 1    Grp 2    Grp 3
  0       16        1
  1        9        4
  9       36        4
  1        9        0
  0       16        0

$\sum X^2 = 106$

Learning Under Three Temperature Conditions

Grand total: $G = \sum T$

[Figure: data for the three temperature conditions, with treatment means M1, M2, M3]

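Calculating the F statistic

$SS_{total} = \sum X^2 - \frac{G^2}{N} = 106 - \frac{30^2}{15} = 46$

$SS_{between} = \sum \frac{T^2}{n} - \frac{G^2}{N} = \frac{5^2 + 20^2 + 5^2}{5} - \frac{30^2}{15} = 30$

$SS_{total} = SS_{between} + SS_{within}$, so $SS_{within} = 46 - 30 = 16$

$MS_{between} = \frac{SS_{between}}{df_{between}} = \frac{30}{2} = 15$

$MS_{within} = \frac{SS_{within}}{df_{within}} = \frac{16}{12} = 1.33$

$F = \frac{MS_{between}}{MS_{within}} = \frac{15}{1.333} = 11.25$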

Distribution of the F-Statistic for One-Way ANOVA

Suppose the variable under consideration is normally distributed on each of k populations and that the population standard deviations are equal. Then, for independent samples from the k populations, the variable

$F = \frac{MS_{between}}{MS_{within}} = \frac{MS_{treatment}}{MS_{error}}$

has the F-distribution with df = (k - 1, n - k) if the null hypothesis of equal population means is true. Here n denotes the total number of observations (written N on the other slides).

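ANOVA Source Table for a one-way analysis of variance

Source     SS            df       MS                          F
Between    SS_between    k - 1    SS_between / df_between     MS_between / MS_within
Within     SS_within     N - k    SS_within / df_within
Total      SS_total      N - 1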

The one-way ANOVA test for k population means

Step 1. The null and alternative hypotheses are
   $H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
   $H_a$: Not all the means are equal
Step 2. Decide on the significance level, $\alpha$.
Step 3. Find the critical value of F, with df = (k - 1, N - k), where N is the total number of observations.
Step 4. Obtain the three sums of squares, SST, SSTR, and SSE (SS total, SS between treatments, and SS within, or error).
Step 5. Construct a one-way ANOVA table.
Step 6. If the value of the F-statistic falls in the rejection region, reject H0; otherwise, do not reject H0.
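Steps 4 and 5 can be sketched as one function that returns every entry of the source table. This is a minimal illustration, not the author's procedure: the function names and dictionary keys are assumptions, and the data are the temperature-example scores used earlier in these slides. The critical value for Step 3 still has to come from the F table.

```python
def one_way_anova(groups):
    """Return the source-table entries for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_total = sum(sum(g) for g in groups)
    sum_x_sq = sum(x * x for g in groups for x in g)

    correction = grand_total ** 2 / n_total          # G^2 / N
    ss_total = sum_x_sq - correction
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_within = sum(sum(x * x for x in g) - sum(g) ** 2 / len(g) for g in groups)

    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within, "ss_total": ss_total,
        "df_between": df_between, "df_within": df_within, "df_total": n_total - 1,
        "ms_between": ms_between, "ms_within": ms_within,
        "f": ms_between / ms_within,
    }

def print_source_table(t):
    print(f"{'Source':<10}{'SS':>8}{'df':>5}{'MS':>8}{'F':>8}")
    print(f"{'Between':<10}{t['ss_between']:>8.2f}{t['df_between']:>5}"
          f"{t['ms_between']:>8.2f}{t['f']:>8.2f}")
    print(f"{'Within':<10}{t['ss_within']:>8.2f}{t['df_within']:>5}"
          f"{t['ms_within']:>8.2f}")
    print(f"{'Total':<10}{t['ss_total']:>8.2f}{t['df_total']:>5}")

table = one_way_anova([[0, 1, 3, 1, 0], [4, 3, 6, 3, 4], [1, 2, 2, 0, 0]])
print_source_table(table)
```

Computing SS within directly, rather than by subtraction, gives a built-in check: SS between plus SS within should reproduce SS total.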

Post Hocs

$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
Rejecting H0 means that not all means are equal. Pairwise tests are required to determine which of the means are different.
One problem arises for large k. For example, with k = 7, 21 pairs of means must be compared. Post-hoc tests are designed to reduce the likelihood of groupwise Type I error.
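The count of 21 comparisons, and the reason it is a problem, can be illustrated with a short sketch. The error-rate formula below assumes the pairwise tests are independent, which is a simplification made only for illustration; the function names are hypothetical.

```python
from math import comb

def pairwise_comparisons(k):
    """Number of distinct pairs among k group means."""
    return comb(k, 2)

def familywise_error_rate(alpha, n_tests):
    """Chance of at least one Type I error across n_tests tests.

    Assumes the tests are independent (a simplification).
    """
    return 1 - (1 - alpha) ** n_tests

n_pairs = pairwise_comparisons(7)
rate = familywise_error_rate(0.05, n_pairs)
print(n_pairs, round(rate, 3))
```

Under that assumption, running 21 uncorrected tests at an alpha of 0.05 each would produce at least one false rejection roughly two times out of three.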

Criterion for deciding whether or not to reject the null hypothesis

One-Way ANOVA

A researcher wants to test the effects of St. John's Wort, an over-the-counter herbal antidepressant. The measure is a scale of self-worth. The subjects are clinically depressed patients. Use $\alpha$ = 0.01.

control    low dose    high dose
   0          1           5
   1          3           8
   3          4           6
   0          1           4
   1          1           7

One-Way ANOVA
Compute the treatment totals, T, and the grand total, G

control    low dose    high dose
   0          1           5
   1          3           8
   3          4           6
   0          1           4
   1          1           7

T1 = 5     T2 = 10     T3 = 30     G = 45

One-Way ANOVA
Count n for each treatment, the total N, and k

      control    low dose    high dose
T        5          10          30        G = 45
n        5           5           5        N = 15, k = 3

One-Way ANOVA
Compute the treatment means

      control    low dose    high dose
T        5          10          30        G = 45
n        5           5           5        N = 15, k = 3
M        1           2           6

One-Way ANOVA
Compute the treatment SSs

control    (X - M1)^2           low dose    high dose
   0       (0 - 1)^2 = 1           1           5
   1       (1 - 1)^2 = 0           3           8
   3       (3 - 1)^2 = 4           4           6
   0       (0 - 1)^2 = 1           1           4
   1       (1 - 1)^2 = 0           1           7

      control    low dose    high dose
T        5          10          30        G = 45
n        5           5           5        N = 15, k = 3
M        1           2           6
SS       6           8          10

One-Way ANOVA
Compute all $X^2$s and sum them

      control    low dose    high dose
T        5          10          30        G = 45
n        5           5           5        N = 15, k = 3
M        1           2           6
SS       6           8          10

$\sum X^2 = 11 + 28 + 190 = 229$

One-Way ANOVA
Compute SSTotal

$SS_{Total} = \sum X^2 - \frac{G^2}{N} = 229 - \frac{45^2}{15} = 94$

      control    low dose    high dose
T        5          10          30        G = 45
n        5           5           5        N = 15, k = 3
M        1           2           6
SS       6           8          10

$\sum X^2 = 229$, SSTotal = 94

One-Way ANOVA
Compute SSWithin

$SS_{Within} = \sum SS_i = 6 + 8 + 10 = 24$

      control    low dose    high dose
T        5          10          30        G = 45
n        5           5           5        N = 15, k = 3
M        1           2           6
SS       6           8          10

$\sum X^2 = 229$, SSTotal = 94, SSWithin = 24

One-Way ANOVA
Determine the d.f.s

d.f. Between = k - 1 = 2
d.f. Within = N - k = 12
d.f. Total = N - 1 = 14
Note that (N - k) + (k - 1) = N - 1

      control    low dose    high dose
T        5          10          30        G = 45
n        5           5           5        N = 15, k = 3
M        1           2           6
SS       6           8          10

$\sum X^2 = 229$, SSTotal = 94, SSWithin = 24

One-Way ANOVA
Ready to move it to a source table

      control    low dose    high dose
T        5          10          30        G = 45
n        5           5           5        N = 15, k = 3
M        1           2           6
SS       6           8          10

$\sum X^2 = 229$, SSTotal = 94, SSWithin = 24
d.f. Between = 2, d.f. Within = 12, d.f. Total = 14

One-Way ANOVA
Compute the missing values

Source      SS    df    MS    F
Between     70     2
Within      24    12
Total       94    14

One-Way ANOVA
Compute the missing values

Source      SS    df    MS    F
Between     70     2    35
Within      24    12     2
Total       94    14

One-Way ANOVA
Compute the missing values

Source      SS    df    MS    F
Between     70     2    35    17.5
Within      24    12     2
Total       94    14

One-Way ANOVA
1. Compare your F of 17.5 with the critical value at 2, 12 degrees of freedom, $\alpha$ = 0.01: 6.93
2. Since 17.5 > 6.93, reject H0

Source      SS    df    MS    F
Between     70     2    35    17.5
Within      24    12     2
Total       94    14
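The whole St. John's Wort example, including the final decision, can be reproduced in a few lines. This is a sketch with illustrative names; the critical value 6.93 is copied from the slide rather than computed.

```python
CRITICAL_F = 6.93  # quoted on the slide for df = (2, 12), alpha = 0.01

scores = {
    "control": [0, 1, 3, 0, 1],
    "low dose": [1, 3, 4, 1, 1],
    "high dose": [5, 8, 6, 4, 7],
}

def ss(values):
    """Computational SS: sum(X^2) - (sum X)^2 / n."""
    return sum(x * x for x in values) - sum(values) ** 2 / len(values)

all_scores = [x for group in scores.values() for x in group]
k, n_total = len(scores), len(all_scores)

ss_total = ss(all_scores)                          # 229 - 45^2 / 15
ss_within = sum(ss(g) for g in scores.values())    # 6 + 8 + 10
ss_between = ss_total - ss_within                  # obtained by subtraction

f_statistic = (ss_between / (k - 1)) / (ss_within / (n_total - k))
reject_h0 = f_statistic > CRITICAL_F
print(f_statistic, reject_h0)  # 17.5 True
```

Here SS between is found by subtraction, as on the slides; computing it from the treatment totals, (25 + 100 + 900)/5 - 135 = 70, is a useful cross-check.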

One-Way ANOVA

Students want to know if studying has an impact on a 10-point statistics quiz, so they divided into 3 groups: low studying (0-5 hrs./wk), medium studying (6-15 hrs./wk), and high studying (16+ hrs./wk). At $\alpha$ = 0.01, does the amount of studying impact quiz scores?

low    medium    high
 2        6        9
 4        4       10
 3        5        8
 0        3       10
 2        6        8
 1        6        9

One-Way ANOVA
Compute the treatment totals, T, and the grand total, G

low    medium    high
 2        6        9
 4        4       10
 3        5        8
 0        3       10
 2        6        8
 1        6        9

T1 = 12    T2 = 30    T3 = 54    G = 96

One-Way ANOVA
Count n for each treatment, the total N, and k

      low    medium    high
T      12      30       54       G = 96
n       6       6        6       N = 18, k = 3

One-Way ANOVA
Compute the treatment means

      low    medium    high
T      12      30       54       G = 96
n       6       6        6       N = 18, k = 3
M       2       5        9

One-Way ANOVA
Compute the treatment SSs

low    (X - M1)^2           medium    high
 2     (2 - 2)^2 = 0           6        9
 4     (4 - 2)^2 = 4           4       10
 3     (3 - 2)^2 = 1           5        8
 0     (0 - 2)^2 = 4           3       10
 2     (2 - 2)^2 = 0           6        8
 1     (1 - 2)^2 = 1           6        9

      low    medium    high
T      12      30       54       G = 96
n       6       6        6       N = 18, k = 3
M       2       5        9
SS     10       8        4

One-Way ANOVA
Compute all $X^2$s and sum them

      low    medium    high
T      12      30       54       G = 96
n       6       6        6       N = 18, k = 3
M       2       5        9
SS     10       8        4

$\sum X^2 = 34 + 158 + 490 = 682$

One-Way ANOVA
Compute SSTotal

$SS_{Total} = \sum X^2 - \frac{G^2}{N} = 682 - \frac{96^2}{18} = 170$

      low    medium    high
T      12      30       54       G = 96
n       6       6        6       N = 18, k = 3
M       2       5        9
SS     10       8        4

$\sum X^2 = 682$, SSTotal = 170

One-Way ANOVA
Compute SSWithin

$SS_{Within} = \sum SS_i = 10 + 8 + 4 = 22$

      low    medium    high
T      12      30       54       G = 96
n       6       6        6       N = 18, k = 3
M       2       5        9
SS     10       8        4

$\sum X^2 = 682$, SSTotal = 170, SSWithin = 22

One-Way ANOVA
Determine the d.f.s

d.f. Between = k - 1 = 2
d.f. Within = N - k = 15
d.f. Total = N - 1 = 17
Note that (N - k) + (k - 1) = N - 1

      low    medium    high
T      12      30       54       G = 96
n       6       6        6       N = 18, k = 3
M       2       5        9
SS     10       8        4

$\sum X^2 = 682$, SSTotal = 170, SSWithin = 22

One-Way ANOVA
Fill in the values you have

Source      SS     df    MS    F
Between             2
Within      22     15
Total      170     17

One-Way ANOVA
Compute the missing values

Source      SS     df    MS      F
Between    148      2    74      50.45
Within      22     15     1.47
Total      170     17

One-Way ANOVA
1. Compare your F of 50.45 with the critical value at 2, 15 degrees of freedom, $\alpha$ = 0.01: 6.36
2. Since 50.45 > 6.36, reject H0

Source      SS     df    MS      F
Between    148      2    74      50.45
Within      22     15     1.47
Total      170     17
