12 Anova

ANOVA for comparing means between more than 2 groups
ANOVA
It's like this: If I have three groups to compare:
I could do three pairwise t-tests, but this would increase my type I error. So, instead, I want to look at the pairwise differences all at once. To do this, I can recognize that variance is a statistic that lets me look at more than one difference at a time.
The F-test
Is the difference in the means of the groups more than background noise (=variability within groups)?
Summarizes the mean differences between all groups at once.
F = (Variability between groups) / (Variability within groups)
The within-groups variability is analogous to the pooled variance from a t-test.

Recall, we have already used an F-test to check for equality of variances: if F >> 1 (indicating unequal variances), use the unpooled variance in a t-test.
The F-distribution
The F-distribution is a continuous probability distribution that depends on two parameters n and m (numerator and denominator degrees of freedom, respectively):
A ratio of variances follows an F-distribution:

$$\frac{\sigma^2_{between}}{\sigma^2_{within}} \sim F_{n,m}$$

The F-test tests the hypothesis that two variances are equal.
F will be close to 1 if the sample variances are equal.

$$H_0: \sigma^2_{between} = \sigma^2_{within}$$
$$H_a: \sigma^2_{between} \neq \sigma^2_{within}$$
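As a small illustration of a variance ratio, the sketch below divides one sample variance by another. The two samples are borrowed from the class-time example later in these notes (Tx 3 and Tx 1); pairing them this way is an assumption made only for illustration, and the variable names are not from the slides.

```python
from statistics import variance

# Two small samples (Tx 3 and Tx 1 from the class-time example).
sample_a = [27, 20, 21]
sample_b = [25, 28, 22]

# statistics.variance uses the n - 1 denominator (sample variance).
var_a = variance(sample_a)
var_b = variance(sample_b)

# A ratio near 1 suggests the two samples have similar spread.
f_ratio = var_a / var_b
print(round(f_ratio, 3))
```

Whether a given ratio is large enough to matter still depends on the F-distribution with the matching numerator and denominator degrees of freedom.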
Today
Introduce logic of ANOVA
Review calculations
Work through one example
When only two means are compared, ANOVA gives the same result as the ordinary pooled-variance t-test (F = t²). However, ANOVA also allows comparing multiple means, and thus multiple groups (factor levels), as well as multiple factors simultaneously.
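The two-group equivalence can be checked numerically. The sketch below computes a pooled-variance t statistic and a one-way ANOVA F for the same two samples (Tx 1 and Tx 2 from the class-time example in these notes) and compares F with t². The helper name `ss` is made up for this illustration.

```python
from math import sqrt
from statistics import mean

g1 = [25, 28, 22]
g2 = [30, 29, 30]

def ss(xs):
    """Sum of squared deviations from the sample mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

n1, n2 = len(g1), len(g2)

# Pooled-variance t statistic for two independent samples.
pooled_var = (ss(g1) + ss(g2)) / (n1 + n2 - 2)
t = (mean(g1) - mean(g2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))

# One-way ANOVA F for the same two groups (k = 2, so df_between = 1).
grand = mean(g1 + g2)
ss_between = n1 * (mean(g1) - grand) ** 2 + n2 * (mean(g2) - grand) ** 2
ms_between = ss_between / (2 - 1)
ms_within = pooled_var  # with two groups, MS within is the pooled variance
f = ms_between / ms_within

print(round(t ** 2, 4), round(f, 4))
```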
Illustrate Logic of ANOVA

We want to evaluate the effects of 4 different drugs on participants' level of depression as measured by the Beck Depression Inventory.

Group1     Group2     Group3     Group4
27         36         17         34
31         35         21         36
25         29         22         30
27         33         22         32
24         38         21         32
27         36         15         32
M = 26.8   M = 34.5   M = 19.7   M = 32.7
An ANOVA allows us to quantify how far apart the sample means must be before we are no longer willing to say they are all approximately equal.
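The four group means quoted for the drug example can be reproduced with a few lines of Python. The assignment of scores to groups below is inferred from the flattened slide table by matching the printed means, so treat the layout as an assumption.

```python
from statistics import mean

# Beck Depression Inventory scores per drug group (layout inferred from the slide).
groups = {
    "Group1": [27, 31, 25, 27, 24, 27],
    "Group2": [36, 35, 29, 33, 38, 36],
    "Group3": [17, 21, 22, 22, 21, 15],
    "Group4": [34, 36, 30, 32, 32, 32],
}

# Round to one decimal place, as on the slide.
means = {name: round(mean(scores), 1) for name, scores in groups.items()}
print(means)
```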
Introduction to ANOVA
ANOVA: the ANalysis Of VAriance
(1) Inferential hypothesis-testing procedure
(2) Tremendous advantage over t-tests: used to compare MULTIPLE (two or more) treatments
(3) Provides researchers with much greater flexibility in design and analysis of experiments.
Introduction to ANOVA (cont.)
(4) Multiple forms. We'll look at the simplest: single-factor, independent-measures ANOVA
(a) experimental unit: object on which measurements take place
(b) factor: new name for the independent variable
(c) independent measures: separate sample for each treatment
(d) level: the intensity setting of a factor
(e) treatment: specific combination of factor levels
E.g.: tyre quality study, weekly production volume.
Research Design for ANOVA
FYI: Factors and Levels
Can be multiple factors (IVs) and levels (variations)
Expressed as factors x levels

                 Therapist Experience
Treatment        experienced (+)    inexperienced (-)
treatment A      tx A + exp         tx A + inexp
treatment B      tx B + exp         tx B + inexp
How many factors? How many levels?
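One way to see how factors and levels combine into treatment cells is to enumerate them. This is a minimal sketch; the dictionary keys and level labels are taken from the therapist example, while the variable names are made up for illustration.

```python
from itertools import product

# Each factor maps to its list of levels.
factors = {
    "treatment": ["tx A", "tx B"],
    "experience": ["experienced", "inexperienced"],
}

# Every treatment cell is one combination of a level from each factor.
cells = list(product(*factors.values()))

n_factors = len(factors)
levels_per_factor = [len(levels) for levels in factors.values()]

print(n_factors, levels_per_factor, len(cells))
for cell in cells:
    print(cell)
```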
Example of ANOVA
Four different test times (8am, 12pm, 4pm, and 8pm)

Tx1        Tx2        Tx3        Tx4
25         30         27         22
28         29         20         27
22         30         21         24
M = 25     M = 29.7   M = 22.7   M = 24.3

Does time of test affect scores? ANOVA uses variance to assess differences among the sample means.
Variability Components for ANOVA
The Logic of ANOVA
(1) First, determine total variability for data set

(2) Next, break this variability into two components:
(a) Between-Treatments variance, two sources:
    Treatment Effect: Differences are caused by treatments.
    Chance: Differences simply due to chance.
(b) Within-Treatments variance, one source:
    Chance: Differences simply due to chance.
Forming an F-Ratio
(3) Finally, determine the variance due to the treatments alone by forming an F-ratio.
F = (Variance Between-Treatments) / (Variance Within-Treatments)
Or in terms of sources:
F = (Treatment Effect + Differences due to Chance) / (Differences due to Chance)
If no treatment effect exists, F is expected to be close to 1.00.
If there IS some treatment effect, F is expected to be > 1.00 (but not automatically statistically significant).
The Structure of ANOVA Calculations
New Terms and Symbols
k = number of treatment conditions (levels and factors). For an independent-measures study, k = # of separate samples.
n = number of scores in a treatment condition
N = total number of scores in whole study (N = nk when every condition has n scores)
T = sum of scores for each treatment condition
G = sum of all scores in the study (Grand Total)
Hypothesis Testing with ANOVA (4 steps) STEP 1
STEP 1: State the Hypothesis
H0: μ1 = μ2 = μ3 = ... = μk (k = number of factor levels)
H1: At least one μ is different from the others
Hypothesis Testing with ANOVA STEP 2
STEP 2: Locate the Critical Region
α = .05
Calculate dfbetween = k - 1
Calculate dfwithin = N - k
Calculate dftotal = N - 1
Critical F will be provided for you
dfbetween + dfwithin = dftotal (always!)
Begin to fill in the Source Table (ANOVA Table)
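The degrees-of-freedom bookkeeping in Step 2 can be written as a tiny helper. This is a sketch only; the function name `anova_df` is made up for illustration, and the call uses k = 4 treatments and N = 12 scores as in the class-time example.

```python
def anova_df(k, n_total):
    """Return (df_between, df_within, df_total) for a one-way ANOVA."""
    df_between = k - 1
    df_within = n_total - k
    df_total = n_total - 1
    # The two components always add up to the total.
    assert df_between + df_within == df_total
    return df_between, df_within, df_total

print(anova_df(4, 12))
```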
Hypothesis Testing with ANOVA STEP 2 continued

Basic ANOVA Table
Source    SS          df     MS          F
Between   SSbetween   k-1    MSbetween   F = Fobtained
Within    SSwithin    N-k    MSwithin
Total     SStotal     N-1
Hypothesis Testing with ANOVA STEP 3
STEP 3: Collect Data and Compute Sample Statistics
SSbetween = Σ(T²/n) - (G²/N)
SSwithin = ΣSS inside each treatment = (SS1+SS2+SS3+...+SSk)
SStotal = ΣX² - (G²/N) or SSbetween + SSwithin
MSbetween = SSbetween/dfbetween
MSwithin = SSwithin/dfwithin
F = MSbetween/MSwithin
Fill in source table (ANOVA Table)
*note: SSbetween + SSwithin = SStotal (always!)
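The "always" in the note follows from a short algebraic argument, sketched below. The notation is an assumption introduced here and is not from the slides: x_{ij} is score i in treatment j, \bar{x}_j is the mean of treatment j, \bar{x} is the grand mean, and T_j, n_j, G, N are as defined above.

```latex
\begin{aligned}
SS_{total} &= \sum_{j=1}^{k}\sum_{i=1}^{n_j} (x_{ij}-\bar{x})^2
            = \sum_{j}\sum_{i} \big[(x_{ij}-\bar{x}_j) + (\bar{x}_j-\bar{x})\big]^2 \\
           &= \sum_{j}\sum_{i} (x_{ij}-\bar{x}_j)^2
              + \sum_{j} n_j(\bar{x}_j-\bar{x})^2
              + 2\sum_{j} (\bar{x}_j-\bar{x}) \sum_{i} (x_{ij}-\bar{x}_j) \\
           &= SS_{within} + SS_{between} + 0 ,
\qquad \text{because } \sum_{i} (x_{ij}-\bar{x}_j) = 0 \text{ for every } j. \\[4pt]
SS_{between} &= \sum_{j} n_j(\bar{x}_j-\bar{x})^2
              = \sum_{j} \frac{T_j^2}{n_j} - \frac{G^2}{N}.
\end{aligned}
```

The last line is the computational form used in Step 3; it comes from expanding the square and substituting \bar{x}_j = T_j/n_j and \bar{x} = G/N.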
Hypothesis Testing with ANOVA STEP 4
STEP 4: Make a Decision
Given the critical F-value (Fcritical), which will be provided, decide whether or not to reject the null.
Fobtained < Fcritical --> Fail to reject H0.
Fobtained > Fcritical --> Reject H0.
Use Appendix B page A-29 to find Fcritical
Bold-faced = Fcritical for α = 0.01
Light-faced = Fcritical for α = 0.05
df-numerator = df-between
df-denominator = df-within
Error Term
Error due to chance
Does the treatment effect (difference among means) produce greater variability between groups than that expected by chance?
The denominator in the F-ratio
Example 1
A researcher is interested in whether class time affects exam scores. There are four different class times being examined:
8am, 12pm, 4pm, and 8pm.
Run an ANOVA, α = .05, to see if a significant difference exists between these treatments.
Example 1 DATA
         Tx. 1    Tx. 2    Tx. 3    Tx. 4
         25       30       27       22
         28       29       20       27
         22       30       21       24
mean     25       29.67    22.67    24.33
T        75       89       68       73
SS       18       0.67     28.67    12.67
n        3        3        3        3

ΣX² = 7893   G = 305   N = 12   k = 4
Example 1 Calculations
SSbetween = Σ(T²/n) - (G²/N)
SSbetween = ((75²/3)+(89²/3)+(68²/3)+(73²/3)) - (93,025/12)
SSbetween = ((5625/3)+(7921/3)+(4624/3)+(5329/3)) - 7752.083
SSbetween = (1875+2640.33+1541.33+1776.33) - 7752.083
SSbetween = 7832.99 - 7752.083
SSbetween = 80.91

SSwithin = SS1+SS2+SS3+SS4
SSwithin = 18+.67+28.67+12.67
SSwithin = 60.01

SStotal = ΣX² - (G²/N) OR SSbetween + SSwithin
SStotal = 7893 - 7752.083 OR 80.91 + 60.01
SStotal = 140.92
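The hand calculation can be cross-checked from the definitions, that is, from squared deviations around the treatment means and the grand mean rather than from the computational formulas. This is a minimal sketch using the Example 1 scores; the variable names are illustrative.

```python
from statistics import mean

# Example 1 scores, one list per class time (Tx 1 to Tx 4).
groups = [[25, 28, 22], [30, 29, 30], [27, 20, 21], [22, 27, 24]]
all_scores = [x for g in groups for x in g]
grand_mean = mean(all_scores)

# Between: each group mean's squared distance from the grand mean, weighted by n.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

# Within: squared distance of each score from its own group mean.
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

# Total: squared distance of each score from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

print(round(ss_between, 2), round(ss_within, 2), round(ss_total, 2))
```

Computed without intermediate rounding, SSwithin comes out as 60 and SSbetween as about 80.92; the slide values 60.01 and 80.91 differ only because the intermediate terms were rounded to two decimals.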
Short Cut Method
Calculate the grand total: G = ΣΣ xij
Calculate the correction factor: CF = G²/N = (305)²/12
Find the sum of squares and subtract CF to find SST:
SST = (Σx1² + Σx2² + Σx3² + ... + Σxk²) - CF
SSTr = Σ [(Σxj)² / nj] - CF
SSE = SST - SSTr
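The short-cut method translates almost line for line into code. The sketch below applies it to the Example 1 scores; the names CF, SST, SSTr and SSE follow the slide, everything else is illustrative.

```python
# Example 1 scores, one list per treatment.
groups = [[25, 28, 22], [30, 29, 30], [27, 20, 21], [22, 27, 24]]
scores = [x for g in groups for x in g]

N = len(scores)
G = sum(scores)           # grand total
CF = G ** 2 / N           # correction factor

# Total sum of squares: sum of squared scores minus CF.
SST = sum(x ** 2 for x in scores) - CF

# Treatment sum of squares: squared treatment totals over n_j, minus CF.
SSTr = sum(sum(g) ** 2 / len(g) for g in groups) - CF

# Error sum of squares by subtraction.
SSE = SST - SSTr

print(G, round(CF, 3), round(SST, 2), round(SSTr, 2), round(SSE, 2))
```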
Example 1 ANOVA and Decision

Source    SS       df   MS      F
Between   80.91    3    26.97   Fobtained = 3.596
Within    60.01    8    7.50
TOTAL     140.92   11

Fcritical = 4.07
Fobtained < Fcritical: 3.596 < 4.07, so fail to reject H0
Use Appendix B page A-29 for Fcritical
df numerator = 3 (df for between)
df denominator = 8 (df for within)
Example 2
A researcher is interested in whether a new drug affects activity level of lab animals. There are three different doses being examined: low, medium, high. Run an ANOVA, α = .05, to see if a significant difference exists between these doses.
Null Hypothesis:
Alternative:
Example 2 DATA
Dose 1 (lo)   Dose 2 (med)   Dose 3 (hi)
0             1              5
1             3              8
3             4              6
0             1              4
1             1              7

ΣX² =    G =    N =    k =
mean1 =    T1 =    SS1 =    n1 = 5
mean2 =    T2 =    SS2 =    n2 = 5
mean3 =    T3 =    SS3 =    n3 = 5
Example 2 Calculations
SSbetween = Σ(T²/n) - (G²/N)
SSbetween =
SSbetween =
SSbetween =
SSbetween =
SSbetween =

SSwithin = SS1+SS2+SS3
SSwithin =
SSwithin =

SStotal = ΣX² - (G²/N) OR SSbetween + SSwithin
SStotal =
SStotal =
Example 2 ANOVA and Decision

Source    SS    df    MS    F
Between
Within
TOTAL

Fcritical =
Fobtained =
If Fobtained < Fcritical, fail to reject H0
Use Appendix B pages A-29 to A-31 for Fcritical
df numerator = (df for between)
df denominator = (df for within)
Problem 1
To test the significance of variation in the retail prices of a commodity in three principal cities, Mumbai, Kolkata and Delhi, four shops were chosen at random in each city, and the prices observed (in rupees) were as follows:
Mumbai:   16   8  12  14
Kolkata:  14  10  10   6
Delhi:     4  10   8   8
Do the data indicate that the prices in the three cities are significantly different?
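One possible way to work Problem 1 is a small general-purpose function that builds the one-way ANOVA table from the four steps described earlier. This is a sketch, not a reference solution: the function name `one_way_anova` and the dictionary keys are made up here, and the critical value 4.26 is taken from a standard F table for α = .05 with df = (2, 9), which is an assumption since the slides do not state it.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA source table as a dictionary."""
    scores = [x for g in groups for x in g]
    k, n_total = len(groups), len(scores)
    cf = sum(scores) ** 2 / n_total                       # G^2 / N
    ss_total = sum(x * x for x in scores) - cf
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - cf
    ss_within = ss_total - ss_between
    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within, "ss_total": ss_total,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

prices = {
    "Mumbai": [16, 8, 12, 14],
    "Kolkata": [14, 10, 10, 6],
    "Delhi": [4, 10, 8, 8],
}
table = one_way_anova(list(prices.values()))

F_CRITICAL = 4.26  # assumed: tabled F for alpha = .05, df = (2, 9)
reject_h0 = table["F"] > F_CRITICAL

print({key: round(value, 3) for key, value in table.items()})
print("reject H0:", reject_h0)
```

With these data the obtained F is below the tabled value, so under the stated assumptions the null hypothesis of equal mean prices would not be rejected.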
Beyond one-way ANOVA

Often, you may want to test more than 1 treatment.
ANOVA can accommodate more than 1 treatment or factor, so long as they are independent.
Again, the variation partitions beautifully!
SST = SSB1 + SSB2 + SSW
Two Way ANOVA

ANOVA in which two criteria (or variables) are used to analyse the difference between more than two population means
Block : Source of Variation

Blocking Variable : A variable that a researcher wants to control but is not the treatment variable of interest. (agricultural origin: block of land)
Total Variation (SST)
    Variation between Samples (SSTr or SSC)
    Variation within Samples (SSE), which is split further into:
        Unwanted Variation between Blocks (SSR)
        New Variation due to random Error (SSE)
Two Way ANOVA Table

Source of Variation            Degrees of Freedom   Sum of Squares   Mean Sum of Squares (MSS)   F Ratio
Between Treatments (columns)   c-1                  SStreatments     SStreatments/(c-1)          MSStreatments/MSSerror
Between Blocks (rows)          r-1                  SSblocks         SSblocks/(r-1)              MSSblocks/MSSerror
Residual Error                 N-c-r+1              SSerror          SSerror/(N-c-r+1)
Total                          N-1                  SStotal
Problem 2
The following table gives the number of refrigerators sold by 4 salesmen in 3 months May, June, and July. Is there a significant difference in the sales made by the four salesmen? Is there a significant difference in the sales made during different months?
         Salesmen
Month    A     B     C     D
May      50    40    48    39
June     46    48    50    45
July     39    44    40    39
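A sketch of how the two-way table could be filled in for Problem 2, treating salesmen as columns and months as rows with one observation per cell. The variable names are illustrative; the sums of squares follow the correction-factor approach used for the one-way case, and the degrees of freedom follow the two-way ANOVA table.

```python
# Refrigerators sold: rows are months, columns are salesmen A, B, C, D.
sales = {
    "May": [50, 40, 48, 39],
    "June": [46, 48, 50, 45],
    "July": [39, 44, 40, 39],
}
rows = list(sales.values())
r, c = len(rows), len(rows[0])
N = r * c

G = sum(sum(row) for row in rows)
CF = G ** 2 / N

ss_total = sum(x * x for row in rows for x in row) - CF

# Column (salesman) totals each cover r observations.
col_totals = [sum(row[j] for row in rows) for j in range(c)]
ss_cols = sum(t * t for t in col_totals) / r - CF

# Row (month) totals each cover c observations.
ss_rows = sum(sum(row) ** 2 for row in rows) / c - CF

ss_error = ss_total - ss_cols - ss_rows

df_cols, df_rows, df_error = c - 1, r - 1, N - c - r + 1
ms_error = ss_error / df_error
f_cols = (ss_cols / df_cols) / ms_error   # salesmen
f_rows = (ss_rows / df_rows) / ms_error   # months

print(round(ss_cols, 2), round(ss_rows, 2), round(ss_error, 2))
print(round(f_cols, 3), round(f_rows, 3))
```

Each F ratio is then compared with the tabled critical value for its own numerator degrees of freedom and the error degrees of freedom.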
ANOVA summary
A statistically significant ANOVA (F-test) only tells you that at least two of the groups differ, but not which ones differ.
Determining which groups differ (when it's unclear) requires more sophisticated analyses to correct for the problem of multiple comparisons.
Recall: Multiple comparisons
Question: Why not just do 3 pairwise t-tests?
Answer: because, at an error rate of 5% for each test, you have an overall chance of up to 1 - (.95)^3 ≈ 14% of making a type I error (if all 3 comparisons were independent). If you wanted to compare 6 groups, you'd have to do 6C2 = 15 pairwise t-tests, which would give you a high chance of finding something significant just by chance (if all tests were independent with a type I error rate of 5% each): probability of at least one type I error = 1 - (.95)^15 ≈ 54%.
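The inflation of the type I error rate is easy to tabulate. The sketch below assumes, as the slide does, that all pairwise tests are independent and each uses a 5% error rate; the function name `familywise_error` is made up for illustration.

```python
from math import comb

def familywise_error(k_groups, alpha=0.05):
    """Number of pairwise tests and chance of at least one type I error."""
    m = comb(k_groups, 2)               # number of pairwise comparisons
    return m, 1 - (1 - alpha) ** m      # assumes independent tests

for k in (3, 4, 6):
    m, p_any = familywise_error(k)
    print(k, m, round(p_any, 3))
```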
Correction for multiple comparisons

How to correct for multiple comparisons post-hoc:
Bonferroni correction (adjusts p by the most conservative amount; assuming all tests independent, divide the significance cutoff α by the number of tests, or equivalently multiply each p by the number of tests)
Tukey (adjusts p)
Scheffe (adjusts p)
Holm/Hochberg (gives p-cutoff beyond which not significant)
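As a rough illustration of two of these corrections, the sketch below adjusts a list of raw p-values with the Bonferroni rule and with Holm's step-down rule. The p-values are hypothetical and not from the slides; the adjusted-p form of Holm shown here is one common way to express the same cutoff idea.

```python
def bonferroni(p_values):
    """Multiply each p by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def holm(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on.
        adj = min(1.0, (m - rank) * p_values[i])
        # Adjusted values may not decrease as the raw p-values increase.
        running_max = max(running_max, adj)
        adjusted[i] = running_max
    return adjusted

p_raw = [0.010, 0.040, 0.030]  # hypothetical p-values from three pairwise tests
p_bonf = bonferroni(p_raw)
p_holm = holm(p_raw)
print([round(p, 3) for p in p_bonf])
print([round(p, 3) for p in p_holm])
```

A comparison is declared significant when its adjusted p-value is below the chosen α; Holm is never more conservative than Bonferroni on the same set of tests.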
