Chapter 14 Multifactor Studies without Replication

Degrees of freedom for effects in ANOVA models:

• For main effects: df = (# levels) − 1
• For 2-way interactions: df = (df for factor 1) × (df for factor 2)
• For 3-way interactions: df = (df for factor 1) × (df for factor 2) × (df for factor 3)
• And so on for higher-order interactions.

Problem 14.14, p. 433: Blood-brain barrier experiment in rats to investigate how delivery of a brain cancer
antibody is influenced by tumor size (varied by starting treatment 8, 12, or 16 days after inoculation with
tumor cells), antibody molecular weight (varied by using 3 agents: AIB, MTX, or DEX7), blood-brain
barrier disruption (BD = disrupted with mannitol, NS = not disrupted, saline solution), and delivery route (IA
= intra-arterial, IV = intravenous). The response of interest was the ratio of the concentration of antibody
in the brain around the tumor (BAT) to the concentration in the other lateral half of the brain (LH). There
were thus 3 × 3 × 2 × 2 = 36 treatment combinations, with one rat in each.
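These df rules can be checked against this design with a short pure-Python sketch (the factor names are just labels for this problem):

```python
from itertools import combinations
from math import prod

# Number of levels for each factor in the blood-brain barrier experiment
levels = {"Days": 3, "Agent": 3, "Treatment": 2, "Route": 2}

# Main-effect df: (# levels) - 1
df_main = {f: n - 1 for f, n in levels.items()}

# Interaction df: product of the main-effect df of the factors involved
def interaction_df(factors):
    return prod(df_main[f] for f in factors)

df_2way = sum(interaction_df(c) for c in combinations(levels, 2))
df_3way = sum(interaction_df(c) for c in combinations(levels, 3))
df_4way = interaction_df(tuple(levels))

n_cells = prod(levels.values())   # 36 treatment combinations
df_total = n_cells - 1            # one rat per cell, so 35 total df

df_model = sum(df_main.values()) + df_2way + df_3way  # up through 3-way terms
print(df_model, df_total - df_model)                  # 31 model df, 4 error df
```

Fitting all main effects, 2-way, and 3-way interactions uses 31 of the 35 available df, leaving only 4 for error; adding the 4-way interaction (4 more df) would leave none, which is why the saturated model cannot be assessed with one rat per cell.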

Some initial graphical examination of the data:

[Scatterplot of the data: Ratio (0.00 to 15.00) versus Days (8, 12, 16).]
Tests of Between-Subjects Effects

Dependent Variable: Ratio


Source    Type III SS    df    Mean Square    F    Sig.
Corrected Model 657.232a 31 21.201 4.109 .089
Intercept 967.140 1 967.140 187.447 .000
Agent 7.270 2 3.635 .705 .547
Treatment 342.624 1 342.624 66.406 .001
Route 116.143 1 116.143 22.510 .009
Days 6.489 2 3.244 .629 .579
Agent * Treatment 3.961 2 1.981 .384 .704
Agent * Route .896 2 .448 .087 .919
Agent * Days 23.945 4 5.986 1.160 .444
Treatment * Route 37.116 1 37.116 7.194 .055
Treatment * Days 6.720 2 3.360 .651 .569
Route * Days 31.839 2 15.920 3.085 .155
Agent * Treatment * Route 1.802 2 .901 .175 .846
Agent * Treatment * Days 42.752 4 10.688 2.072 .249
Agent * Route * Days 15.170 4 3.792 .735 .614
Treatment * Route * Days 20.505 2 10.253 1.987 .252
Error 20.638 4 5.160
Total 1645.011 36
Corrected Total 677.870 35
a. R Squared = .970 (Adjusted R Squared = .734)

Tests of Between-Subjects Effects

Dependent Variable: Ratio


Source    Type III SS    df    Mean Square    F    Sig.
Corrected Model 577.003a 19 30.369 4.817 .001
Intercept 967.140 1 967.140 153.412 .000
Agent 7.270 2 3.635 .577 .573
Treatment 342.624 1 342.624 54.349 .000
Route 116.143 1 116.143 18.423 .001
Days 6.489 2 3.244 .515 .607
Agent * Treatment 3.961 2 1.981 .314 .735
Agent * Route .896 2 .448 .071 .932
Agent * Days 23.945 4 5.986 .950 .461
Treatment * Route 37.116 1 37.116 5.887 .027
Treatment * Days 6.720 2 3.360 .533 .597
Route * Days 31.839 2 15.920 2.525 .111
Error 100.867 16 6.304
Total 1645.011 36
Corrected Total 677.870 35
a. R Squared = .851 (Adjusted R Squared = .675)

Tests of Between-Subjects Effects

Dependent Variable: Ratio


Source    Type III SS    df    Mean Square    F    Sig.
Corrected Model 472.526a 6 78.754 11.122 .000
Intercept 967.140 1 967.140 136.585 .000
Agent 7.270 2 3.635 .513 .604
Treatment 342.624 1 342.624 48.387 .000
Route 116.143 1 116.143 16.402 .000
Days 6.489 2 3.244 .458 .637
Error 205.345 29 7.081
Total 1645.011 36
Corrected Total 677.870 35
a. R Squared = .697 (Adjusted R Squared = .634)

Tests of Between-Subjects Effects

Dependent Variable: Ratio


Source    Type III SS    df    Mean Square    F    Sig.
Corrected Model 509.642a 7 72.806 12.118 .000
Intercept 967.140 1 967.140 160.971 .000
Agent 7.270 2 3.635 .605 .553
Treatment 342.624 1 342.624 57.026 .000
Route 116.143 1 116.143 19.331 .000
Days 6.489 2 3.244 .540 .589
Treatment * Route 37.116 1 37.116 6.178 .019
Error 168.229 28 6.008
Total 1645.011 36
Corrected Total 677.870 35
a. R Squared = .752 (Adjusted R Squared = .690)
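The Treatment * Route line above can be reproduced by hand as an extra-sum-of-squares F-test, comparing the additive model (Error SS = 205.345 on 29 df) to the model that adds Treatment * Route (Error SS = 168.229 on 28 df). A quick pure-Python check:

```python
# Extra-sum-of-squares F-test: additive model vs. additive + Treatment*Route,
# using the Error lines from the two SPSS tables above.
sse_reduced, df_reduced = 205.345, 29   # additive model
sse_full, df_full = 168.229, 28         # additive + Treatment*Route

extra_ss = sse_reduced - sse_full       # 37.116, the Treatment*Route SS
extra_df = df_reduced - df_full         # 1
f_stat = (extra_ss / extra_df) / (sse_full / df_full)
print(round(f_stat, 3))                 # 6.178, matching the SPSS table
```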
Alternative parameterizations of the ANOVA model
Suppose we have a two-way classification; that is, two categorical explanatory variables and a quantitative
response. As we’ve seen, the ANOVA model can be written as a regression model using indicator variables.
For example, suppose the two factors are A and B and that A has 3 levels and B has 4 levels. Then the
model with main effects but not the two-way interaction is:

µ(Y) = β0 + β1X1 + β2X2 + β3X3 + β4X4 + β5X5

where

X1 = 1 if A = 1, and 0 otherwise
X2 = 1 if A = 2, and 0 otherwise
X3 = 1 if B = 1, and 0 otherwise
X4 = 1 if B = 2, and 0 otherwise
X5 = 1 if B = 3, and 0 otherwise

• In this parameterization of the model, β0 represents the mean of Y when A = 3 and B = 4 (the
reference levels). β1 represents the additive effect of being in level A = 1 (compared to A = 3), and
β2 represents the additive effect of being in level A = 2 (compared to A = 3). Similarly, β3 represents
the additive effect of being in level B = 1 (versus B = 4), and so on.

• Each of the β’s represents the deviation from the reference level for that factor.

• In the regression model, the hypothesis of no effect of A on mean response is:

H0: β1 = β2 = 0

and is tested with an extra sum-of-squares F-test by comparing the full model with X1 and X2 in it
to the reduced model without them.

• The hypothesis of no effect of B on mean response is:

H0: β3 = β4 = β5 = 0

• The interaction between A and B would be included by adding all possible products of an indicator
variable for A and one for B: X1X3, X1X4, X1X5, X2X3, X2X4, X2X5. This would add 6 more
parameters to the model.
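A minimal sketch of this coding in Python (the function name `row` is just for illustration); it builds one row of the design matrix, including the 6 interaction products:

```python
# Indicator-variable coding for A (3 levels) and B (4 levels), with A = 3 and
# B = 4 as reference levels, plus the 6 interaction products.
def row(a, b):
    x = [1 if a == 1 else 0,        # X1
         1 if a == 2 else 0,        # X2
         1 if b == 1 else 0,        # X3
         1 if b == 2 else 0,        # X4
         1 if b == 3 else 0]        # X5
    # Interaction columns: X1X3, X1X4, X1X5, X2X3, X2X4, X2X5
    inter = [x[i] * x[j] for i in (0, 1) for j in (2, 3, 4)]
    return x + inter

print(row(1, 2))   # [1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]
print(row(3, 4))   # reference cell: all zeros
```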
Another way to parameterize the additive two-way ANOVA model is as follows:

µij = µ + αi + βj

where µij represents the mean response in cell i-j (level i of variable A and level j of variable B). We have
the restrictions Σαi = 0 and Σβj = 0.
• In this parameterization, µ represents the overall mean response over all cells, αi represents the A
effect (the difference between the mean for level i of A and the overall mean), and βj represents the B
effect (the difference between the mean for level j of B and the overall mean). The restrictions
Σαi = 0 and Σβj = 0 simply reflect the fact that the deviations from the overall mean must sum to 0.
• In this model, the hypothesis of no effect of A on mean response is:

H0: α1 = α2 = α3 = 0

• The hypothesis of no effect of B on mean response is:

H0: β1 = β2 = β3 = β4 = 0

Example 1
It’s easiest to start with an example with only one factor, say A, which has 3 levels as above. Suppose that
the true group means are µ1 = 5, µ2 = 8, and µ3 = 14. These three means can be written as a regression
model as:

µ(Y) = 14 − 9X1 − 6X2

where X1 and X2 are the indicator variables defined above. Check to see that this model gives the desired
means.

The alternative parameterization writes the means as:

µi = µ + αi

where µ = (5 + 8 + 14)/3 = 9 (the overall mean), α1 = µ1 − µ = −4, α2 = µ2 − µ = −1, and α3 = µ3 − µ = 5.

Note that the αi's sum to 0.
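Both parameterizations can be verified in a few lines of Python:

```python
# Example 1: one factor with true group means 5, 8, 14.
mu = [5, 8, 14]

# Regression parameterization (A = 3 is the reference level):
# mu(Y) = 14 - 9*X1 - 6*X2
def reg(a):
    x1, x2 = (a == 1), (a == 2)
    return 14 - 9 * x1 - 6 * x2

# Effects parameterization: mu_i = mu_bar + alpha_i
mu_bar = sum(mu) / len(mu)              # 9, the overall mean
alpha = [m - mu_bar for m in mu]        # [-4, -1, 5]

print([reg(a) for a in (1, 2, 3)])      # [5, 8, 14]
print(mu_bar, alpha, sum(alpha))        # alphas sum to 0
```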

Example 2
Let’s return to the two-factor model on the previous page and suppose the cell means are:

µ11 = 4, µ12 = 8, µ13 = 5, µ14 = 3, µ21 = 5, µ22 = 9, µ23 = 6, µ24 = 4, µ31 = 6, µ32 = 10, µ33 = 7, µ34 = 5.

Note that these means follow an additive model: the effect of moving from level 1 to level 2 of A, for
example, is the same (an increase of 1) at every level of B. Thus, we can reproduce these means using the
additive model with either parameterization:

Regression: µ(Y) = 5 − 2X1 − X2 + X3 + 5X4 + 2X5 (X's defined as on the previous page)

Check to see that this yields the desired cell means:

Alternative parameterization: µij = µ + αi + βj, for i = 1, 2, 3 and j = 1, 2, 3, 4

µ = 6, α1 = µ1• − µ = −1, α2 = µ2• − µ = 0, α3 = µ3• − µ = 1,

β1 = µ•1 − µ = −1, β2 = µ•2 − µ = 3, β3 = µ•3 − µ = 0, β4 = µ•4 − µ = −2

where µ1• is the mean for A = 1, averaged over all levels of B, µ•1 is the mean for B = 1, averaged over all
levels of A, etc.

Check to see that this yields the desired cell means.
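Here is one way to carry out that check in Python, computing µ, the αi's, and the βj's directly from the table and confirming that both parameterizations reproduce all 12 cell means:

```python
# Example 2: the 3x4 table of additive cell means.
means = {(1, 1): 4, (1, 2): 8, (1, 3): 5, (1, 4): 3,
         (2, 1): 5, (2, 2): 9, (2, 3): 6, (2, 4): 4,
         (3, 1): 6, (3, 2): 10, (3, 3): 7, (3, 4): 5}

# Regression parameterization (A = 3, B = 4 are the reference levels):
def reg(a, b):
    x = [a == 1, a == 2, b == 1, b == 2, b == 3]
    coef = [-2, -1, 1, 5, 2]
    return 5 + sum(c * xi for c, xi in zip(coef, x))

# Effects parameterization: mu_ij = mu + alpha_i + beta_j
mu = sum(means.values()) / 12
alpha = {i: sum(means[i, j] for j in range(1, 5)) / 4 - mu for i in (1, 2, 3)}
beta = {j: sum(means[i, j] for i in range(1, 4)) / 3 - mu for j in (1, 2, 3, 4)}

# Both parameterizations reproduce every cell mean
assert all(reg(i, j) == m for (i, j), m in means.items())
assert all(mu + alpha[i] + beta[j] == m for (i, j), m in means.items())
print(mu, alpha, beta)
```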

If the cell means do not follow an additive model, then we need additional parameters to reproduce the cell
means. In the regression formulation, we get these by adding the 6 pairwise products of the indicator
variables for A and B. In the alternative parameterization, we get them by adding an extra interaction term
for each cell, denoted (αβ)ij (this notation indicates that the term is the interaction term for A and B). The
(αβ)ij's sum to 0 over all i for each j and also sum to 0 over all j for each i. With these restrictions, there are
only (I − 1)(J − 1) unique interaction parameters, where I and J are the numbers of levels of A and B
(here, 2 × 3 = 6).
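To make the interaction terms concrete, the sketch below perturbs one cell of the Example 2 table (a hypothetical change, purely to break additivity) and computes the (αβ)ij's as the leftover deviations after fitting µ, the αi's, and the βj's; they satisfy the sum-to-zero restrictions in both directions:

```python
# Perturb one cell of the additive Example 2 table to create interaction.
means = {(1, 1): 4, (1, 2): 8, (1, 3): 5, (1, 4): 3,
         (2, 1): 5, (2, 2): 9, (2, 3): 6, (2, 4): 4,
         (3, 1): 6, (3, 2): 10, (3, 3): 7, (3, 4): 5}
means[(1, 1)] += 6   # hypothetical non-additive cell

I, J = 3, 4
mu = sum(means.values()) / (I * J)
alpha = {i: sum(means[i, j] for j in range(1, J + 1)) / J - mu
         for i in range(1, I + 1)}
beta = {j: sum(means[i, j] for i in range(1, I + 1)) / I - mu
        for j in range(1, J + 1)}

# Interaction terms: what is left over after the additive fit
ab = {(i, j): means[i, j] - (mu + alpha[i] + beta[j]) for (i, j) in means}

# Sum-to-zero restrictions: each row and each column of (alpha beta)_ij sums to 0
for i in range(1, I + 1):
    assert abs(sum(ab[i, j] for j in range(1, J + 1))) < 1e-9
for j in range(1, J + 1):
    assert abs(sum(ab[i, j] for i in range(1, I + 1))) < 1e-9
print(ab[(1, 1)])   # only (I-1)*(J-1) = 6 of the 12 terms are free
```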

We can incorporate the error term into either parameterization by replacing µij with Yij on the left-hand side,
representing an individual response rather than the mean response for the cell:

Yij = µ + αi + βj + εij

where the εij are independent N(0, σ) random variables.

Random effects

Our ANOVA analyses so far have treated all factors as fixed. That is, we’re only interested in these
particular levels of the factor and we assume each level has some fixed effect on the response. The
hypotheses we test in the analysis of these data using ANOVA are whether these fixed effects are 0 or not
and the confidence intervals are for the size of the effect. The scope of inference in these studies is only to
the particular levels of the factors studied.

• If we treat a factor as a random effect, then we are assuming that the particular levels of the factor
used in the study were a random sample from a larger population of possible levels and that we wish
our inferences to be to this larger population.

Example: randomized complete block design. A study was performed to compare the yields of four varieties
of cowpea hay (the treatment). Three areas of land (blocks) were each divided into four plots and each of
the four varieties was randomly assigned to one plot in each block.

• It makes sense to treat Variety as a fixed effect since there is probably not a larger population of
varieties from which these four were randomly selected and to which we wish to make inferences.

• However, we could treat Block as a fixed effect or a random effect. If we treat it as a fixed effect,
then the inferences about differences between varieties are only to these three blocks. If these three
blocks are a random sample of blocks from a population of blocks, we could treat Block as a random
effect. Then, the inferences about differences between treatments would apply to the population of
all blocks.

The additive model for yield as a function of Treatment and Block would be written this way:

yij = µ + αi + βj + εij

where αi is the treatment effect and βj is the block effect.

• Since Treatment is a fixed effect, Σαi = 0 (summing over the i = 1, …, 4 treatments).

• If Block is a fixed effect, then Σβj = 0 (summing over the j = 1, 2, 3 blocks). The inferences about
Treatment differences apply only to these blocks. Inferences about Block concern whether or not there
are differences among these three blocks.

• If Block is a random effect, then we assume that the βj's are independent random draws from a
normal distribution with mean 0 and standard deviation σb. That is, we assume that every block in
the population has its own block effect and that these effects follow a N(0, σb)
distribution. The three block effects we've observed are independent random draws from this
distribution and don't necessarily sum to 0.

• Treating Block as a random effect only makes sense if the blocks can be viewed as a random sample
from a larger population of blocks. For example, if the blocks are adjacent strips of land, then it
would be difficult to justify treating Block as a random effect. This is because adjacent blocks of
land are likely similar and cannot be viewed as independent draws from some population of strips.

• If Block is a random effect, inferences about Treatment differences can be generalized to the
population of blocks. Inferences about Block differences are inferences about σb. The test of Block
differences is a test of

H0: σb = 0 vs. HA: σb > 0

We can also estimate σb, just as we can estimate σ, the error standard deviation.
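One standard route to estimating σb in a balanced design like this is the method of moments, using the expected mean squares: E(MS Block) = σ² + aσb², where a is the number of treatments per block. A sketch with made-up mean squares (the numbers below are hypothetical, not from the cowpea study):

```python
from math import sqrt

# Hypothetical mean squares from a balanced RCB: a = 4 treatments, 3 blocks.
a = 4                 # treatments per block
ms_block = 30.0       # made-up Block mean square
mse = 6.0             # made-up Error mean square

# Method-of-moments estimate: E(MS Block) = sigma^2 + a * sigma_b^2
sigma_b2 = (ms_block - mse) / a
sigma_b = sqrt(max(sigma_b2, 0.0))   # truncate at 0 if MS Block < MSE
print(sigma_b2, round(sigma_b, 3))   # 6.0 2.449
```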

Whether an effect is considered fixed or random can affect the tests of all the effects in a model.

• In a randomized complete block design without replication, the tests of the Treatment and Block effects
are the same whether Block is treated as fixed or random. This model does not include the Block by
Treatment interaction because there is no replication. Note: treating Block as a random effect must be
justified, as noted above.

• In a randomized complete block design with replication, where the Block by Treatment interaction can
be estimated, the test of the Treatment effect changes when we treat Block as a random factor: the
denominator in the F-test for the Treatment effect becomes the Block by Treatment interaction mean
square, not the Error mean square. One way to see why this should be so is that we can view Block as
the primary “sampling” unit and plot within block as the secondary sampling unit. Inferences about the
response are always based on the number of primary sampling units.

• Treating Block as a random effect usually makes it more difficult to detect differences among
Treatments, because the Block by Treatment mean square is usually bigger than the MSE and has
fewer degrees of freedom. This reflects the fact that the Block by Treatment interaction includes two
sources of variability: variability between blocks and variability between replicates within blocks.
However, treating Block as a random effect expands the scope of the inferences, so there is a tradeoff.

SPSS automatically performs the correct tests if you specify the random and fixed effects in the General
Linear Model procedure.

Some additional notes:


• Sometimes the subjects in an experiment are used as blocks (each receives more than one treatment)
and are treated as a random effect if they can be viewed as a random sample from a larger population of
possible subjects.

• You are not justified in treating a factor as a random effect unless you can truly view the levels as a
random sample of possible levels. However, even if you are justified in treating a factor as a random
effect, you do not have to. If you want the scope of your inferences to extend only to the levels you
actually chose, then you can treat it as a fixed effect.

• Models with only fixed effects (except for the error term) are referred to as fixed effects models,
models with only random effects are random effects models, and models with both fixed and random
effects are mixed effects models or mixed models.

• σb² and σ² are referred to as variance components. Additional random factors will add additional
variance components. Estimating the variance components is often important, but can be difficult.