
Limitations of ANOVA

2005 Dr. B. C. Paul

The Data Size Effect

We did ANOVA with one factor

We did it with two factors (Driver and Car)

We could have used the same procedure to do three-way (maybe add country or city driving)

Or four-way (maybe add season of the year)

What happens to our test data requirements?

Data Set Size Illustration

Let's say we tested 4 cars with pairs of before-and-after MPG
At minimum we would need 2 drivers on each car
I.e., we need 8 paired driving tests

Why 8 tests for four cars?


We need a way to measure variability not accounted for by the test
If all differences are accounted for, you have no SS error
With no SS error there is no denominator for the F ratio, so you cannot do the F test to decide if your effect is significant
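
The degrees-of-freedom bookkeeping behind this point can be sketched as follows. This is a minimal illustration for a balanced one-factor layout (car only), assuming the standard result that the error degrees of freedom are the total degrees of freedom minus those used by the effect. The function name `error_df_one_way` is made up for this example.

```python
def error_df_one_way(groups: int, replicates: int) -> int:
    """Error degrees of freedom for a balanced one-way ANOVA."""
    total_observations = groups * replicates
    # Total df minus the df used up by the group (car) effect.
    return (total_observations - 1) - (groups - 1)


# One paired test per car: nothing is left over to estimate SS error.
print(error_df_one_way(groups=4, replicates=1))  # 0
# Two paired tests per car (8 tests): 4 df remain for the F-test denominator.
print(error_df_one_way(groups=4, replicates=2))  # 4
```

With zero error degrees of freedom there is no mean square error to divide by, which is why 4 tests on 4 cars are not enough.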

The Exponential Explosion

Now let's look for a Driver Effect
Let's try 4 drivers
Each driver drives each car twice
8 pairs of data is now 32 pairs (4 cars x 4 drivers x 2 runs)

Now let's look for a Country/Urban Effect
We'll run the tests in the downtown of a large city, in a suburb, and in the country
Now we need 96 pairs of data (32 x 3 locations)

Now let's see if it depends on season
We'll do Winter, Spring, Summer, Fall
Now we need 384 pairs of data (96 x 4 seasons)
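
The counts above are just the product of the number of levels of every factor, times the number of repeat runs per combination. A minimal sketch of that arithmetic, assuming 2 repeat runs per combination as in the slides (the function name `pairs_needed` and the labels are made up for this example):

```python
from math import prod


def pairs_needed(levels_per_factor: list[int], replicates: int = 2) -> int:
    """Paired tests needed when every combination of levels is run `replicates` times."""
    return prod(levels_per_factor) * replicates


# Levels taken from the slides: 4 cars, 4 drivers, 3 locations, 4 seasons.
designs = {
    "cars only": [4],
    "+ drivers": [4, 4],
    "+ location": [4, 4, 3],
    "+ season": [4, 4, 3, 4],
}
for name, levels in designs.items():
    print(f"{name:12s} {pairs_needed(levels):4d} pairs")
```

Each new factor multiplies the whole requirement by its number of levels, so the total grows very quickly as factors are added.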

Problem of ANOVA

As you get more and more effects to study, the amount of data needed grows exponentially

There are practical limits to how much you can do at once

There are specialized techniques that can be used
We had every driver test every car under every condition.
Eng 540 Design of Experiments covers a lot of more elegant alternatives

Relief from SPSS at a Price

Our experimental design called for equal numbers of tests under all conditions

Actually, the procedure I showed with SPSS will run the test without equal numbers of tests under every condition.

The Price

If I do not check every driver in every car, I will lose my ability to measure interaction effects (they will go into the SS error)
If I have equal numbers of cases in every cell, the test tends to be forgiving (robust) against violations of the normal distribution assumption

If my cells are uneven, my model will start spitting out more poorly fit answers if I violate the normal distribution assumption.

The Who Done It Mystery

ANOVA will easily tell you whether an effect exists

When it says the driver makes a difference
Did you have 2 wacko drivers and the other 8 are all the same?

It tells you whether an effect exists, but it might still come from only part of the data set

Coping With Who Done It

You can run all sorts of plots to see if it's just a few results that are different.
You can run statistical tests to test different subsets of the data against each other
The little Options button on the SPSS field where you said OK leads to a menu of optional tests and plots

Not going to deal with them right now other than telling you they are there.
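
As a rough idea of what "seeing if it's just a few results that are different" can look like outside SPSS, here is a minimal sketch. It is not one of the SPSS optional tests. It is a simple screen that compares each driver's mean MPG gain to the median across drivers, scaled by the median absolute deviation. The function name, the threshold of 3, and the data are all made up for illustration.

```python
from statistics import mean, median


def flag_unusual_drivers(gains_by_driver: dict[int, list[float]],
                         threshold: float = 3.0) -> list[int]:
    """Return drivers whose mean gain sits far from the typical driver."""
    driver_means = {d: mean(g) for d, g in gains_by_driver.items()}
    center = median(driver_means.values())
    # Median absolute deviation: a spread measure that a few odd drivers barely move.
    mad = median(abs(m - center) for m in driver_means.values())
    if mad == 0:
        return sorted(d for d, m in driver_means.items() if m != center)
    return sorted(d for d, m in driver_means.items()
                  if abs(m - center) / mad > threshold)


# Hypothetical MPG gains (two runs per driver): 8 similar drivers, 2 odd ones.
gains = {
    1: [2.0, 2.2], 2: [1.9, 2.1], 3: [2.1, 2.1], 4: [1.8, 2.0],
    5: [2.0, 2.0], 6: [2.2, 2.2], 7: [1.9, 2.1], 8: [1.9, 1.9],
    9: [5.1, 4.9], 10: [-0.9, -1.1],
}
print(flag_unusual_drivers(gains))  # [9, 10]
```

A screen like this only points at where to look. Formal comparisons between subsets are what the optional tests are for.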

Now I Know. What Does It Mean?

We found that the MPG improvement from the Red Rooster Carburetor varied with the driver

We put an "individual results may vary" disclaimer on our advertising

OK, but how much do individual results vary?
ANOVA doesn't tell us
For some types of engineering work we have to know how big a difference something will make (Yes-No doesn't always cut it)

Dealing with large numbers of possible causes

You may have a large but more randomly organized set of data, and conditions that might have influenced it.

Trying to do ANOVA for 15 effect variables would be unwieldy

The solution to "I need to quantify the effect", and to maxing out the computer memory (and corporate budget) doing a 15-way ANOVA, is a method called Regression

Our next exciting topic!!!
