We did it with two factors (driver and car).
We could have used the same procedure for a three-way analysis (maybe add country or city driving)
or a four-way analysis (maybe add season of the year).
What happens to our test data requirements?
Data Set Size Illustration
Let's say we tested 4 cars with before-and-after pairs of MPG measurements.
At a minimum we would need 2 drivers on each car,
i.e., we need 8 paired driving tests.
Why 8 tests for four cars?
We need a way to measure variability not accounted for by the test.
If all differences are accounted for, you have no SS error.
With no denominator, you cannot do the F test to decide whether your effect is significant.
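As a rough sketch of why the second test per car matters, assume a one-way layout with car as the only factor. The symbols k (number of cars) and n (paired tests per car) are introduced here for illustration and are not from the slides.

```latex
F = \frac{MS_{\text{car}}}{MS_{\text{error}}}, \qquad
MS_{\text{error}} = \frac{SS_{\text{error}}}{df_{\text{error}}}, \qquad
df_{\text{error}} = k(n - 1)

% One test per car:  k = 4, n = 1  =>  df_error = 4(1 - 1) = 0, so MS_error is undefined and F cannot be formed.
% Two tests per car: k = 4, n = 2  =>  df_error = 4(2 - 1) = 4, from N = kn = 8 paired tests.
```

Under these assumptions, 8 is the smallest number of paired tests on 4 cars that leaves any degrees of freedom for the error term.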
The Exponential Explosion
Now let's look for a driver effect.
Let's try 4 drivers.
Each driver drives each car twice.
8 pairs of data is now 32 pairs (4 cars x 4 drivers x 2 runs).
Now let's look for a country versus urban effect.
We'll run the tests in the downtown of a large city, in a suburb, and in the country.
Now we need 96 pairs of data.
Now let's see if it depends on season.
We'll do winter, spring, summer, and fall.
Now we need 384 pairs of data.
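The counts above can be reproduced with a few lines of Python. This is only a sketch: it assumes every combination of levels is run twice, and the helper name pairs_needed and the factor labels are made up for this illustration.

```python
from math import prod

REPLICATES = 2  # assumption: each combination is run twice so SS error is not zero


def pairs_needed(levels, replicates=REPLICATES):
    """Paired tests for a design that runs every combination of levels."""
    return prod(levels.values()) * replicates


# Factors are added in the same order as on the slide.
steps = [("car", 4), ("driver", 4), ("location", 3), ("season", 4)]

levels = {}
counts = []
for name, n_levels in steps:
    levels[name] = n_levels
    counts.append(pairs_needed(levels))
    print(f"{len(levels)} factor(s), added {name}: {counts[-1]} pairs")
```

Each new factor multiplies the total by its number of levels, which is where the 8, 32, 96, 384 sequence comes from.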
Problem of ANOVA
As you get more and more effects to study, the amount of data needed grows exponentially.
There are practical limits to how much you can do at once.
There are specialized techniques that can reduce the amount of testing.
We had every driver test every car under every condition (a full factorial design).
Eng 540, Design of Experiments, covers a lot of more elegant alternatives.
Relief from SPSS at a Price
Our experimental design called for equal numbers of tests under all conditions.
Actually, the procedure I showed with SPSS will run the test without equal numbers of tests under every condition.
The Price
If I do not check every driver in every car, I will lose my ability to measure interaction effects (they will go into the SS error).
If I have equal numbers of cases in every cell, the test tends to be forgiving (robust) against violations of the normal distribution assumption.
If my cells are uneven, my model will start spitting out more poorly fit answers if I violate the normal distribution assumption.
The Who Done It Mystery
ANOVA will easily tell you whether an effect exists.
When it says the driver makes a difference, it tells you whether an effect exists, but it might still come from only part of the data set.
Did you have 2 wacko drivers and the other 8 are all the same?
Coping With Who Done It
You can run all sorts of plots to see if it's just a few results that are different.
You can run statistical tests to test different subsets of the data against each other.
The little options button on the SPSS window where you clicked OK leads to a menu of optional tests and plots.
I'm not going to deal with them right now, other than telling you they are there.
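One quick way to screen for a few wacko drivers before reaching for the SPSS options is to compare each driver's mean with the mean of everyone else's results. The sketch below is not one of the SPSS tests or plots. The driver names and MPG improvements are hypothetical, made up purely for illustration, and leave_one_out_gaps is an illustrative helper name.

```python
from statistics import mean

# Hypothetical MPG improvements per driver (made-up numbers, illustration only).
improvements = {
    "driver_1": [2.1, 1.9],
    "driver_2": [2.0, 2.2],
    "driver_3": [2.1, 2.0],
    "driver_4": [5.8, 6.1],  # deliberately different from the rest
}


def leave_one_out_gaps(data):
    """For each driver, return own mean minus the mean of all other drivers' results."""
    gaps = {}
    for driver, values in data.items():
        others = [v for d, vals in data.items() if d != driver for v in vals]
        gaps[driver] = mean(values) - mean(others)
    return gaps


gaps = leave_one_out_gaps(improvements)
# List drivers from the largest gap to the smallest.
for driver, gap in sorted(gaps.items(), key=lambda item: abs(item[1]), reverse=True):
    print(f"{driver}: {gap:+.2f} MPG versus the other drivers")
```

A driver whose gap is much larger than the rest is a candidate for being the source of the effect. This is only a screening step, not a significance test.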
Now I Know. What Does It Mean?
We found that the MPG improvement from the Red Rooster Carburetor varied with the driver.
We put an "individual results may vary" disclaimer on our advertising.
OK, but how much do individual results vary?
ANOVA doesn't tell us.
For some types of engineering work we have to know how big a difference something will make (yes/no doesn't always cut it).
Dealing with large numbers of possible causes
We may have a large but more randomly organized set of data, and many conditions that might have influenced it.
Trying to do ANOVA for 15 effect variables would be unwieldy.
The solution to "I need to quantify the effect" and to maxing out the computer memory (and corporate budget) by doing a 15-way ANOVA is a method called regression.