
Application of ANOVA

What is ANOVA?

An ANOVA is an analysis of the variation present in an experiment. It is a test of the hypothesis that differences exist among several population means.

Application of ANOVA

ANOVA is designed to detect differences among means from populations subject to different treatments.

ANOVA is a joint test

The equality of several population means is tested simultaneously or jointly.

ANOVA tests for the equality of several population means by looking at two estimators of the population variance (hence, analysis of variance).
The Hypothesis Test of Analysis of Variance

In an analysis of variance:
We have r independent random samples, each one corresponding to a population subject to a different treatment.
We have:
n = n1 + n2 + n3 + ... + nr total observations.
r sample means: x̄1, x̄2, x̄3, ..., x̄r
These r sample means can be used to calculate an estimator of the population variance. If the population means are equal, we expect the variance among the sample means to be small.
r sample variances: s1², s2², s3², ..., sr²
These sample variances can be used to find a pooled estimator of the population variance.
The Hypothesis Test of Analysis of Variance (continued): Assumptions

We assume independent random sampling from each of the r populations.
We assume that the r populations under study:
are normally distributed,
with means μi that may or may not be equal,
but with equal variances, σ².

[Figure: three normal curves with means μ1, μ2, μ3 (Population 1, Population 2, Population 3)]
Testing Hypothesis

The hypothesis test of analysis of variance:

H0: μ1 = μ2 = μ3 = ... = μr
H1: Not all μi (i = 1, ..., r) are equal

The test statistic of analysis of variance:

F(r-1, n-r) = (Estimate of variance based on means from r samples) / (Estimate of variance based on all sample observations)

That is, the test statistic in an analysis of variance is based on the ratio of two estimators of a population variance, and is therefore based on the F distribution, with (r-1) degrees of freedom in the numerator and (n-r) degrees of freedom in the denominator.
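As a sketch of this ratio, the F statistic can be computed by hand from the two variance estimates and checked against SciPy's built-in one-way ANOVA; the sample values below are made up purely for illustration.

```python
from scipy import stats

# Illustrative data: r = 3 independent samples, one per treatment (made up).
samples = [
    [32, 30, 35, 33, 31],   # treatment 1
    [38, 40, 37, 39, 41],   # treatment 2
    [31, 33, 30, 32, 34],   # treatment 3
]

r = len(samples)
n = sum(len(s) for s in samples)
means = [sum(s) / len(s) for s in samples]
grand_mean = sum(x for s in samples for x in s) / n

# Numerator: variance estimate based on the r sample means, (r - 1) df.
ms_between = sum(len(s) * (m - grand_mean) ** 2
                 for s, m in zip(samples, means)) / (r - 1)

# Denominator: pooled variance estimate from all observations, (n - r) df.
ms_within = sum((x - m) ** 2
                for s, m in zip(samples, means) for x in s) / (n - r)

F = ms_between / ms_within
p = stats.f.sf(F, r - 1, n - r)   # right tail of F(r-1, n-r)

# Cross-check against SciPy's built-in one-way ANOVA.
F_scipy, p_scipy = stats.f_oneway(*samples)
print(F, p)
```

If the population means were equal, both mean squares would estimate the same variance and F would be near 1; a large F leads to rejecting H0.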
Extension of ANOVA to Three Factors

Source of          Sum of    Degrees of         Mean Square                           F Ratio
Variation          Squares   Freedom
Factor A           SSA       a-1                MSA = SSA/(a-1)                       F = MSA/MSE
Factor B           SSB       b-1                MSB = SSB/(b-1)                       F = MSB/MSE
Factor C           SSC       c-1                MSC = SSC/(c-1)                       F = MSC/MSE
Interaction (AB)   SS(AB)    (a-1)(b-1)         MS(AB) = SS(AB)/[(a-1)(b-1)]          F = MS(AB)/MSE
Interaction (AC)   SS(AC)    (a-1)(c-1)         MS(AC) = SS(AC)/[(a-1)(c-1)]          F = MS(AC)/MSE
Interaction (BC)   SS(BC)    (b-1)(c-1)         MS(BC) = SS(BC)/[(b-1)(c-1)]          F = MS(BC)/MSE
Interaction (ABC)  SS(ABC)   (a-1)(b-1)(c-1)    MS(ABC) = SS(ABC)/[(a-1)(b-1)(c-1)]   F = MS(ABC)/MSE
Error              SSE       abc(n-1)           MSE = SSE/[abc(n-1)]
Total              SST       abcn-1
Application of ANOVA

We can manipulate certain variables (like promotion, ad copy, display at the point of purchase) and observe changes in other variables (like sales, or consumer preferences, behavior, or attitude). The application areas for experiments are wide.

Whenever a marketing-mix variable (independent variable) is changed, we can determine its effect. Such variables include price, a specific promotion or type of distribution, or specific elements like shelf space, color of packaging, etc.
An experiment can be done with just one
independent variable (factor) or with multiple
independent variables.

The key to success in an ANOVA analysis is the degree of control over the various independent variables (factors) that are being manipulated during the experiment.
N-way Analysis of Variance

In business research, one is often concerned with the effect of more than one factor simultaneously. For example:

How do advertising levels (high, medium, and low) interact with price levels (high, medium, and low) to influence a brand's sales?

Do educational levels (less than high school, high school graduate, some college, and college graduate) and age (less than 35, 35-55, more than 55) affect consumption of a brand?

What is the effect of consumers' familiarity with a department store (high, medium, and low) and store image (positive, neutral, and negative) on preference for the store?
N-way Analysis of Variance
Consider the simple case of two factors X1 and X2 having c1 and c2 categories, respectively. The total variation in this case is partitioned as follows:

SStotal = SS due to X1 + SS due to X2 + SS due to interaction of X1 and X2 + SSwithin

or

SSy = SSx1 + SSx2 + SSx1x2 + SSerror

The strength of the joint effect of two factors, called the overall effect, or multiple η², is measured as follows:

multiple η² = (SSx1 + SSx2 + SSx1x2) / SSy
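For a balanced two-factor design, the decomposition and multiple η² can be computed directly from cell, row, and column means; the design sizes and response values below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical balanced design: c1 = 3 levels of X1, c2 = 2 levels of X2,
# 4 observations per cell (made-up responses).
c1, c2, n = 3, 2, 4
y = rng.normal(loc=10, size=(c1, c2, n))   # y[i, j, k]

grand = y.mean()
cell = y.mean(axis=2)            # cell means
m1 = y.mean(axis=(1, 2))         # X1 level means
m2 = y.mean(axis=(0, 2))         # X2 level means

ss_y = ((y - grand) ** 2).sum()                                  # total
ss_x1 = c2 * n * ((m1 - grand) ** 2).sum()                       # X1
ss_x2 = c1 * n * ((m2 - grand) ** 2).sum()                       # X2
ss_x1x2 = n * ((cell - m1[:, None] - m2[None, :] + grand) ** 2).sum()
ss_error = ((y - cell[:, :, None]) ** 2).sum()                   # within

# In a balanced design the partition holds exactly:
# SSy = SSx1 + SSx2 + SSx1x2 + SSerror
multiple_eta_sq = (ss_x1 + ss_x2 + ss_x1x2) / ss_y
print(multiple_eta_sq)
```

Multiple η² lies between 0 and 1: the share of total variation accounted for by the two factors and their interaction.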


N-way Analysis of Variance
The significance of the overall effect may be tested by an F test, as follows:

F = [(SSx1 + SSx2 + SSx1x2) / dfn] / [SSerror / dfd]
  = [SSx1,x2,x1x2 / dfn] / [SSerror / dfd]
  = MSx1,x2,x1x2 / MSerror

where

dfn = degrees of freedom for the numerator
    = (c1 - 1) + (c2 - 1) + (c1 - 1)(c2 - 1)
    = c1c2 - 1
dfd = degrees of freedom for the denominator
    = N - c1c2
MS = mean square
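A sketch of this overall F test for a balanced two-factor layout, with made-up dimensions and data; everything that is not error is the combined effect of X1, X2, and their interaction.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Hypothetical balanced design: c1 = 3, c2 = 2, 4 observations per cell.
c1, c2, n = 3, 2, 4
N = c1 * c2 * n
y = rng.normal(loc=10, size=(c1, c2, n))

grand = y.mean()
cell = y.mean(axis=2)
ss_y = ((y - grand) ** 2).sum()
ss_error = ((y - cell[:, :, None]) ** 2).sum()
# Combined effect of X1, X2, and X1X2 = total minus within-cell error.
ss_effects = ss_y - ss_error

dfn = c1 * c2 - 1        # (c1-1) + (c2-1) + (c1-1)(c2-1)
dfd = N - c1 * c2
F = (ss_effects / dfn) / (ss_error / dfd)
p = stats.f.sf(F, dfn, dfd)
print(F, p)
```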
N-way Analysis of Variance
If the overall effect is significant, the next step is to examine the significance of the interaction effect. Under the null hypothesis of no interaction, the appropriate F test is:

F = [SSx1x2 / dfn] / [SSerror / dfd] = MSx1x2 / MSerror

where

dfn = (c1 - 1)(c2 - 1)
dfd = N - c1c2
N-way Analysis of Variance
The significance of the main effect of each factor may be tested as follows for X1:

F = [SSx1 / dfn] / [SSerror / dfd] = MSx1 / MSerror

where

dfn = c1 - 1
dfd = N - c1c2
Issues in Interpretation

The most commonly used measure in ANOVA is omega squared, ω². This measure indicates what proportion of the variation in the dependent variable is related to a particular independent variable or factor. The relative contribution of a factor X is calculated as follows:

ω²x = (SSx - dfx × MSerror) / (SStotal + MSerror)

Normally, ω² is interpreted only for statistically significant effects.
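The calculation can be wrapped in a small helper; the function name and the input values below are hypothetical, chosen only to show the arithmetic.

```python
def omega_squared(ss_x, df_x, ms_error, ss_total):
    """Proportion of variation in the dependent variable attributable
    to factor X: (SSx - dfx * MSerror) / (SStotal + MSerror)."""
    return (ss_x - df_x * ms_error) / (ss_total + ms_error)

# Made-up sums of squares, purely to show the call.
w2 = omega_squared(ss_x=120.0, df_x=2, ms_error=5.0, ss_total=400.0)
print(w2)
```

Note that ω² penalizes SSx by the error mean square, so it is smaller than the raw ratio SSx/SStotal.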

Repeated Measures ANOVA
In the case of a single factor with repeated measures, the total
variation, with nc - 1 degrees of freedom, may be split into
between-people variation and within-people variation.

SStotal = SSbetween people + SSwithin people

The between-people variation, which is related to the differences between the means of people, has n - 1 degrees of freedom. The within-people variation has n(c - 1) degrees of freedom. The within-people variation may, in turn, be divided into two different sources of variation. One source is related to the differences between treatment means, and the second consists of residual or error variation. The degrees of freedom corresponding to the treatment variation are c - 1, and those corresponding to residual variation are (c - 1)(n - 1).
Repeated Measures ANOVA
Thus,

SSwithin people = SSx + SSerror

A test of the null hypothesis of equal means may now be constructed in the usual way:

F = [SSx / (c - 1)] / [SSerror / ((n - 1)(c - 1))] = MSx / MSerror

So far we have assumed that the dependent variable is measured on an interval or ratio scale. If the dependent variable is nonmetric, however, a different procedure should be used.
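A sketch of the repeated-measures partition and F test, using made-up scores for n = 8 people each measured under c = 4 treatments.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Hypothetical data: rows are people (n = 8), columns are treatments (c = 4).
n, c = 8, 4
y = rng.normal(loc=5, size=(n, c))
grand = y.mean()

ss_total = ((y - grand) ** 2).sum()                            # nc - 1 df
ss_between_people = c * ((y.mean(axis=1) - grand) ** 2).sum()  # n - 1 df
ss_within_people = ss_total - ss_between_people                # n(c - 1) df

ss_x = n * ((y.mean(axis=0) - grand) ** 2).sum()               # c - 1 df
ss_error = ((y - y.mean(axis=1, keepdims=True)                 # residual,
             - y.mean(axis=0) + grand) ** 2).sum()             # (c-1)(n-1) df

F = (ss_x / (c - 1)) / (ss_error / ((n - 1) * (c - 1)))
p = stats.f.sf(F, c - 1, (n - 1) * (c - 1))
print(F, p)
```

The residual sum of squares removes each person's own mean, which is what makes the repeated-measures design more sensitive than treating the n×c scores as independent groups.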
Nonmetric Analysis of Variance
Nonmetric analysis of variance examines the
difference in the central tendencies of more than two
groups when the dependent variable is measured
on an ordinal scale.
One such procedure is the k-sample median test.
As its name implies, this is an extension of the
median test for two groups, which was considered in
Chapter 15.
Nonmetric Analysis of Variance
A more powerful test is the Kruskal-Wallis one-way analysis of variance. This is an extension of the Mann-Whitney test, and it also examines the difference in medians. All cases
from the k groups are ordered in a single ranking. If the k
populations are the same, the groups should be similar in
terms of ranks within each group. The rank sum is calculated
for each group. From these, the Kruskal-Wallis H statistic,
which has a chi-square distribution, is computed.
The Kruskal-Wallis test is more powerful than the k-sample
median test as it uses the rank value of each case, not merely
its location relative to the median. However, if there are a
large number of tied rankings in the data, the k-sample median
test may be a better choice.
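The procedure described is available directly in SciPy; the ordinal-style ratings below are invented for illustration.

```python
from scipy import stats

# Made-up ordinal ratings for k = 3 groups.
g1 = [3, 4, 2, 5, 4, 3]
g2 = [5, 5, 4, 6, 5, 6]
g3 = [2, 3, 3, 2, 4, 2]

# Kruskal-Wallis H: all cases are pooled into a single ranking, rank sums
# are computed per group, and H is referred to a chi-square distribution
# with k - 1 degrees of freedom (SciPy also corrects for tied ranks).
H, p = stats.kruskal(g1, g2, g3)
print(H, p)
```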
