Welcome to PowerPoint slides for

Chapter 9
ANOVA and the Design of Experiments

Marketing Research: Text and Cases
by Rajendra Nargundkar

Slide 1

Introduction and Applications

1. Surveys are the most popular research method used in
Marketing Research.
2. The other widely used class of study is known as
experimentation. Just as in a laboratory, we manipulate
certain variables (usually marketing-related ones in Marketing
Research), and observe changes in other variables (for example
sales, or consumer preferences, behaviour or attitudes).
3. The application areas for experiments are wide. Whenever
a marketing mix variable (independent variable) such as
price, a specific promotion or type of distribution, or even
a specific element like shelf space or colour of packaging,
is changed, we would want to know its effect.
4. Under proper conditions, an experiment can tell us the
effects of specific variations in one or more elements of the
marketing mix.
5. An experiment can be done with only one independent
variable (factor) or with multiple independent variables.

Slide 2

Methods
1. An experiment with one independent variable is analysed
with a one-way ANOVA. ANOVA stands for Analysis of Variance,
the generic name given to a set of techniques for studying
the cause-and-effect of one or more factors on a single
dependent variable.
2. If we hypothesise that there is also a Blocking Variable
(to be explained later in the Randomised Block Design) in
addition to one independent variable, we can use a
randomised block design.
3. When more than one factor (independent variable) is
studied, it is known as a factorial experiment. This design
can also facilitate the study of possible interaction effects
among the independent variables. We will explore this
further when we discuss factorial experiments.
4. When more than one dependent variable is studied, the
technique called MANOVA or Multivariate Analysis of
Variance is used. However, we will limit ourselves to the
discussion of three major types of ANOVA.

Slide 3
Variables
The Analysis of Variance technique is used when the
independent variables are of nominal scale (categorical) and
the dependent variable is metric (continuous).
Design
The design of the experiment is the most critical step in
performing any experiment to be analysed through the
technique of ANOVA.
There are four major types of designs, of which three
frequently used types will be illustrated with a worked out
example each.
These four major types are:
1. Completely Randomised Design in a One-Way ANOVA
(Single Factor)
2. Randomised Block Design (Single Blocking Factor)
3. Latin Square Design (Two Blocking Factors)
4. Factorial Design with 2 or more Factors.
We will discuss in detail the first two, and the fourth.

Slide 4
One-Way ANOVA
This particular design is used when there is only one
categorical independent variable, and one dependent (metric)
variable.
Each category of an independent variable is called a level.
The independent variable may be different levels of prices, or
different pack sizes, or different product colours, and the
effect (dependent variable) could be sales, preferences or
attitudes towards the brand.
In the example that follows, we will look at advertising copy
alternatives as the independent variable, and preference rating
for the advertising copy as the dependent variable.
Worked Example Problem:
In this example, we assume that three different versions of
advertising copy have been created by an advertising agency
for a campaign. Let us call these versions of copy ADCOPY
1, 2 and 3. Now, the ad agency wants to test which of these
three versions of the advertising copy is preferred by its target
population, before they launch the campaign.
A sample of 18 respondents is selected from the target
population in the nearby areas of the city. At random, these
18 respondents are assigned to the 3 versions of ad copy.
Each version of ad copy is thus shown to six of the
respondents.
The respondents are asked to rate their liking for the ad copy
shown to them on a scale of 1 to 10. (1 = Not liked at all, 10
= Liked a lot, and other values in between these two). The
ratings given by the 18 respondents are tabulated.

Slide 5
Input Data
Fig 1. shows the input data for the 18 respondents.
Fig. 1.
Sr. No.   Ad copy   Rating
  1          1       6.00
  2          1       7.00
  3          1       5.00
  4          1       8.00
  5          1       8.00
  6          1       8.00
  7          2       4.00
  8          2       4.00
  9          2       5.00
 10          2       7.00

Slide 5 contd...
Fig. 1. Contd
Sr. No.   Ad copy   Rating
 11          2       7.00
 12          2       6.00
 13          3       5.00
 14          3       5.00
 15          3       4.00
 16          3       7.00
 17          3       8.00
 18          3       7.00

The codes in the ad copy column (1, 2, 3) indicate
the different versions of the ad. The last column,
rating, is the rating given by a respondent to the
ad copy seen by him/her. Thus, six respondents have
rated each ad. Please note that these eighteen
respondents were randomly assigned to the three ad
versions. This random assignment is called a
completely randomised assignment or design.
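The random assignment described above can be sketched in a few lines of Python. This is only an illustration: the function name `assign_completely_randomised` and the seed are assumptions made here, and the respondent numbers produced will not match the serial numbers in fig. 1.

```python
import random

def assign_completely_randomised(respondents, treatments, seed=None):
    """Shuffle the respondents, then deal them out evenly across treatments."""
    if len(respondents) % len(treatments) != 0:
        raise ValueError("respondents cannot be split into equal groups")
    rng = random.Random(seed)  # the seed only makes the example repeatable
    shuffled = list(respondents)
    rng.shuffle(shuffled)
    per_group = len(shuffled) // len(treatments)
    return {
        treatment: sorted(shuffled[i * per_group:(i + 1) * per_group])
        for i, treatment in enumerate(treatments)
    }

# 18 respondents, 3 versions of ad copy, hence 6 respondents per version
assignment = assign_completely_randomised(range(1, 19), [1, 2, 3], seed=42)
for adcopy, group in assignment.items():
    print(f"ADCOPY {adcopy}: respondents {group}")
```

Every respondent has the same chance of seeing any version, which is what makes the design completely randomised.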

Slide 6
The data in fig. 1 are entered into a statistical
package for performing a One-Way ANOVA,
because we have only one categorical factor (Ad copy)
at 3 levels (1, 2, 3) and one dependent variable
(Rating).
Output
The output of the computerised One-Way ANOVA
is shown in fig. 2.
Fig. 2
Source of        Sum of     DF    Mean       F       Sig.
Variation        Squares          Square             of F
Main Effects      7.000      2    3.500    1.780     .203
  ADCOPY          7.000      2    3.500    1.780     .203
Explained         7.000      2    3.500    1.780     .203
Residual         29.500     15    1.967
Total            36.500     17    2.147

Slide 6 contd.

The first column is titled Source of Variation. Under
this, labelled Main Effects, is the single independent
variable called ADCOPY.
We then go to the last column, where the significance of
the F test is given. It is .203 in this case, for the factor
ADCOPY. Because .203 is greater than 0.05, the effect of
ADCOPY is not significant at the confidence level of
95 percent (corresponding to a significance level of 0.05).
In other words, the Ratings given to the three ad copy
versions are not significantly different from each other.
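The sums of squares, mean squares and F value in fig. 2 can be recomputed from the ratings in fig. 1. The sketch below uses plain Python rather than the statistical package that produced the original output; the function name `one_way_anova` is an assumption made for illustration.

```python
# Ratings from fig. 1, grouped by ad copy version
ratings = {
    1: [6, 7, 5, 8, 8, 8],
    2: [4, 4, 5, 7, 7, 6],
    3: [5, 5, 4, 7, 8, 7],
}

def one_way_anova(groups):
    """Return sums of squares, degrees of freedom, mean squares and F."""
    values = [x for g in groups.values() for x in g]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    ss_total = sum((x - grand_mean) ** 2 for x in values)
    # Between-groups SS: group size times the squared distance of each
    # group mean from the grand mean
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    ss_within = ss_total - ss_between
    df_between, df_within = k - 1, n - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss": (ss_between, ss_within, ss_total),
        "df": (df_between, df_within, n - 1),
        "ms": (ms_between, ms_within),
        "f": ms_between / ms_within,
    }

result = one_way_anova(ratings)
labels = ("ADCOPY", "Residual", "Total")
for label, ss, df in zip(labels, result["ss"], result["df"]):
    print(f"{label:<10}{ss:>8.3f}{df:>4}{ss / df:>8.3f}")
print(f"F = {result['f']:.3f}")
```

The printed rows agree with the ADCOPY, Residual and Total rows of fig. 2. The significance of F is not computed here; it requires the F distribution, which statistical packages supply.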

Slide 7

Simply computing and comparing the mean ratings for each ad
copy would not have told us this.
For example, the ratings for the ad copy version 1 are
6,7,5,8,8,8 and the mean rating is (6+7+5+8+8+8) / 6, or 42/6
= 7. Similarly, the mean rating of ad copy version 2 is
(4+4+5+7+7+6) / 6, or 33/6 = 5.5. The mean rating for ad
copy version 3 is (5+5+4+7+8+7) / 6, or 36/6 = 6.
At a glance, the three mean ratings appear to be different: 7,
5.5 and 6. But the ANOVA tells us that this difference is not
statistically significant at the 95 percent confidence level.
It does this by performing an F-test. The null hypothesis for
this F-test is that there is no difference in the mean
ratings for the three ad copy versions (H0: M1 = M2 = M3,
where M1, M2 and M3 are the mean ratings for the three
versions of ad copy). Thus, in this case, we have failed to
reject the null hypothesis (loosely, we have "accepted" it) at
the 95 percent confidence level.
If the significance of F in the last column of fig. 2 had been
less than 0.05, we would have rejected the null hypothesis. In
that case, we would have concluded that significant
differences exist between mean ratings given to the three ad
copy versions.

Slide 8
1. Randomised Block Design:
Let us continue with the same input data as in fig. 1,
with one more column added to it. This dataset is
shown in fig. 3.
Fig. 3
Sr. No.   Adcopy   Rating   Magazine
  1         1       6.00       1
  2         1       7.00       2
  3         1       5.00       3
  4         1       8.00       4
  5         1       8.00       5
  6         1       8.00       6
  7         2       4.00       1
  8         2       4.00       2
  9         2       5.00       3
 10         2       7.00       4
 11         2       7.00       5
 12         2       6.00       6
 13         3       5.00       1
 14         3       5.00       2
 15         3       4.00       3
 16         3       7.00       4
 17         3       8.00       5
 18         3       7.00       6

Slide 8 contd..
We have made a slightly different assumption in this
case. We assume that the three versions of the adcopy
were each used in 6 different magazines. These six
magazines are coded 1, 2, 3, 4, 5, 6 and appear in the
column titled magazine. Out of the people who saw
these ads, 18 respondents are picked at random, one
for each version of the ad in each magazine. Thus, we
have one respondent who has seen a given version of
the ad in a given magazine. In other words, we have
one respondent for every combination of magazine
and adcopy.

Slide 9
Hypothesis
1. The assignment of our sample of 18 in the above manner
assumes that the magazine in which the version of adcopy
appears may have an impact on the ratings. We can test this
hypothesis - in fact, two hypotheses - by doing an ANOVA
with a randomised block design.
2. For this purpose, we use the variable Rating as the
dependent variable, and Adcopy as the factor, and
Magazine as the block.
3. A block is defined as some variable which could affect the
relationship between the independent factor and the
dependent variable under study in an ANOVA. In our
example, the magazine in which the advertisement appears
could influence the Rating given to Adcopy by the
respondents. We are trying to remove the effect of the
magazine used, by "blocking" its effect, or treating the block
separately.
4. If we do not block on a variable, its effect gets included
with the error (residual) term. This may lead to wrong
conclusions about the relationship between the independent
and dependent variables. In that sense, a randomised block
design is more "powerful" than a simple one-way ANOVA, if
the block effect is significantly influencing the relationship.
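Point 4 can be checked numerically on the data of fig. 3. The sketch below (plain Python, with variable names chosen here for illustration) shows that the residual of the one-way ANOVA contains the variation due to Magazine, and that blocking takes it out of the error term.

```python
# Ratings from fig. 3: one row per magazine (1-6), one column per ADCOPY (1-3)
ratings = [
    [6, 4, 5],
    [7, 4, 5],
    [5, 5, 4],
    [8, 7, 7],
    [8, 7, 8],
    [8, 6, 7],
]

values = [x for row in ratings for x in row]
grand_mean = sum(values) / len(values)
n_blocks, n_treatments = len(ratings), len(ratings[0])

ss_total = sum((x - grand_mean) ** 2 for x in values)
# Variation between ADCOPY column means
ss_adcopy = n_blocks * sum(
    (sum(col) / n_blocks - grand_mean) ** 2 for col in zip(*ratings)
)
# Variation between Magazine row means
ss_magazine = n_treatments * sum(
    (sum(row) / n_treatments - grand_mean) ** 2 for row in ratings
)

residual_unblocked = ss_total - ss_adcopy            # one-way ANOVA error term
residual_blocked = residual_unblocked - ss_magazine  # error term after blocking

print(f"Residual without blocking: {residual_unblocked:.2f}")
print(f"Magazine (block) effect:   {ss_magazine:.2f}")
print(f"Residual with blocking:    {residual_blocked:.2f}")
```

Most of the unblocked residual turns out to be variation between magazines, so the blocked error term is much smaller.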

Slide 10

Output
The computer output for this problem using a randomised
block design is shown in fig. 4.
Fig. 4
Tests of significance for RATING using UNIQUE sums of
squares.
Source of        SS      DF     MS        F      Sig
Variation                                        of F
Residual         3.67    10     .37
Adcopy           7.00     2    3.50     9.55     .005
Magazine        25.83     5    5.17    14.09     .000
(Model)         32.83     7    4.69    12.79     .000
(Total)         36.50    17    2.15

This table is similar to the output table of the one-way ANOVA
we got earlier (fig. 2), except that there is an additional source of
variation called Magazine in the first column of fig. 4. This is
the block we have used. We now test two null hypotheses:
1. The first null hypothesis is that the mean rating of the ADCOPY
is the same for all 3 versions. This is the same as the null
hypothesis we had used earlier for the one-way ANOVA.
2. The second null hypothesis is that the block used (Magazine in
this case) has no effect on mean ratings given to ADCOPY
versions by respondents.
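The Adcopy, Magazine and Residual rows of fig. 4 can be reproduced with a short calculation. This is a sketch in plain Python, not the package output itself, and it assumes exactly one observation per magazine and ad copy combination, as in fig. 3. The function name `randomised_block_anova` is hypothetical.

```python
# Ratings from fig. 3: one row per magazine (block), one column per ADCOPY
ratings = [
    [6, 4, 5],
    [7, 4, 5],
    [5, 5, 4],
    [8, 7, 7],
    [8, 7, 8],
    [8, 6, 7],
]

def randomised_block_anova(table):
    """ANOVA with one observation per block (row) and treatment (column).

    Returns, for each source, a tuple (sum of squares, DF, F); F is None
    for the residual.
    """
    b, t = len(table), len(table[0])
    values = [x for row in table for x in row]
    grand_mean = sum(values) / (b * t)
    ss_total = sum((x - grand_mean) ** 2 for x in values)
    ss_treat = b * sum((sum(c) / b - grand_mean) ** 2 for c in zip(*table))
    ss_block = t * sum((sum(r) / t - grand_mean) ** 2 for r in table)
    ss_resid = ss_total - ss_treat - ss_block
    df_treat, df_block = t - 1, b - 1
    df_resid = df_treat * df_block
    ms_resid = ss_resid / df_resid
    return {
        "treatment": (ss_treat, df_treat, ss_treat / df_treat / ms_resid),
        "block": (ss_block, df_block, ss_block / df_block / ms_resid),
        "residual": (ss_resid, df_resid, None),
    }

anova = randomised_block_anova(ratings)
names = {"treatment": "Adcopy", "block": "Magazine", "residual": "Residual"}
for key, (ss, df, f) in anova.items():
    f_text = "" if f is None else f"{f:8.2f}"
    print(f"{names[key]:<10}{ss:>8.2f}{df:>4}{ss / df:>8.2f}{f_text}")
```

Each F value is the mean square of the source divided by the residual mean square, which is how the F column of fig. 4 is obtained.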

Slide 11
1. To test if the null hypotheses are rejected or not, we turn to
the last column of fig. 4, which gives the significance of the
F-test for each source of variation. We will assume we wanted
to test these hypotheses at the 95 percent confidence level.
2. We know that the significance level of F in the last column
should be less than 0.05 for the null hypothesis to be rejected.
We see that for both the rows labelled ADCOPY and
MAGAZINE, the significance of F is less than .05. It is .005
for ADCOPY and .000 for MAGAZINE. This means that
both the null hypotheses are rejected.
3. We conclude that the mean ratings given to the 3 versions
of ADCOPY are significantly different, and also that the
MAGAZINE in which the ADCOPY appears has an impact
on its rating.
4. Please note that considering the Blocking Factor
separately has now led us to a different conclusion from that
in a completely randomised test of the same basic data. This
makes the randomised block test a better test when we
suspect that a blocking factor affects the relationship between
the independent variable and the dependent variable.
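The change in conclusion comes entirely from the smaller error term: the ADCOPY mean square is 3.50 in both designs. The sketch below recomputes the significance of F for ADCOPY under both designs. It relies on a closed form for the upper tail of the F distribution that holds only when the numerator has exactly 2 degrees of freedom, which is the case for ADCOPY with its 3 levels; the function name `sig_of_f` is an assumption made here, and a statistical package would be used for the general case.

```python
def sig_of_f(f_value, df_error):
    """P(F > f_value) for an F distribution with (2, df_error) DF.

    Valid only when the numerator has exactly 2 degrees of freedom.
    """
    return (1.0 + 2.0 * f_value / df_error) ** (-df_error / 2.0)

ms_adcopy = 7.000 / 2  # ADCOPY mean square, identical in fig. 2 and fig. 4

# design -> (residual sum of squares, residual DF) as printed in the figures;
# fig. 4 is rounded to two decimals, so F may differ in the last digit
designs = {
    "completely randomised": (29.500, 15),
    "randomised block": (3.67, 10),
}

results = {}
for name, (ss_resid, df_resid) in designs.items():
    f_value = ms_adcopy / (ss_resid / df_resid)
    results[name] = (f_value, sig_of_f(f_value, df_resid))
    print(f"{name:<24} F = {f_value:5.2f}   Sig of F = {results[name][1]:.3f}")
```

The significance values come out close to the .203 and .005 reported in fig. 2 and fig. 4, on either side of the 0.05 cut-off.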

Slide 12

Latin Square Design


The Latin Square Design is an extension of the
Randomised Block Design. It consists of one independent
variable (FACTOR) and two Blocks, instead of one which
we saw in the Randomised Block Design. It has no
special significance in marketing research, so we will
move on to the more general case of a factorial design
where any number of factors can be tested simultaneously
for their effects on the dependent variable.
Factorial Designs

This type of design is employed when we have 2 or more
independent variables or factors. The major advantage of
this design is that multiple factors can be simultaneously
tested. There are two kinds of effects that we can test.
One is called the Main Effect. The second is called the
Interaction Effect. To illustrate, we will take up an
example.

Slide 13

Worked Example
In this example, we assume that we are testing, for a toilet
soap brand, the effect of two Factors (independent variables),
Pack Design and Price, on Sales (dependent variable).
We would like to know (1) if each of the Factors
independently affects Sales (called the Main Effects), and (2)
if there is a combined effect of Pack Design and Price (called
the 2-way Interaction Effect) on Sales.
Incidentally, if there are 3 factors in a study, then we could
test for all 2-way interaction effects and the 3-way
interaction effect, in addition to the Main Effects of the
individual factors.
To continue with our example, the experiment is conducted
in a simulated environment on 18 randomly selected
respondents. There are 3 levels of Price (Rs. 8, Rs. 11 and
Rs. 14) and 3 levels of Pack Design, designated by the main
colours used (Blue, Red and Green).
The coding of these variables is 1, 2, 3 respectively for Rs.
8, 11 and 14 in the case of Price, and 1, 2, 3 for Blue, Red
and Green in the case of Pack Design.
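The coding scheme and the resulting treatment combinations can be written out as follows. The dictionary names are assumptions made for illustration; the codes and labels are those given above.

```python
from itertools import product

# Level codes as described above
price_codes = {1: "Rs. 8", 2: "Rs. 11", 3: "Rs. 14"}
pack_codes = {1: "Blue", 2: "Red", 3: "Green"}

# A 3 x 3 factorial design has one treatment per (pack design, price) pair
treatments = list(product(pack_codes, price_codes))
for pack, price in treatments:
    print(f"packdesn={pack} ({pack_codes[pack]:<5})  "
          f"price={price} ({price_codes[price]})")
print(f"{len(treatments)} treatment combinations")
```

With 3 levels of each factor there are 9 combinations to be tested.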

Slide 14

Input Data

The input dataset is shown in fig. 5.


Fig. 5.
Sr. No.   Sales   Packdesn   Price
  1        500       1         1
  2        440       2         1
  3        360       3         1
  4        300       1         2
  5        280       2         2
  6        250       3         2
  7        200       1         3
  8        150       2         3
  9        250       3         3
 10        600       1         1
 11        450       2         1
 12        510       3         1
 13        400       1         2
 14        350       2         2
 15        300       3         2
 16        250       1         3
 17        275       2         3
 18        220       3         3
After the serial number, the columns are Sales, Pack Design and
Price. Please note that even though Price is a continuous metric
variable, for the purpose of ANOVA, being an independent
variable, it has to be treated as a categorical variable. Hence the
coding (1, 2, 3) for Price.

Slide 15
Also note from fig.5 that each combination of Price and Pack
Design appears twice in the dataset. For example, Packdesign =
1 and Price = 1 appears in Row 1 and also Row 10. This is
known as a replication in design of experiments. This is similar
to having a higher sample size in a survey.
Depending on the number of Factors and the number of levels
of each Factor, the minimum sample size required for ANOVA
may go up. In such cases, multiple observations or replications
become necessary. In general, replications reduce chances of
random error affecting the results of ANOVA experiments,
similar to the effects of increasing sample size in surveys.
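The replication structure of fig. 5 can be verified by counting how often each combination occurs. A minimal sketch, with variable names chosen here for illustration:

```python
from collections import Counter, defaultdict

# (packdesn, price) codes for rows 1 to 18 of fig. 5, in order
design_rows = [
    (1, 1), (2, 1), (3, 1), (1, 2), (2, 2), (3, 2), (1, 3), (2, 3), (3, 3),
    (1, 1), (2, 1), (3, 1), (1, 2), (2, 2), (3, 2), (1, 3), (2, 3), (3, 3),
]

replications = Counter(design_rows)

# Row numbers (1-based) in which each combination appears
rows_by_cell = defaultdict(list)
for row_number, cell in enumerate(design_rows, start=1):
    rows_by_cell[cell].append(row_number)

for cell in sorted(replications):
    print(f"packdesn={cell[0]} price={cell[1]}: "
          f"{replications[cell]} observations, rows {rows_by_cell[cell]}")

# A design is balanced when every combination is replicated equally often
is_balanced = len(set(replications.values())) == 1
print("Balanced design:", is_balanced)
```

Equal replication in every cell is what allows the simple sums-of-squares decomposition used for a factorial ANOVA.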
Output:
The output data for our factorial experiment are presented in
fig. 6.
Fig. 6
Source of               Sum of         DF    Mean Square      F       Sig
Variation               Squares                                       of F
Main Effects          209305.556        4      52326.389    13.645    .001
  Packdesn             12536.111        2       6268.056     1.635    .248
  Price               196769.444        2      98384.722    25.656    .000
2-Way Interactions      9838.889        4       2459.722      .641    .646
  Packdesn x Price      9838.889        4       2459.722      .641    .646
Explained             219144.444        8      27393.056     7.143    .004
Residual               34512.500        9       3834.722
Total                 253656.944       17      14920.997

Slide 16

Let us first look at the Sources of Variation listed in the first
column. The last source of variation listed, apart from the
Total, is the Residual or error term. But we are interested in
the two Main Effects and one Interaction Effect.
In this case, we are testing three null hypotheses:
1. The mean level of Sales remains the same
for all 3 levels of Pack Design (Main Effect 1).
2. The mean level of Sales remains the same
for all 3 levels of Price (Main Effect 2).
3. Pack Design and Price do not interact, that is, the
effect of one factor on mean Sales is the same at every
level of the other (Interaction Effect).
Assuming a 0.05 level of significance, we check whether
for each of the rows corresponding to the above
hypotheses, the significance of F is below 0.05 in the
last column of fig. 6.
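The sums of squares and F values for the three effects in fig. 6 can be recomputed from the data in fig. 5. The sketch below is plain Python, not the original package output, and its decomposition is valid only for a balanced design with equal replication in every cell, as here. The function name `two_way_anova` is hypothetical.

```python
from collections import defaultdict

# (sales, packdesn, price) for rows 1 to 18 of fig. 5
rows = [
    (500, 1, 1), (440, 2, 1), (360, 3, 1),
    (300, 1, 2), (280, 2, 2), (250, 3, 2),
    (200, 1, 3), (150, 2, 3), (250, 3, 3),
    (600, 1, 1), (450, 2, 1), (510, 3, 1),
    (400, 1, 2), (350, 2, 2), (300, 3, 2),
    (250, 1, 3), (275, 2, 3), (220, 3, 3),
]

def two_way_anova(rows):
    """Two-way ANOVA with interaction for a balanced factorial design."""
    n = len(rows)
    grand_mean = sum(sales for sales, _, _ in rows) / n

    def between_ss(key):
        # Group sales by the given key and measure variation of group means
        groups = defaultdict(list)
        for sales, pack, price in rows:
            groups[key(pack, price)].append(sales)
        ss = sum(
            len(g) * (sum(g) / len(g) - grand_mean) ** 2
            for g in groups.values()
        )
        return ss, len(groups)

    ss_total = sum((sales - grand_mean) ** 2 for sales, _, _ in rows)
    ss_pack, a = between_ss(lambda pack, price: pack)
    ss_price, b = between_ss(lambda pack, price: price)
    ss_cells, n_cells = between_ss(lambda pack, price: (pack, price))
    ss_inter = ss_cells - ss_pack - ss_price   # what the main effects leave over
    ms_resid = (ss_total - ss_cells) / (n - n_cells)

    effects = {
        "Packdesn": (ss_pack, a - 1),
        "Price": (ss_price, b - 1),
        "Packdesn x Price": (ss_inter, (a - 1) * (b - 1)),
    }
    return {
        name: (ss, df, ss / df / ms_resid) for name, (ss, df) in effects.items()
    }

anova = two_way_anova(rows)
for name, (ss, df, f) in anova.items():
    print(f"{name:<18}{ss:>12.3f}{df:>4}{f:>9.3f}")
```

The three printed rows correspond to the Packdesn, Price and 2-way interaction rows of fig. 6.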

Slide 17
We find that the significance of F values are:
Pack Design - .248 (Main Effect 1)
Price - .000 (Main Effect 2)
Pack Design by Price - .646 (Interaction Effect)
Therefore, only the Price effect, one of the two main
effects, is statistically significant at the 95 percent
confidence level. This means that hypothesis no. 2 is
rejected.
Hypotheses 1 and 3 cannot be rejected, as the
significance of F values are greater than .05 in both
cases (.248 and .646 respectively).
Thus, we conclude that Price alone has an impact on
Sales. Neither Pack Design alone nor the combination of
Pack Design with Price has any significant impact on
Sales of the toilet soap.

Slide 18

Additional Comments

Experiments are today widely used in many ways in Marketing
Research. For example, test marketing of new concepts,
products or prototypes is usually done through procedures
explained above, or similar to these.
STM or Simulated Test Marketing procedures are extensions of
the basic ANOVA type experiments, with the added tools of
forecasting based on the results of the experiments conducted.
Separate software packages are now available for many
specialised applications such as STM.
Pairwise Tests
If any main effect/interaction effect turns out significant, and has
more than two levels, there is one additional test required to
check for pairwise differences in the means.
For instance, in our example of one-way ANOVA, if the mean
Ratings had turned out to be significantly different at the 95
percent confidence level, we still would not know whether only
one of the pairs (say, ADCOPY 1 and ADCOPY 2) is
significantly different, or if the remaining pairs
(ADCOPY 1 and 3, and ADCOPY 2 and 3) are also significantly
different.
To find out, we can use tests such as Tukey's Test, Duncan's Test
or Scheffe's Test. These can be requested while doing the
ANOVA on most computer packages. These tests give us a
pairwise test result of significant difference among means.
These are meaningful only if the F test value for a main
effect/interaction effect with more than two levels turns out to be
significant.
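As an illustration, Scheffe's Test can be applied to Price, the one effect that was significant in the factorial example. The sketch below is an assumption-laden example rather than output from any package: it uses the standard Scheffe criterion for a pairwise contrast, the residual mean square and DF from fig. 6, and level means computed from fig. 5. The closed form in `sig_of_f` holds only because Price has 3 levels, giving 2 numerator degrees of freedom.

```python
from itertools import combinations

# Mean Sales per Price level: total of the six observations in fig. 5 / 6
price_means = {"Rs. 8": 2860 / 6, "Rs. 11": 1880 / 6, "Rs. 14": 1345 / 6}
n_per_level = 6
ms_residual = 3834.722   # residual mean square from fig. 6
df_residual = 9          # residual DF from fig. 6
alpha = 0.05

def sig_of_f(f_value, df_error):
    """P(F > f_value) for (2, df_error) DF; valid only for 2 numerator DF."""
    return (1.0 + 2.0 * f_value / df_error) ** (-df_error / 2.0)

k = len(price_means)  # number of levels, so k - 1 = 2 numerator DF
results = {}
for (a, mean_a), (b, mean_b) in combinations(price_means.items(), 2):
    diff = mean_a - mean_b
    # Scheffe statistic: squared difference over its error variance,
    # divided by (k - 1), referred to F with (k - 1, df_residual) DF
    f_scheffe = diff ** 2 / (ms_residual * (2 / n_per_level)) / (k - 1)
    sig = sig_of_f(f_scheffe, df_residual)
    results[(a, b)] = (diff, sig)
    verdict = "different" if sig < alpha else "not significantly different"
    print(f"{a} vs {b}: difference {diff:7.2f}, Sig {sig:.3f}, {verdict}")
```

Under these assumptions, mean Sales at Rs. 8 differ significantly from those at both Rs. 11 and Rs. 14, while the difference between Rs. 11 and Rs. 14 does not reach the 0.05 level. Tukey's and Duncan's tests use different reference distributions and may give different verdicts for borderline pairs.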
