
Discriminant analysis

Chapter 11

Statistics for Marketing & Consumer Research


Copyright 2008 - Mario Mazzocchi

Two-group discriminant analysis

Discriminant analysis (DA) is a statistical procedure that classifies cases into the separate categories to which they belong, on the basis of a set of independent variables called predictors or discriminant variables.
The target variable (the one determining allocation into groups) is qualitative (nominal or ordinal), while the characteristics are measured by quantitative variables.
Two-group DA looks at the discrimination between two groups.
Multiple discriminant analysis (MDA) allows for classification into three or more groups.

Applications of DA
DA is especially useful for understanding the differences and factors that lead consumers to make different choices, allowing marketers to develop strategies that take proper account of the role of the predictors.
Examples
Determinants of customer loyalty
Shopper profiling and segmentation
Determinants of purchase and non-purchase

Example on the Trust data-set

Purchasers of chicken at the butcher's shop (recorded in question q8d)
Respondents may belong to one of two groups:
those who purchase chicken at the butcher's shop
those who do not

Discrimination between these groups through a set of consumer characteristics:
expenditure on chicken in a standard week (q5)
age of the respondent (q51)
whether respondents agree (on a seven-point ranking scale) that butchers sell safe chicken (q21d)
trust (on a seven-point ranking scale) towards supermarkets (q43b)

Does a linear combination of these four characteristics allow one to discriminate between those who buy chicken at the butcher's and those who do not?

Discriminant analysis (DA)
With two groups only, there is a single discriminating value (discriminating score).
For each respondent a score is computed using the estimated linear combination of the predictors (the discriminant function).
Respondents with a score above the discriminating value are expected to belong to one group, those below to the other group.
When the discriminant score is standardized to have zero mean and unit variance, it is called the Z score.
DA also provides information about the discriminating power of each of the original predictors.

Multiple discriminant analysis (MDA) (1)


Discriminant analysis may involve more than two
groups, in which case it is termed multiple
discriminant analysis (MDA).
Example from the Trust data-set
Dependent variable: Type of chicken purchased in a
typical week, choosing among four categories: value
(good value for money), standard, organic and luxury
Predictors: age (q50), stated relevance of taste (q24a),
value for money (q24b) and animal welfare (q24k), plus
an indicator of income (q60)


Multiple discriminant analysis (MDA) (2)

In this case there will be more than one discriminant function.
The number of discriminant functions is equal to the smaller of (g-1), where g is the number of categories in the classification, and k, the number of independent variables.
Trust example: with four groups and five explanatory variables, the number of discriminant functions is three (that is, g-1 = 3, which is smaller than k = 5).
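
The counting rule above can be written as a one-line helper. This is only a minimal sketch; the function name `n_discriminant_functions` is an illustrative choice and does not come from the slides.

```python
def n_discriminant_functions(n_groups: int, n_predictors: int) -> int:
    """Number of discriminant functions: the smaller of (g - 1) and k."""
    return min(n_groups - 1, n_predictors)


# Trust example for MDA: four chicken types (g = 4), five predictors (k = 5)
mda_functions = n_discriminant_functions(4, 5)

# Two-group example: butcher purchasers vs non-purchasers, four predictors
two_group_functions = n_discriminant_functions(2, 4)

print(mda_functions, two_group_functions)
```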

The output of MDA

Similarities with factor (principal component) analysis:
The first discriminant function is the most relevant for discriminating across groups, the second is the second most relevant, etc.
The discriminant functions are also independent, which means that the resulting scores are uncorrelated.
Once the coefficients of the discriminant functions are estimated and standardized, they are interpreted in a similar fashion to factor loadings.
The larger the standardized coefficients (in absolute terms), the more relevant the respective variables are to discriminating between groups.

There is no single discriminant score in MDA:
Group means (centroids) are computed for each of the discriminant functions to give a clearer view of the classification rule.

Running discriminant analysis (two groups)

Discriminant function (target variable: purchasers of chicken at the butcher's shop):

$z = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3 + \alpha_4 x_4$

z is the discriminant score.
The discriminant coefficients $\alpha_0, \dots, \alpha_4$ need to be estimated.
Predictors ($x_1$ to $x_4$):
weekly expenditure on chicken
age
safety of butcher's chicken
trust in supermarkets


Fisher's linear discriminant analysis

The discriminant function is the starting point.

Two key assumptions behind linear DA:
(a) the predictors are normally distributed;
(b) the covariance matrices for the predictors within each of the groups are equal.

Departure from condition (a) suggests the use of alternative methods (logistic regression, see lecture 16).
Departure from condition (b) requires the use of different discriminant techniques (usually quadratic discriminant functions).
In most empirical cases, the use of linear DA is appropriate.


Estimation
The first step is the estimation of the coefficients, also termed discriminant coefficients or weights.
Estimation is similar to factor analysis or PCA, as the coefficients are those which maximize the variability between groups (relative to the variability within groups).
In MDA the first discriminating function is the one with the highest between-group variability, the second discriminating function is independent of the first and maximizes the remaining between-group variability, and so on.
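
To make the estimation idea concrete, here is a minimal sketch of the two-group case, assuming NumPy is available. It uses the textbook Fisher solution, in which the weights are proportional to the inverse of the pooled within-group sum-of-squares matrix times the difference in group means. The data are synthetic and the function name `fisher_direction` is hypothetical; this is not the SPSS routine and it does not use the Trust data.

```python
import numpy as np


def fisher_direction(X, y):
    """Two-group Fisher weights: proportional to W^{-1} (m1 - m0)."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-group sum of squares and cross-products matrix
    W = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    w = np.linalg.solve(W, m1 - m0)
    return w / np.linalg.norm(w)  # scale is arbitrary, so normalise


# Synthetic data: the groups differ only on the first predictor
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal([0.0, 0.0], 1.0, size=(100, 2)),
    rng.normal([2.0, 0.0], 1.0, size=(100, 2)),
])
y = np.repeat([0, 1], 100)

w = fisher_direction(X, y)
scores = X @ w  # one discriminant score per case
print(w, scores[y == 0].mean(), scores[y == 1].mean())
```

The weight on the first predictor dominates because that is the only variable whose mean differs between the two synthetic groups.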

SPSS two groups case

1. Choose the target variable
2. Define the range of the dependent variable
3. Select the predictors


Coefficient estimates
Additional statistics and diagnostics

Fisher's and standardized estimates of the discriminant function coefficients need to be requested explicitly.


Classification options

Decide whether prior probabilities are equal across groups or whether group sizes reflect different allocation probabilities.
These are diagnostic indicators to evaluate how well the discriminant function predicts the groups.

Save classification

Create new variables in the data-set, containing the predicted group membership and/or the discriminant score for each case and each function.


Output coefficient estimates

Canonical Discriminant Function Coefficients

                                                  Function 1
In a typical week how much do you spend
on fresh or frozen chicken (Euro)?                     .095
From the butcher                                       .454
Supermarkets                                          -.297
Age                                                    .025
(Constant)                                           -2.515

Unstandardized coefficients
Unstandardized coefficients depend on the measurement unit.

Standardized Canonical Discriminant Function Coefficients

                                                  Function 1
In a typical week how much do you spend
on fresh or frozen chicken (Euro)?                     .378
From the butcher                                       .748
Supermarkets                                          -.453
Age                                                    .394

Standardized coefficients do not depend on the measurement unit.
Most important predictor: "From the butcher" (largest standardized coefficient in absolute value, .748).
Trust in supermarkets has a negative sign (thus it reduces the discriminant score).
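
As an illustration of how the unstandardized coefficients reported above turn into a discriminant score, the sketch below plugs them into the linear function. The respondent's answers are invented for the example, and the short variable names are hypothetical labels. It also assumes that "From the butcher" is the butcher-safety item (q21d) and "Supermarkets" is the trust item (q43b), and that a score above the zero cut-off reported for this two-group example points to the "yes" group (the one with the positive centroid).

```python
# Unstandardized coefficients taken from the SPSS output above
COEFFICIENTS = {
    "expenditure": 0.095,         # weekly spend on chicken (Euro)
    "butcher_safe": 0.454,        # agreement that butchers sell safe chicken
    "trust_supermarkets": -0.297, # trust in supermarkets
    "age": 0.025,
}
CONSTANT = -2.515


def discriminant_score(respondent):
    """Constant plus the weighted sum of the respondent's predictor values."""
    return CONSTANT + sum(
        COEFFICIENTS[name] * respondent[name] for name in COEFFICIENTS
    )


# Hypothetical respondent, invented purely for illustration
respondent = {
    "expenditure": 10.0,
    "butcher_safe": 6,
    "trust_supermarkets": 4,
    "age": 40,
}

z = discriminant_score(respondent)
predicted_group = "yes" if z > 0 else "no"  # assumed cut-off of zero
print(f"z = {z:.3f}, predicted butcher purchaser: {predicted_group}")
```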

Centroids

Functions at Group Centroids

Butcher     Function 1
no             -.307
yes             .594

Unstandardized canonical discriminant functions evaluated at group means

These are the means of the discriminant score for each of the two groups.
Thus, the group of those not purchasing chicken at the butcher's shop has a negative centroid.

With two groups, the discriminating score is zero.
This can be computed by weighting the centroids with the initial probabilities.

Prior Probabilities for Groups

                      Cases Used in Analysis
Butcher     Prior     Unweighted     Weighted
no           .660        277          277.000
yes          .340        143          143.000
Total       1.000        420          420.000

From these prior probabilities it follows that the discriminating score is -0.307 x 0.66 + 0.594 x 0.34 ≈ 0.
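
The weighting of the centroids by the prior probabilities can be checked numerically. The sketch below uses only the figures reported in the two tables; the result is not exactly zero because the published centroids and priors are rounded to three decimals.

```python
# Group centroids and group sizes as reported in the SPSS output
centroids = {"no": -0.307, "yes": 0.594}
cases = {"no": 277, "yes": 143}

# Priors as printed (rounded) and as implied by the group sizes
rounded_priors = {"no": 0.660, "yes": 0.340}
total_cases = sum(cases.values())
exact_priors = {group: n / total_cases for group, n in cases.items()}

cutoff_rounded = sum(rounded_priors[g] * centroids[g] for g in centroids)
cutoff_exact = sum(exact_priors[g] * centroids[g] for g in centroids)

print(f"{cutoff_rounded:.4f} {cutoff_exact:.4f}")
```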

Output classification success

Classification Results (a)

                                      Predicted Group Membership
                  Butcher               no         yes        Total
Original  Count   no                   244          33         277
                  yes                   88          55         143
                  Ungrouped cases        1           1           2
          %       no                  88.1        11.9       100.0
                  yes                 61.5        38.5       100.0
                  Ungrouped cases     50.0        50.0       100.0

a. 71.2% of original grouped cases correctly classified.

Using the discriminant function, it is possible to correctly classify 71.2% of the original cases: (244 no-no + 55 yes-yes)/420.
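
The same arithmetic can be reproduced from the classification counts. This short sketch also computes the share of correct predictions within each actual group, which makes visible that the function classifies non-purchasers far better than purchasers. The dictionary layout is an illustrative choice.

```python
# counts[actual][predicted], grouped cases only (ungrouped cases excluded)
counts = {
    "no": {"no": 244, "yes": 33},
    "yes": {"no": 88, "yes": 55},
}

total = sum(sum(row.values()) for row in counts.values())
correct = sum(counts[group][group] for group in counts)
hit_ratio = correct / total

# Share of each actual group that is classified correctly
per_group = {
    group: counts[group][group] / sum(counts[group].values())
    for group in counts
}

print(f"overall: {100 * hit_ratio:.1f}%")
for group, share in per_group.items():
    print(f"{group}: {100 * share:.1f}%")
```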


Diagnostics (1)
Box's M test. This tests whether covariances are equal across groups.
Wilks' Lambda (or U statistic) tests discrimination between groups. It is related to analysis of variance.
Individual Wilks' Lambda for each of the predictors in a discriminant function: univariate ANOVA (are there significant differences in the predictors' means between the groups?), p-value from the F distribution.
Wilks' Lambda for the function as a whole: are there significant differences in the group means for the discriminant function? The p-value comes from the Chi-square distribution.

The overall Wilks' Lambda is especially helpful in multiple discriminant analysis, as it allows one to discard those functions which do not contribute towards explaining differences between groups.

Diagnostics (2)
DA returns one eigenvalue of the discriminant function (or several eigenvalues, one per function, for MDA).
These can be interpreted as in principal component analysis.
In MDA (more than one discriminant function) eigenvalues are used to compute how much each function contributes to explaining variability.
The canonical correlation measures the intensity of the relationship between the groups and a single discriminant function.

Trust example: diagnostics

                                Statistic     P-value
Box's M statistic                  37.3        0.000
Overall Wilks' Lambda              0.85        0.000
Wilks' Lambda for
  Expenditure                      0.98        0.002
  Age                              0.97        0.001
  Safer for Butcher                0.91        0.000
  Trust in Supermarket             0.98        0.002
Eigenvalue                         0.18
Canonical correlation              0.39
% of correct predictions           71.2%

Box's M: covariance matrices are not equal.
Overall Wilks' Lambda: the overall discriminating power of the DF is good.
Individual Wilks' Lambdas: all of the predictors are relevant to discriminating between the two groups.
Eigenvalue: the ratio between the variance between groups and the variance within groups (the larger the better).
Canonical correlation: the square root of the ratio between the variability between groups and the total variability.
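
With a single discriminant function, the eigenvalue, the canonical correlation and the overall Wilks' Lambda are tied together by standard identities: the canonical correlation is the square root of eigenvalue / (1 + eigenvalue), and Wilks' Lambda is 1 / (1 + eigenvalue). The sketch below checks this against the figures in the table; since the published eigenvalue is rounded to two decimals, agreement is only approximate.

```python
import math

eigenvalue = 0.18  # between-group / within-group variance ratio (from the table)

# Between-group variability as a share of total variability
between_share = eigenvalue / (1 + eigenvalue)

canonical_correlation = math.sqrt(between_share)
wilks_lambda = 1 / (1 + eigenvalue)  # within-group share of total variability

print(f"canonical correlation: {canonical_correlation:.2f}")
print(f"Wilks' Lambda: {wilks_lambda:.2f}")
```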

MDA
To run MDA in SPSS, the only difference is that the range of the dependent variable has more than two categories.


Predictors

Tests of Equality of Group Means

                                   Wilks' Lambda       F       df1     df2     Sig.
Age                                    .981          1.798      3      282     .148
Tasty food                             .971          2.761      3      282     .042
Value for money                        .960          3.878      3      282     .010
Animal welfare                         .982          1.679      3      282     .172
Please indicate your gross annual
household income range                 .919          8.272      3      282     .000

Three predictors only appear to be relevant in discriminating among preferred types of chicken.

Test Results

Box's M                   65.212
F      Approx.             1.382
       df1                    45
       df2             53286.386
       Sig.                 .045

Tests null hypothesis of equal population covariance matrices.

Null rejected at 95% c.l., but not at 99% c.l.

Discriminant functions
Three discriminant functions (four groups minus one) can be estimated.

Eigenvalues

Function    Eigenvalue    % of Variance    Cumulative %    Canonical Correlation
1            .102 (a)         61.0             61.0               .304
2            .051 (a)         30.8             91.8               .221
3            .014 (a)          8.2            100.0               .116

a. First 3 canonical discriminant functions were used in the analysis.

Wilks' Lambda

Test of Function(s)    Wilks' Lambda    Chi-square    df    Sig.
1 through 3                .851           45.098      15    .000
2 through 3                .938           17.904       8    .022
3                          .986            3.818       3    .282

The first two discriminant functions have a significant discriminating power.
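
The two tables above are linked: each function's share of variance is its eigenvalue divided by the sum of the eigenvalues, and the Wilks' Lambda for "functions k through 3" is the product of 1 / (1 + eigenvalue) over those functions. The sketch below recomputes both from the three published eigenvalues. Because those are rounded to three decimals, the shares match the SPSS output only to within a few tenths of a percentage point.

```python
import math

eigenvalues = [0.102, 0.051, 0.014]  # functions 1, 2, 3 (from the table)

# Share of between-group variability attributed to each function
total = sum(eigenvalues)
pct_variance = [100 * ev / total for ev in eigenvalues]

# Wilks' Lambda for functions k..3: product of 1 / (1 + eigenvalue)
sequential_wilks = [
    math.prod(1 / (1 + ev) for ev in eigenvalues[k:])
    for k in range(len(eigenvalues))
]

for k, (pct, lam) in enumerate(zip(pct_variance, sequential_wilks), start=1):
    print(f"function {k}: {pct:.1f}% of variance, Lambda({k} through 3) = {lam:.3f}")
```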

Coefficients

Value for money is very relevant for the second function.
Income is very relevant for the first function.

Structure matrix

Structure Matrix

                                              Function
                                        1          2          3
Please indicate your gross annual
household income range                .929*      -.021       .078
Animal welfare                        .390*      -.206       .125
Value for money                      -.010        .891*      .168
Tasty food                            .241        .660*      .273
Age                                  -.217       -.204       .944*

Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions.
Variables ordered by absolute size of correlation within function.
*. Largest absolute correlation between each variable and any discriminant function.

Function labels: 1 = Income, 2 = Value and taste, 3 = Age.

The values in the structure matrix are the correlations between the individual predictors and the scores computed on the discriminant functions.
For example, the income variable has a strong correlation with the scores of the first function.
The structure matrix helps in interpreting the functions.

Centroids

Functions at Group Centroids

In a typical week, what type of fresh or
frozen chicken do you buy for your                  Function
household's home consumption?                 1          2          3
'Value' chicken                             -.673      -.262      -.040
'Standard' chicken                           .058       .156      -.065
'Organic' chicken                            .525      -.470      -.030
'Luxury' chicken                             .003       .052       .242

Unstandardized canonical discriminant functions evaluated at group means

The first function discriminates well between value and organic (income matters to organic buyers).
The second allows some discrimination between standard and organic, value and standard, and organic and luxury (taste and value matter).
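
One simplified way to read the centroid table as a classification rule is to assign a case to the group whose centroid is closest to its three discriminant scores. The sketch below does this with plain Euclidean distance. It is a rough illustration only: it ignores prior probabilities, so it should not be expected to reproduce SPSS's own classification, and the two score vectors are invented for the example.

```python
import math

# Group centroids on functions 1, 2, 3 (from the table above)
CENTROIDS = {
    "value": (-0.673, -0.262, -0.040),
    "standard": (0.058, 0.156, -0.065),
    "organic": (0.525, -0.470, -0.030),
    "luxury": (0.003, 0.052, 0.242),
}


def nearest_group(scores):
    """Return the group whose centroid is closest in discriminant space."""
    return min(CENTROIDS, key=lambda group: math.dist(scores, CENTROIDS[group]))


# Hypothetical discriminant scores for two respondents
high_on_function_1 = (0.6, -0.5, 0.0)
low_on_function_1 = (-0.7, -0.2, 0.0)

print(nearest_group(high_on_function_1))
print(nearest_group(low_on_function_1))
```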

Plot of two functions

Tick separate-groups to show graphs of the first two functions for each individual group.
The territorial map shows the scores for the first two functions considering all groups.


Plots: individual groups

Example: organic chicken
Most cases tend to be relatively high on function 1 (income).


Plots: all groups


Prediction results

Classification Results (a)

Grouping variable: In a typical week, what type of fresh or frozen chicken do you buy for your household's home consumption?

                                          Predicted Group Membership
                                   'Value'    'Standard'   'Organic'   'Luxury'
                                   chicken    chicken      chicken     chicken     Total
Original  Count  'Value' chicken       3          38           0           0         41
                 'Standard' chicken    2         154           1           0        157
                 'Organic' chicken     1          30           4           0         35
                 'Luxury' chicken      1          51           1           0         53
                 Ungrouped cases       0          51           3           0         54
          %      'Value' chicken      7.3        92.7          .0          .0      100.0
                 'Standard' chicken   1.3        98.1          .6          .0      100.0
                 'Organic' chicken    2.9        85.7        11.4          .0      100.0
                 'Luxury' chicken     1.9        96.2         1.9          .0      100.0
                 Ungrouped cases       .0        94.4         5.6          .0      100.0

a. 56.3% of original grouped cases correctly classified.

The functions do not predict well: most units are allocated to standard chicken, and overall only 56.3% of the cases are allocated correctly.
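
To put the 56.3% in context, the sketch below recomputes it from the counts in the table and compares it with a naive rule that assigns every grouped case to the largest group (standard chicken). The comparison with this baseline is an added illustration, not part of the SPSS output.

```python
groups = ["value", "standard", "organic", "luxury"]

# counts[i][j]: cases actually in groups[i] that were predicted as groups[j]
counts = [
    [3, 38, 0, 0],
    [2, 154, 1, 0],
    [1, 30, 4, 0],
    [1, 51, 1, 0],
]

total = sum(sum(row) for row in counts)
correct = sum(counts[i][i] for i in range(len(groups)))
hit_ratio = correct / total

# How many cases end up predicted in each group
predicted_totals = [sum(row[j] for row in counts) for j in range(len(groups))]
share_predicted_standard = predicted_totals[1] / total

# Naive baseline: always predict the largest actual group
majority_baseline = max(sum(row) for row in counts) / total

print(f"hit ratio: {100 * hit_ratio:.1f}%")
print(f"predicted as standard: {100 * share_predicted_standard:.1f}%")
print(f"majority baseline: {100 * majority_baseline:.1f}%")
```

On these counts the discriminant functions improve on the always-standard rule by little more than one percentage point, which is consistent with the slide's remark that they do not predict well.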

Stepwise discriminant analysis

As for linear regression, it is possible to decide whether all predictors should appear in the equation regardless of their role in discriminating (the Enter option) or whether a subset of predictors is chosen on the basis of their contribution to discriminating between groups (the Stepwise method).


The step-wise method

1. A one-way ANOVA test is run on each of the predictors, where the target grouping variable determines the treatment levels. The ANOVA test provides a criterion value and a test statistic (usually the Wilks' Lambda). According to the criterion value, it is possible to identify the predictor which is most relevant in discriminating between the groups.
2. The predictor with the lowest Wilks' Lambda (or which meets an alternative optimality criterion) enters the discriminating function, provided the p-value is below the set threshold (for example 5%).
3. An ANCOVA test is run on the remaining predictors, where the target grouping variable is the factor and the predictors that have already entered the model are the covariates. The Wilks' Lambda is computed for each of the ANCOVA options.
4. Again, the criterion and the p-value determine which variable (if any) enters the discriminating function (and possibly whether some of the entered variables should leave the model).
5. The procedure goes back to step 3 and continues until none of the excluded variables has a p-value below the threshold and none of the entered variables has a p-value above the threshold (the stopping rule is met).
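
The steps above can be sketched in code, with some loudly stated simplifications. The version below, which assumes NumPy, only adds variables (the removal check in steps 4 and 5 is omitted), and it uses an F-to-enter threshold instead of a p-value. The value 3.84 is an assumed illustrative threshold, the function names are hypothetical, and the data are synthetic rather than the Trust data. The partial F statistic is the usual one based on the ratio of Wilks' Lambdas with and without the candidate variable.

```python
import numpy as np


def wilks_lambda(X, groups):
    """Wilks' Lambda = det(W) / det(T) for the predictor columns in X."""
    centered = X - X.mean(axis=0)
    T = centered.T @ centered            # total sum of squares and products
    W = np.zeros_like(T)                 # pooled within-group counterpart
    for g in np.unique(groups):
        dev = X[groups == g] - X[groups == g].mean(axis=0)
        W += dev.T @ dev
    return np.linalg.det(W) / np.linalg.det(T)


def forward_stepwise(X, groups, f_to_enter=3.84):
    """Forward selection only: add the predictor with the largest partial F."""
    n, k = X.shape
    g = len(np.unique(groups))
    selected, current_lambda = [], 1.0
    while len(selected) < k:
        q = len(selected)                # predictors already in the model
        candidates = []
        for j in sorted(set(range(k)) - set(selected)):
            lam = wilks_lambda(X[:, selected + [j]], groups)
            partial = lam / current_lambda
            f_stat = (1 - partial) / partial * (n - g - q) / (g - 1)
            candidates.append((f_stat, j, lam))
        f_stat, j, lam = max(candidates)
        if f_stat < f_to_enter:          # stopping rule: nothing qualifies
            break
        selected.append(j)
        current_lambda = lam
    return selected


# Synthetic data: three groups, only predictor 0 separates them
rng = np.random.default_rng(0)
groups = np.repeat([0, 1, 2], 50)
X = rng.normal(size=(150, 3))
X[:, 0] += 2.0 * groups

selected = forward_stepwise(X, groups)
print(selected)
```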


Alternative criteria

Unexplained variance
Smallest F ratio
Mahalanobis distance
Rao's V


In SPSS

The step-wise method allows selection of relevant predictors.


Output of the step-wise method

Variables in the Analysis

Step   Variable                                                    Tolerance   F to Remove   Wilks' Lambda
1      Please indicate your gross annual household income range      1.000        8.272
2      Please indicate your gross annual household income range      1.000        8.241          .960
       Value for money                                               1.000        3.863          .919

Only two predictors are kept in the model.

Variables Not in the Analysis

Step   Variable                                                    Tolerance   Min. Tolerance   F to Enter   Wilks' Lambda
0      Age                                                           1.000         1.000          1.798          .981
       Tasty food                                                    1.000         1.000          2.761          .971
       Value for money                                               1.000         1.000          3.878          .960
       Animal welfare                                                1.000         1.000          1.679          .982
       Please indicate your gross annual household income range      1.000         1.000          8.272          .919
1      Age                                                            .988          .988          1.507          .905
       Tasty food                                                     .991          .991          2.437          .896
       Value for money                                               1.000         1.000          3.863          .883
       Animal welfare                                                 .992          .992          1.052          .909
2      Age                                                            .987          .987          1.549          .868
       Tasty food                                                     .821          .821           .793          .875
       Animal welfare                                                 .992          .992          1.057          .873
