Chap11 Discriminant Analysis

Discriminant analysis
Chapter 11
Statistics for Marketing & Consumer Research

Copyright 2008 - Mario Mazzocchi
2-groups discriminant analysis

Discriminant analysis (DA) is a statistical procedure which classifies cases into the separate categories to which they belong, on the basis of a set of characteristic independent variables called predictors or discriminant variables.
The target variable (the one determining allocation into groups) is qualitative (nominal or ordinal), while the characteristics are measured by quantitative variables.
DA looks at the discrimination between two groups.
Multiple discriminant analysis (MDA) allows for classification into three or more groups.
Applications of DA
DA is especially useful for understanding the differences between consumers and the factors leading them to make different choices, so that marketers can develop strategies which take proper account of the role of the predictors.
Examples
Determinants of customer loyalty
Shopper profiling and segmentation
Determinants of purchase and non-purchase
Example on the Trust data-set

Purchasers of chicken at the butcher's shop (recorded in question q8d)
Respondents may belong to one of two groups:
  those who purchase chicken at the butcher's shop
  those who do not
Discrimination between these groups through a set of consumer characteristics:
  expenditure on chicken in a standard week (q5)
  age of the respondent (q51)
  whether respondents agree (on a seven-point ranking scale) that butchers sell safe chicken (q21d)
  trust (on a seven-point ranking scale) towards supermarkets (q43b)
Does a linear combination of these four characteristics allow one to discriminate between those who buy chicken at the butcher's and those who do not?
Discriminant analysis (DA)
Two groups only, thus a single discriminating value (discriminating score)
For each respondent a score is computed using the estimated linear combination of the predictors (the discriminant function)
Respondents with a score above the discriminating value are expected to belong to one group, those below to the other group.
When the discriminant score is standardized to have zero mean and unit variance it is called the Z score.
DA also provides information about the discriminating power of each of the original predictors
Multiple discriminant analysis (MDA) (1)
Discriminant analysis may involve more than two groups, in which case it is termed multiple discriminant analysis (MDA).
Example from the Trust data-set
Dependent variable: type of chicken purchased in a typical week, choosing among four categories: value (good value for money), standard, organic and luxury
Predictors: age (q50), stated relevance of taste (q24a), value for money (q24b) and animal welfare (q24k), plus an indicator of income (q60)

Multiple discriminant analysis (MDA) (2)
In this case there will be more than one discriminant function.
The number of discriminant functions is equal to the smaller of (g-1), where g is the number of categories in the classification, and k, the number of independent variables.
Trust example: with four groups and five explanatory variables, the number of discriminant functions is three (that is g-1, which is smaller than k=5).
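The counting rule above can be written as a one-line function. This is only a sketch; the function name is made up for illustration and is not part of SPSS or any library.

```python
def n_discriminant_functions(g: int, k: int) -> int:
    """Number of discriminant functions: the smaller of (g - 1) and k.

    g = number of groups, k = number of predictors (hypothetical helper).
    """
    if g < 2 or k < 1:
        raise ValueError("need at least two groups and one predictor")
    return min(g - 1, k)


# Trust example: four types of chicken, five predictors
print(n_discriminant_functions(g=4, k=5))  # 3
# Two-group case (butcher yes/no), four predictors
print(n_discriminant_functions(g=2, k=4))  # 1
```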
The output of MDA
Similarities with factor (principal component) analysis:
  the first discriminant function is the most relevant for discriminating across groups, the second is the second most relevant, etc.
  the discriminant functions are also independent, which means that the resulting scores are uncorrelated.
Once the coefficients of the discriminant functions are estimated and standardized, they are interpreted in a similar fashion to factor loadings.
The larger the standardized coefficients (in absolute terms), the more relevant the respective variables are to discriminating between groups
There is no single discriminant score in MDA:
  group means (centroids) are computed for each of the discriminant functions to give a clearer view of the classification rule
Running discriminant analysis (two groups)
Discriminant function (target variable: purchasers of chicken at the butcher's shop)
$z = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3 + \alpha_4 x_4$
$z$: discriminant score
Predictors:
  $x_1$: weekly expenditure on chicken
  $x_2$: age
  $x_3$: safety of butcher's chicken
  $x_4$: trust in supermarkets
The discriminant coefficients $\alpha_0, \dots, \alpha_4$ need to be estimated

Fisher's linear discriminant analysis
The discriminant function is the starting point
Two key assumptions behind linear DA:
(a) the predictors are normally distributed;
(b) the covariance matrices for the predictors within each of the groups are equal.
Departure from condition (a) should suggest use of alternative methods (logistic regression, see lecture 16)
Departure from condition (b) requires the use of different discriminant techniques (usually quadratic discriminant functions).
In most empirical cases, the use of linear DA is appropriate
Estimation
The first step is the estimation of the coefficients, also termed discriminant coefficients or weights
Estimation is similar to factor analysis or PCA, as the coefficients are those which maximize the variability between groups (relative to the variability within groups)
In MDA the first discriminating function is the one with the highest between-group variability, the second discriminating function is independent of the first and maximizes the remaining between-group variability, and so on
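As an illustration of the estimation step for two groups, the sketch below computes Fisher's discriminant weights with NumPy. It runs on synthetic data (not the Trust survey), the function name is hypothetical, and the weights are left unscaled, so they will not match SPSS's canonical coefficients, which use a different normalization.

```python
import numpy as np


def fisher_weights(X, y):
    """Fisher's discriminant weights for two groups coded 0 and 1."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-group scatter (relies on equal covariance matrices).
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    # Direction maximizing between-group relative to within-group variability.
    return np.linalg.solve(Sw, m1 - m0)


rng = np.random.default_rng(0)       # synthetic data, for illustration only
y = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 2))
X[:, 0] += 2.0 * y                   # only the first predictor separates the groups

w = fisher_weights(X, y)
scores = X @ w                       # discriminant scores (up to scale and constant)
print(w)
print(scores[y == 1].mean() - scores[y == 0].mean())
```

The first weight should dominate the second, reflecting that only the first synthetic predictor differs between the groups.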
SPSS two groups case
1. Choose the target variable
2. Define the range of the dependent variable
3. Select the predictors
Coefficient estimates
Additional statistics and diagnostics
Fisher's and standardized estimates of the discriminant function coefficients need to be requested
Classification options
Decide whether prior probabilities are equal across groups or group sizes reflect different allocation probabilities
These are diagnostic indicators to evaluate how well the discriminant functions predict the groups
Save classification
Create new variables in the data-set, containing the predicted group membership and/or the discriminant score for each case and each function
Output coefficient estimates

Canonical Discriminant Function Coefficients (unstandardized)
                                                                             Function 1
In a typical week how much do you spend on fresh or frozen chicken (Euro)?       .095
From the butcher                                                                 .454
Supermarkets                                                                    -.297
Age                                                                              .025
(Constant)                                                                     -2.515

Standardized Canonical Discriminant Function Coefficients
                                                                             Function 1
In a typical week how much do you spend on fresh or frozen chicken (Euro)?       .378
From the butcher                                                                 .748
Supermarkets                                                                    -.453
Age                                                                              .394

Unstandardized coefficients depend on the measurement unit
Standardized coefficients do not depend on the measurement unit
Most important predictor: "From the butcher" (.748)
Trust in supermarkets has a negative sign (thus it reduces the discriminant score)
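To make the use of the unstandardized coefficients concrete, the sketch below computes the discriminant score for one respondent. The coefficients are those in the output above; the respondent's answers and the dictionary keys are hypothetical, and the mapping of the output labels "From the butcher" and "Supermarkets" to questions q21d and q43b is an assumption based on the predictor list.

```python
# Unstandardized coefficients taken from the SPSS output above.
CONSTANT = -2.515
COEFFICIENTS = {
    "expenditure": 0.095,          # weekly spend on chicken (Euro)
    "butcher_safe": 0.454,         # "From the butcher" (assumed to be q21d)
    "trust_supermarkets": -0.297,  # "Supermarkets" (assumed to be q43b)
    "age": 0.025,
}


def discriminant_score(respondent: dict) -> float:
    """Linear combination of the predictors plus the constant."""
    return CONSTANT + sum(
        COEFFICIENTS[name] * respondent[name] for name in COEFFICIENTS
    )


# Hypothetical respondent, not a case from the Trust data-set.
respondent = {"expenditure": 10, "butcher_safe": 5, "trust_supermarkets": 4, "age": 40}
z = discriminant_score(respondent)
print(round(z, 3))  # 0.517
```

A higher trust in supermarkets lowers the score, while the other three predictors raise it, as the signs of the coefficients indicate.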
Centroids

Functions at Group Centroids
Butcher    Function 1
no           -.307
yes           .594
Unstandardized canonical discriminant functions evaluated at group means

These are the means of the discriminant score for each of the two groups
Thus, the group of those not purchasing chicken at the butcher's shop has a negative centroid
With two groups, the discriminating score is zero
This can be computed by weighting the centroids with the initial probabilities

Prior Probabilities for Groups
                      Cases Used in Analysis
Butcher    Prior     Unweighted    Weighted
no          .660        277         277.000
yes         .340        143         143.000
Total      1.000        420         420.000

From these prior probabilities it follows that the discriminating score is -0.307 x 0.66 + 0.594 x 0.34 ≈ 0
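The weighting of the centroids by the prior probabilities can be checked directly. This is a small arithmetic sketch using the rounded figures from the two tables, so the result is close to zero rather than exactly zero.

```python
# Centroids and prior probabilities as reported in the output above.
centroids = {"no": -0.307, "yes": 0.594}
priors = {"no": 0.660, "yes": 0.340}

# Prior-weighted mean of the group centroids.
weighted_mean = sum(priors[group] * centroids[group] for group in centroids)
print(weighted_mean)  # close to zero; the small residual is due to rounding
```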
Output classification success

Classification Results (a)
                                        Predicted Group Membership
                    Butcher               no        yes       Total
Original   Count    no                   244         33         277
                    yes                   88         55         143
                    Ungrouped cases        1          1           2
           %        no                  88.1       11.9       100.0
                    yes                 61.5       38.5       100.0
                    Ungrouped cases     50.0       50.0       100.0
a. 71.2% of original grouped cases correctly classified.

Using the discriminant function, it is possible to correctly classify 71.2% of original cases: (244 no-no + 55 yes-yes)/420
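The percentage of correct classifications can be reproduced from the counts in the table. The sketch below uses only the grouped cases, as in the SPSS footnote; the variable names are illustrative.

```python
# Rows: actual group; columns: predicted group (grouped cases only).
confusion = {
    "no": {"no": 244, "yes": 33},
    "yes": {"no": 88, "yes": 55},
}

total = sum(sum(row.values()) for row in confusion.values())
correct = sum(confusion[group][group] for group in confusion)
hit_rate = correct / total

# Share of each actual group that is classified correctly.
per_group = {
    group: confusion[group][group] / sum(confusion[group].values())
    for group in confusion
}

print(round(100 * hit_rate, 1))                                # 71.2
print({g: round(100 * r, 1) for g, r in per_group.items()})    # {'no': 88.1, 'yes': 38.5}
```

The per-group rates show that the overall figure is driven by the "no" group: fewer than four in ten actual butcher purchasers are classified correctly.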
Diagnostics (1)
Box's M test. This tests whether covariances are equal across groups
Wilks' Lambda (or U statistic) tests discrimination between groups. It is related to analysis of variance.
  Individual Wilks' Lambda for each of the predictors in a discriminant function: univariate ANOVA (are there significant differences in the predictors' means between the groups?), p-value from the F distribution.
  Wilks' Lambda for the function as a whole: are there significant differences in the group means for the discriminant function? P-value from the Chi-square distribution.
The overall Wilks' Lambda is especially helpful in multiple discriminant analysis as it allows one to discard those functions which do not contribute towards explaining differences between groups.
Diagnostics (2)
DA returns one eigenvalue of the discriminant function (MDA returns one eigenvalue for each function).
These can be interpreted as in principal component analysis
In MDA (more than one discriminant function) eigenvalues are used to compute how much each function contributes to explaining variability
The canonical correlation measures the intensity of the relationship between the groups and the single discriminant function
Trust example: diagnostics

                              Statistic    P-value
Box's M statistic               37.3        0.000
Overall Wilks' Lambda           0.85        0.000
Wilks' Lambda for:
  Expenditure                   0.98        0.002
  Age                           0.97        0.001
  Safer for Butcher             0.91        0.000
  Trust in Supermarket          0.98        0.002
Eigenvalue                      0.18
Canonical correlation           0.39
% of correct predictions        71.2%

Box's M: covariance matrices are not equal
Overall Wilks' Lambda: the overall discriminating power of the DF is good
Individual Wilks' Lambdas: all of the predictors are relevant to discriminating between the two groups
Eigenvalue: the ratio between the variance between groups and the variance within groups (the larger the better)
Canonical correlation: square root of the ratio between the variability between groups and the total variability
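The definitions above tie the three summary statistics together when there is a single discriminant function. If the eigenvalue is the between/within ratio and total variability is between plus within, then the canonical correlation is sqrt(eigenvalue / (1 + eigenvalue)) and Wilks' Lambda is 1 / (1 + eigenvalue). The sketch below checks this against the rounded figures in the table.

```python
import math

eigenvalue = 0.18  # between-group / within-group variability, from the table

# Between / total = eigenvalue / (1 + eigenvalue); canonical correlation is its square root.
canonical_correlation = math.sqrt(eigenvalue / (1 + eigenvalue))

# Within / total = 1 / (1 + eigenvalue) for a single discriminant function.
wilks_lambda = 1 / (1 + eigenvalue)

print(round(canonical_correlation, 2))  # 0.39
print(round(wilks_lambda, 2))           # 0.85
```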
MDA
To run MDA in SPSS the only difference is that the range has more than two categories
Predictors

Tests of Equality of Group Means
                                                           Wilks' Lambda      F      df1    df2    Sig.
Age                                                            .981         1.798     3     282    .148
Tasty food                                                     .971         2.761     3     282    .042
Value for money                                                .960         3.878     3     282    .010
Animal welfare                                                 .982         1.679     3     282    .172
Please indicate your gross annual household income range       .919         8.272     3     282    .000

Three predictors only appear to be relevant in discriminating among preferred types of chicken

Test Results
Box's M           65.212
F    Approx.       1.382
     df1              45
     df2       53286.386
     Sig.           .045
Tests null hypothesis of equal population covariance matrices.

Null rejected at 95% c.l., but not at 99% c.l.
Discriminant functions
Three discriminant functions (four groups minus one) can be estimated

Eigenvalues
Function    Eigenvalue    % of Variance    Cumulative %    Canonical Correlation
1            .102 (a)         61.0             61.0                .304
2            .051 (a)         30.8             91.8                .221
3            .014 (a)          8.2            100.0                .116
a. First 3 canonical discriminant functions were used in the analysis.

Wilks' Lambda
Test of Function(s)    Wilks' Lambda    Chi-square    df    Sig.
1 through 3                .851           45.098      15    .000
2 through 3                .938           17.904       8    .022
3                          .986            3.818       3    .282

The first two discriminant functions have a significant discriminating power.
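The two tables above are linked: the Wilks' Lambda for a set of functions is the product of 1 / (1 + eigenvalue) over the functions in that set, and the share of variance for each function is its eigenvalue over the sum of eigenvalues. A short sketch with the rounded eigenvalues from the output (small discrepancies with the printed figures are due to that rounding):

```python
eigenvalues = [0.102, 0.051, 0.014]  # functions 1, 2, 3, from the output above


def wilks_from(first: int) -> float:
    """Wilks' Lambda for functions `first` through the last (0-based index)."""
    lam = 1.0
    for ev in eigenvalues[first:]:
        lam /= 1 + ev
    return lam


shares = [100 * ev / sum(eigenvalues) for ev in eigenvalues]

for first in range(len(eigenvalues)):
    print(f"functions {first + 1} through 3: Lambda = {wilks_from(first):.3f}")
print([round(s, 1) for s in shares])
```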
Coefficients
Value for money is very relevant for the second function
Income is very relevant for the first function
Structure matrix

Structure Matrix
                                                                   Function
                                                              1         2         3
Please indicate your gross annual household income range     .929*    -.021      .078
Animal welfare                                               .390*    -.206      .125
Value for money                                             -.010      .891*     .168
Tasty food                                                   .241      .660*     .273
Age                                                         -.217     -.204      .944*
Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions
Variables ordered by absolute size of correlation within function.
*. Largest absolute correlation between each variable and any discriminant function

Function 1: income; function 2: value and taste; function 3: age
The values in the structure matrix are the correlations between the individual predictors and the scores computed on the discriminant functions.
For example, the income variable has a strong correlation with the scores of the first function
The structure matrix helps in interpreting the functions
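The starred entries in the structure matrix can be recovered by picking, for each predictor, the function with the largest absolute correlation. The sketch below uses the values as read from the table above, with shortened labels.

```python
# Correlations with functions 1, 2, 3, as read from the structure matrix.
structure = {
    "Income": [0.929, -0.021, 0.078],
    "Animal welfare": [0.390, -0.206, 0.125],
    "Value for money": [-0.010, 0.891, 0.168],
    "Tasty food": [0.241, 0.660, 0.273],
    "Age": [-0.217, -0.204, 0.944],
}

# For each predictor, the function (1-based) with the largest absolute correlation.
dominant = {
    name: max(range(len(corr)), key=lambda i: abs(corr[i])) + 1
    for name, corr in structure.items()
}
print(dominant)
```

Grouping predictors by their dominant function gives the labels used on the slide: income for function 1, value and taste for function 2, age for function 3.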
Centroids

Functions at Group Centroids
In a typical week, what type of fresh or frozen chicken              Function
do you buy for your household's home consumption?                1        2        3
'Value' chicken                                                -.673    -.262    -.040
'Standard' chicken                                              .058     .156    -.065
'Organic' chicken                                               .525    -.470    -.030
'Luxury' chicken                                                .003     .052     .242
Unstandardized canonical discriminant functions evaluated at group means

The first function discriminates well between value and organic (income matters to organic buyers)
The second allows some discrimination standard-organic, value-standard, organic-luxury (taste and value matter)
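One simple way to see which groups each function separates is to compare the centroids pairwise. The sketch below finds the pair of groups with the widest centroid gap on each function, using the table above. It ignores within-group spread, so it is a rough reading aid rather than a test.

```python
from itertools import combinations

# Group centroids on functions 1, 2, 3, from the table above.
centroids = {
    "Value": (-0.673, -0.262, -0.040),
    "Standard": (0.058, 0.156, -0.065),
    "Organic": (0.525, -0.470, -0.030),
    "Luxury": (0.003, 0.052, 0.242),
}


def widest_gap(function_index: int):
    """Pair of groups whose centroids are furthest apart on one function."""
    return max(
        combinations(centroids, 2),
        key=lambda pair: abs(
            centroids[pair[0]][function_index] - centroids[pair[1]][function_index]
        ),
    )


for i in range(3):
    print(f"function {i + 1}: {widest_gap(i)}")
```

On function 1 the widest gap is between value and organic, and on function 2 between standard and organic, in line with the comments on the slide.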
Plot of two functions
Tick separate-groups to show graphs of the first two functions for each individual group
The territorial map shows the scores for the first two functions considering all groups
Plots: individual groups
Example: organic chicken
Most cases tend to be relatively high on function 1 (income)
Plots: all groups
Prediction results

Classification Results (a)
In a typical week, what type of fresh or frozen chicken do you buy for your household's home consumption?
                                             Predicted Group Membership
                                      'Value'   'Standard'   'Organic'   'Luxury'
                                      chicken    chicken      chicken     chicken     Total
Original   Count   'Value' chicken        3         38           0           0          41
                   'Standard' chicken     2        154           1           0         157
                   'Organic' chicken      1         30           4           0          35
                   'Luxury' chicken       1         51           1           0          53
                   Ungrouped cases        0         51           3           0          54
           %       'Value' chicken      7.3       92.7          .0          .0       100.0
                   'Standard' chicken   1.3       98.1          .6          .0       100.0
                   'Organic' chicken    2.9       85.7        11.4          .0       100.0
                   'Luxury' chicken     1.9       96.2         1.9          .0       100.0
                   Ungrouped cases       .0       94.4         5.6          .0       100.0
a. 56.3% of original grouped cases correctly classified.

The functions do not predict well: most units are allocated to standard chicken, and on average only 56.3% of the cases are allocated correctly
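The comment above can be quantified from the counts in the table. The sketch below computes the hit rate, the number of grouped cases predicted as standard, and, as a point of comparison that is not part of the SPSS output, the share of cases that actually belong to the largest group.

```python
groups = ["Value", "Standard", "Organic", "Luxury"]
# Rows: actual group; columns: predicted group (grouped cases only).
counts = [
    [3, 38, 0, 0],
    [2, 154, 1, 0],
    [1, 30, 4, 0],
    [1, 51, 1, 0],
]

total = sum(sum(row) for row in counts)
correct = sum(counts[i][i] for i in range(len(groups)))
hit_rate = correct / total

standard = groups.index("Standard")
predicted_standard = sum(row[standard] for row in counts)
# Share obtained by allocating every case to the largest group.
majority_share = max(sum(row) for row in counts) / total

print(round(100 * hit_rate, 1))        # 56.3
print(predicted_standard, total)       # 273 286
print(round(100 * majority_share, 1))  # 54.9
```

Almost all grouped cases are predicted as standard, so the hit rate is only slightly above what allocating everyone to the largest group would achieve.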
Stepwise discriminant analysis
As for linear regression, it is possible to decide whether all predictors should appear in the equation regardless of their role in discriminating (the Enter option) or whether a subset of predictors is chosen on the basis of their contribution to discriminating between groups (the Stepwise method)
The step-wise method
1. A one-way ANOVA test is run on each of the predictors, where the target grouping variable determines the treatment levels. The ANOVA test provides a criterion value and test statistic (usually the Wilks' Lambda). According to the criterion value, it is possible to identify the predictor which is most relevant in discriminating between the groups
2. The predictor with the lowest Wilks' Lambda (or which meets an alternative optimality criterion) enters the discriminating function, provided the p-value is below the set threshold (for example 5%).
3. An ANCOVA test is run on the remaining predictors, where the covariates are the target grouping variables and the predictors that have already entered the model. The Wilks' Lambda is computed for each of the ANCOVA options.
4. Again, the criterion and the p-value determine which variable (if any) enters the discriminating function (and possibly whether some of the entered variables should leave the model).
5. The procedure goes back to step 3 and continues until none of the excluded variables has a p-value below the threshold and none of the entered variables has a p-value above the threshold (the stopping rule is met).
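The steps above can be sketched in code under some simplifying assumptions. This is not SPSS's implementation. It only adds variables (the removal check in step 4 is omitted), it uses an F-to-enter threshold instead of a p-value, the threshold of 3.84 is an illustrative choice, the function names are hypothetical, and the data are synthetic.

```python
import numpy as np


def wilks_lambda(X, y):
    """Wilks' Lambda = det(W) / det(T) for the columns of X."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    T = centered.T @ centered                  # total sums of squares and cross-products
    W = np.zeros_like(T)
    for group in np.unique(y):
        dev = X[y == group] - X[y == group].mean(axis=0)
        W += dev.T @ dev                       # within-group part
    return np.linalg.det(W) / np.linalg.det(T)


def forward_stepwise(X, y, f_to_enter=3.84):
    """Forward selection of predictors by Wilks' Lambda (no removal step)."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    n, k = X.shape
    g = len(np.unique(y))
    selected, current = [], 1.0
    while len(selected) < k:
        p = len(selected)
        best = None
        for j in range(k):
            if j in selected:
                continue
            lam = wilks_lambda(X[:, selected + [j]], y)
            partial = lam / current            # Lambda of j given the entered variables
            f = (n - g - p) / (g - 1) * (1 - partial) / partial
            if best is None or f > best[1]:
                best = (j, f, lam)
        if best[1] < f_to_enter:               # stopping rule: no candidate qualifies
            break
        selected.append(best[0])
        current = best[2]
    return selected


rng = np.random.default_rng(0)                 # synthetic data, for illustration only
y = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 3))
X[:, 0] += 3.0 * y                             # only predictor 0 separates the groups
print(forward_stepwise(X, y))
```

The predictor that truly separates the groups enters first; whether a noise predictor also enters depends on the threshold and on chance.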
Alternative criteria
Unexplained variance
Smallest F ratio
Mahalanobis distance
Rao's V
In SPSS
The step-wise method allows selection of relevant predictors
Output of the step-wise method

Variables in the Analysis
Step    Variable            Tolerance    F to Remove    Wilks' Lambda
1       income range          1.000         8.272
2       income range          1.000         8.241           .960
        Value for money       1.000         3.863           .919

Only two predictors are kept in the model

Variables Not in the Analysis
Step    Variable            Tolerance    Min. Tolerance    F to Enter    Wilks' Lambda
0       Age                   1.000          1.000            1.798          .981
        Tasty food            1.000          1.000            2.761          .971
        Value for money       1.000          1.000            3.878          .960
        Animal welfare        1.000          1.000            1.679          .982
        income range          1.000          1.000            8.272          .919
1       Age                    .988           .988            1.507          .905
        Tasty food             .991           .991            2.437          .896
        Value for money       1.000          1.000            3.863          .883
        Animal welfare         .992           .992            1.052          .909
2       Age                    .987           .987            1.549          .868
        Tasty food             .821           .821             .793          .875
        Animal welfare         .992           .992            1.057          .873
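The F-to-enter values in the output can be approximately reproduced from the Wilks' Lambdas. The sketch below uses the usual partial F formula for Wilks' Lambda and assumes n = 286 grouped cases and g = 4 groups, as in the classification table. Because the Lambdas are rounded to three decimals, the results only approximate the printed figures.

```python
def f_to_enter(lambda_before: float, lambda_after: float, n: int, g: int, p: int) -> float:
    """Partial F for adding one predictor when p predictors are already in the model."""
    partial = lambda_after / lambda_before
    return (n - g - p) / (g - 1) * (1 - partial) / partial


n, g = 286, 4  # grouped cases and number of groups (assumed from the tables)

# Step 0: income enters an empty model (Lambda goes from 1 to .919).
f_income = f_to_enter(1.0, 0.919, n, g, p=0)
# Step 1: value for money enters after income (Lambda goes from .919 to .883).
f_value = f_to_enter(0.919, 0.883, n, g, p=1)

print(round(f_income, 2), round(f_value, 2))  # close to the tabulated 8.272 and 3.863
```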
