
COMPLETE

BUSINESS
STATISTICS
by
AMIR D. ACZEL
&
JAYAVEL SOUNDERPANDIAN
7th edition.
Prepared by Lloyd Jaisingh, Morehead State
University

Chapter 17

Multivariate Analysis
McGraw-Hill/Irwin

Copyright 2009 by The McGraw-Hill Companies, Inc. All


17 Multivariate Analysis

The Multivariate Normal Distribution


Discriminant Analysis
Principal Components and Factor Analysis
Using the Computer


17 LEARNING OBJECTIVES
After studying this chapter, you should be able to:

Describe a multivariate normal distribution


Explain when a discriminant analysis could be conducted
Interpret the results of a discriminant analysis
Explain when a factor analysis could be conducted
Differentiate between principal components and factors
Interpret factor analysis results

17-2 The Multivariate Normal Distribution

A k-dimensional (vector) random variable X:

X = (X1, X2, X3, ..., Xk)

A realization of a k-dimensional random variable X:

x = (x1, x2, x3, ..., xk)

A joint cumulative probability distribution function of a k-dimensional random variable X:

F(x1, x2, x3, ..., xk) = P(X1 ≤ x1, X2 ≤ x2, ..., Xk ≤ xk)


The Multivariate Normal Distribution


A multivariate normal random variable has the following probability density function:

$$f(x_1, x_2, \ldots, x_k) = \frac{1}{(2\pi)^{k/2}\,|\Sigma|^{1/2}}\; e^{-\frac{1}{2}(X-\mu)'\,\Sigma^{-1}\,(X-\mu)}$$

where X is the vector random variable, the term $\mu = (\mu_1, \mu_2, \ldots, \mu_k)$ is the vector of means of the component variables $X_i$, and $\Sigma$ is the variance-covariance matrix. The operations ' and -1 are transposition and inversion of matrices, respectively, and $|\cdot|$ denotes the determinant of a matrix.
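As a rough illustration, the density can be evaluated numerically. The sketch below assumes NumPy is available; the function name `mvn_pdf` and the example mean vector and covariance matrix are made up for illustration and are not taken from the text.

```python
import numpy as np

def mvn_pdf(x, mu, sigma):
    """Multivariate normal density at x (mu: mean vector, sigma: covariance matrix)."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    k = mu.size
    diff = x - mu
    # Quadratic form (x - mu)' Sigma^{-1} (x - mu), solved without forming the inverse
    quad = diff @ np.linalg.solve(sigma, diff)
    norm_const = (2.0 * np.pi) ** (k / 2.0) * np.sqrt(np.linalg.det(sigma))
    return float(np.exp(-0.5 * quad) / norm_const)

# Hypothetical bivariate example (values chosen only for illustration)
mu = [0.0, 0.0]
sigma = [[1.0, 0.5], [0.5, 2.0]]
density_at_mean = mvn_pdf([0.0, 0.0], mu, sigma)
```

At the mean the exponent vanishes, so the value reduces to the normalizing constant alone, which makes a convenient sanity check.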


Picturing the Bivariate Normal Distribution

[Figure: surface plot of the bivariate normal density f(x1, x2) over the (x1, x2) plane.]


17-3 Discriminant Analysis


In a discriminant analysis, observations are classified into two or more groups, depending on the value of a multivariate discriminant function.

As the figure illustrates, it may be easier to classify observations by looking at them from another direction. The groups appear more separated when viewed from a point perpendicular to Line L, rather than from a point perpendicular to the X1 or X2 axis. The discriminant function gives the direction that maximizes the separation between the groups.

[Figure: Group 1 and Group 2 plotted in the (X1, X2) plane, with Line L.]
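The idea of a direction that maximizes separation can be sketched numerically. The example below is an illustration only: it uses simulated data (not the textbook data), assumes NumPy, and uses Fisher's linear discriminant direction, which is the pooled within-group covariance matrix inverted and applied to the difference in group means. The helper `separation` is a hypothetical name.

```python
import numpy as np

rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.8], [0.8, 1.0]])
# Two simulated groups in the (X1, X2) plane (hypothetical data)
g1 = rng.multivariate_normal([0.0, 1.0], cov, size=200)
g2 = rng.multivariate_normal([1.0, 0.0], cov, size=200)

def separation(direction, a, b):
    """Distance between projected group means, in pooled standard-deviation units."""
    pa, pb = a @ direction, b @ direction
    pooled_var = ((len(pa) - 1) * pa.var(ddof=1) + (len(pb) - 1) * pb.var(ddof=1)) / (len(pa) + len(pb) - 2)
    return abs(pa.mean() - pb.mean()) / np.sqrt(pooled_var)

# Pooled within-group covariance matrix
s_w = ((len(g1) - 1) * np.cov(g1, rowvar=False)
       + (len(g2) - 1) * np.cov(g2, rowvar=False)) / (len(g1) + len(g2) - 2)
# Fisher's discriminant direction
w = np.linalg.solve(s_w, g1.mean(axis=0) - g2.mean(axis=0))

sep_discriminant = separation(w, g1, g2)
sep_x1 = separation(np.array([1.0, 0.0]), g1, g2)
sep_x2 = separation(np.array([0.0, 1.0]), g1, g2)
```

Comparing `sep_discriminant` with `sep_x1` and `sep_x2` shows how much is gained by projecting onto the discriminant direction rather than onto either original axis.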


The Discriminant Function


The form of the estimated prediction equation:
D = b0 + b1X1 + b2X2 + ... + bkXk
where the bi are the discriminant weights and b0 is a constant.

The intersection of the normal marginal distributions of the two groups gives the cutting score C, which is used to assign observations to groups. Observations with scores less than C are assigned to group 1, and observations with scores greater than C are assigned to group 2. Since the distributions may overlap, some observations may be misclassified.

The model may be evaluated in terms of the percentages of observations assigned correctly and incorrectly.

[Figure: overlapping normal curves for Group 1 and Group 2, with the cutting score C marked at their intersection.]
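A minimal sketch of the cutting-score rule follows. It assumes the two score distributions are normal with equal spread and equal prior probabilities, in which case the curves intersect midway between the group means. The means, standard deviation, and function names are hypothetical.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Hypothetical distributions of the discriminant score D in each group
mean_1, mean_2, sd = -1.0, 1.0, 1.0

# Equal spreads and equal priors: the densities intersect midway between the means
cutting_score = (mean_1 + mean_2) / 2.0

def assign_group(score, c=cutting_score):
    """Scores below C go to group 1, scores above C to group 2."""
    return 1 if score < c else 2

# Probability that a group 1 observation scores above C and is misclassified
p_misclassify_1 = 1.0 - normal_cdf((cutting_score - mean_1) / sd)
```

With these made-up values roughly 16% of group 1 would land on the wrong side of C, which is the overlap the slide refers to.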

Discriminant Analysis: Example 17-1 (Minitab)

Discriminant Analysis: Example 17-1 (Minitab, continued)

Example 17-1: Misclassified Observations (Minitab, continued)

Example 17-1: SPSS Output (1)


 1  0  set width 80
 2  data list free / assets income debt famsize job repay
 3  begin data
35  end data
36  discriminant groups = repay(0,1)
37    /variables assets income debt famsize job
38    /method = wilks
39    /fin = 1
40    /fout = 1
41    /plot
42    /statistics = all

Number of cases by group

           Number of cases
REPAY    Unweighted    Weighted    Label
0            14          14.0
1            18          18.0
Total        32          32.0

Example 17-1: SPSS Output (2)


- - - - - - - -   D I S C R I M I N A N T   A N A L Y S I S   - - - - - - - -

On groups defined by REPAY

Analysis number 1

Stepwise variable selection
  Selection rule: minimize Wilks' Lambda
  Maximum number of steps..................      10
  Minimum tolerance level..................  .00100
  Minimum F to enter....................... 1.00000
  Maximum F to remove...................... 1.00000

Canonical Discriminant Functions
  Maximum number of functions..............       1
  Minimum cumulative percent of variance...  100.00
  Maximum significance of Wilks' Lambda....  1.0000

Prior probability for each group is .50000

Example 17-1: SPSS Output (3)


---------------- Variables not in the Analysis after Step 0 ----------------

                         Minimum
Variable    Tolerance    Tolerance    F to Enter    Wilks' Lambda
ASSETS      1.0000000    1.0000000    6.6151550     .8193329
INCOME      1.0000000    1.0000000    3.0672181     .9072429
DEBT        1.0000000    1.0000000    5.2263180     .8516360
FAMSIZE     1.0000000    1.0000000    2.5291715     .9222491
JOB         1.0000000    1.0000000     .2445652     .9919137

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

At step 1, ASSETS was included in the analysis.

                              Degrees of Freedom    Signif.    Between Groups
Wilks' Lambda      .81933       1     1    30.0
Equivalent F      6.61516             1    30.0     .0153

Example 17-1: SPSS Output (4)


---------------- Variables in the Analysis after Step 1 ----------------

Variable    Tolerance    F to Remove    Wilks' Lambda
ASSETS      1.0000000    6.6152

---------------- Variables not in the Analysis after Step 1 ----------------

                         Minimum
Variable    Tolerance    Tolerance    F to Enter    Wilks' Lambda
INCOME      .5784563     .5784563      .0090821     .8190764
DEBT        .9706667     .9706667     6.0661878     .6775944
FAMSIZE     .9492947     .9492947     3.9269288     .7216177
JOB         .9631433     .9631433      .0000005     .8193329

At step 2, DEBT was included in the analysis.

                              Degrees of Freedom    Signif.    Between Groups
Wilks' Lambda      .67759       2     1    30.0
Equivalent F      6.89923             2    29.0     .0035

Example 17-1: SPSS Output (5)


---------------- Variables in the Analysis after Step 2 ----------------

Variable    Tolerance    F to Remove    Wilks' Lambda
ASSETS      .9706667     7.4487         .8516360
DEBT        .9706667     6.0662         .8193329

---------------- Variables not in the Analysis after Step 2 ----------------

                         Minimum
Variable    Tolerance    Tolerance    F to Enter    Wilks' Lambda
INCOME      .5728383     .5568120      .0175244     .6771706
FAMSIZE     .9323959     .9308959     2.2214373     .6277876
JOB         .9105435     .9105435      .2791429     .6709059

At step 3, FAMSIZE was included in the analysis.

                              Degrees of Freedom    Signif.    Between Groups
Wilks' Lambda      .62779       3     1    30.0
Equivalent F      5.53369             3    28.0     .0041

Example 17-1: SPSS Output (6)


---------------- Variables in the Analysis after Step 3 ----------------

Variable    Tolerance    F to Remove    Wilks' Lambda
ASSETS      .9308959     8.4282         .8167558
DEBT        .9533874     4.1849         .7216177
FAMSIZE     .9323959     2.2214         .6775944

---------------- Variables not in the Analysis after Step 3 ----------------

                         Minimum
Variable    Tolerance    Tolerance    F to Enter    Wilks' Lambda
INCOME      .5725772     .5410775     .0240984      .6272278
JOB         .8333526     .8333526     .0086952      .6275855

Summary Table

          Action           Vars    Wilks'
Step   Entered  Removed     in     Lambda     Sig.     Label
  1    ASSETS                1     .81933     .0153
  2    DEBT                  2     .67759     .0035
  3    FAMSIZE               3     .62779     .0041
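The "Equivalent F" values reported at each step of the stepwise output (6.61516, 6.89923, 5.53369) can be checked against the Wilks' Lambda values in the summary table. The sketch below uses the standard two-group relationship F = ((1 - Lambda) / Lambda) * ((n - p - 1) / p), with n = 32 cases and p variables in the model. This formula is not shown in the SPSS output itself and is brought in here as an assumption; the function name is hypothetical.

```python
def equivalent_f(wilks_lambda, n_cases, n_vars):
    """F statistic equivalent to Wilks' Lambda when there are exactly two groups."""
    df1 = n_vars
    df2 = n_cases - n_vars - 1
    f_stat = (1.0 - wilks_lambda) / wilks_lambda * df2 / df1
    return f_stat, df1, df2

# Wilks' Lambda after each step, copied from the summary table
steps = [("ASSETS", 0.81933), ("DEBT", 0.67759), ("FAMSIZE", 0.62779)]
results = [equivalent_f(lam, 32, p) for p, (_, lam) in enumerate(steps, start=1)]

for (name, lam), (f_stat, df1, df2) in zip(steps, results):
    print(f"{name:8s} lambda={lam:.5f}  F={f_stat:.4f}  df=({df1}, {df2})")
```

The computed values agree with the reported ones to about three decimals; the small differences come from the rounding of Lambda to five digits.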


Example 17-1: SPSS Output (7)


Classification function coefficients
(Fisher's linear discriminant functions)

REPAY =             0             1
ASSETS          .0018509      .0547891
DEBT            .0758239      .0113348
FAMSIZE        3.5833063     2.8570101
(Constant)    -7.7374079    -6.1008660

Unstandardized canonical discriminant function coefficients

                 Func 1
ASSETS         -.0352245
DEBT            .0429103
FAMSIZE         .4832695
(Constant)     -.9950070

Example 17-1: SPSS Output (8)


Case    Mis        Actual     Highest Probability       2nd Highest      Discrim
Number  Val  Sel   Group      Group  P(D/G)  P(G/D)     Group  P(G/D)    Scores
   1                 1          1    .1798   .9587        0    .0413    -1.9990
   2                 1          1    .3357   .9293        0    .0707    -1.6202
   3                 1          1    .8840   .7939        0    .2061     -.8034
   4                 1   **     0    .4761   .5146        1    .4854      .1328
   5                 1          1    .3368   .9291        0    .0709    -1.6181
   6                 1          1    .5571   .5614        0    .4386     -.0704
   7                 1   **     0    .6272   .5986        1    .4014      .3598
   8                 1          1    .7236   .6452        0    .3548     -.3039
...........................................................................
  20                 0          0    .1122   .9712        1    .0288     2.4338
  21                 0   **     1    .7395   .6524        0    .3476     -.3250
  22                 1   **     0    .9432   .7749        1    .2251      .9166
  23                 1          1    .7819   .6711        0    .3289     -.3807
  24                 0   **     1    .5294   .5459        0    .4541     -.0286
  25                 1          1    .5673   .8796        0    .1204    -1.2296
  26                 1          1    .1964   .9557        0    .0443    -1.9494
  27                 0   **     1    .6916   .6302        0    .3698     -.2608
  28                 1   **     0    .7479   .6562        1    .3438      .5240
  29                 1   **     0    .9211   .7822        1    .2178      .9445
  30                 1          1    .4276   .9107        0    .0893    -1.4509
  31                 1          1    .8188   .8136        0    .1864     -.8866
  32                 0   **     1    .8825   .7124        0    .2876     -.5097

(** marks a misclassified case.)

Example 17-1: SPSS Output (9)


Classification results -

                   No. of    Predicted Group Membership
Actual Group       Cases          0             1
---------------    ------    -----------   -----------
Group 0              14          10             4
                                71.4%         28.6%

Group 1              18           5            13
                                27.8%         72.2%

Percent of "grouped" cases correctly classified: 71.88%
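The percentages in the classification table follow directly from the four counts. A minimal sketch of the arithmetic, using only the counts shown above (the variable names are illustrative):

```python
# Rows: actual group (0, 1); columns: predicted group (0, 1)
confusion = [[10, 4],
             [5, 13]]

total = sum(sum(row) for row in confusion)
correct = sum(confusion[i][i] for i in range(len(confusion)))

# Overall percentage of cases classified correctly
hit_rate = 100.0 * correct / total

# Row percentages: share of each actual group assigned to each predicted group
row_pct = [[100.0 * n / sum(row) for n in row] for row in confusion]
```

Here 23 of the 32 cases lie on the diagonal, which gives the 71.88% reported by SPSS after rounding.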
Percent


Example 17-1: SPSS Output (10)


All-groups Stacked Histogram
Canonical Discriminant Function 1

[Figure: text-mode stacked histogram of the discriminant scores. Vertical axis: Frequency (0 to 4). Horizontal axis: discriminant score, with ticks at out, -2.0, -1.0, .0, 1.0, 2.0, out. Bars are built from the group symbols 1 and 2. Below the axis, the Class row labels the lower-score region 2 and the higher-score region 1, and the Centroids row marks the two group centroids.]

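17-4 Principal Components and Factor Analysis

[Figure: data plotted against an axis labeled y, with the First Component and Second Component directions marked. An accompanying chart shows the Total Variance and the Variance Remaining After Extraction of the First, Second, and Third Component.]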
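The figure's idea, that each extracted component removes part of the total variance, can be sketched with an eigen-decomposition of the covariance matrix. This is an illustration on simulated data, assuming NumPy; it is not the textbook's computation.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical data: 300 observations on 3 correlated variables
latent = rng.normal(size=(300, 1))
data = latent @ np.array([[2.0, 1.0, 0.5]]) + rng.normal(scale=0.5, size=(300, 3))

cov = np.cov(data, rowvar=False)
# eigh returns eigenvalues in ascending order for a symmetric matrix
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]          # largest variance first
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

total_variance = np.trace(cov)
# Variance remaining after extracting the first, second, and third component
remaining = total_variance - np.cumsum(eigenvalues)
explained_share = eigenvalues / total_variance
```

The columns of `eigenvectors` are the component directions, and `remaining` drops to zero once all three components have been extracted.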


Factor Analysis
The k original Xi variables written as linear combinations of a smaller set of m common factors and a unique component for each variable:

X1 = b11 F1 + b12 F2 + ... + b1m Fm + U1
X2 = b21 F1 + b22 F2 + ... + b2m Fm + U2
...
Xk = bk1 F1 + bk2 F2 + ... + bkm Fm + Uk

The Fj are the common factors. Each Ui is the unique component of variable Xi. The coefficients bij are called the factor loadings.

Total variance in the data is decomposed into the communality (the common-factor component) and the specific part.
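The decomposition can be made concrete with a small numerical sketch. It assumes the common factors are uncorrelated with unit variance and the unique components are uncorrelated with the factors and with each other, so that the implied covariance matrix is B B' plus a diagonal matrix of unique variances. The loadings and unique variances below are hypothetical, and NumPy is assumed.

```python
import numpy as np

# Hypothetical loadings b_ij for k = 4 variables on m = 2 common factors
loadings = np.array([[0.8, 0.1],
                     [0.7, 0.2],
                     [0.2, 0.9],
                     [0.1, 0.6]])
# Hypothetical variances of the unique components U_i
unique_var = np.array([0.35, 0.47, 0.15, 0.63])

# Communality of each variable: sum of its squared loadings
communality = (loadings ** 2).sum(axis=1)

# Covariance matrix implied by the factor model under the stated assumptions
implied_cov = loadings @ loadings.T + np.diag(unique_var)
total_var = np.diag(implied_cov)
```

Each diagonal entry of `implied_cov` equals the communality plus the unique variance of that variable, which is the split described above.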


Rotation of Factors

[Figure: two panels, Orthogonal Rotation and Oblique Rotation. Each panel shows the original Factor 1 and Factor 2 axes together with the Rotated Factor 1 and Rotated Factor 2 axes.]
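A short sketch of an orthogonal rotation follows, assuming NumPy. The loadings and the 40-degree angle are hypothetical; the point is only that rotating the factor axes changes the individual loadings while leaving each variable's communality unchanged.

```python
import numpy as np

# Hypothetical unrotated loadings of 4 variables on 2 factors
loadings = np.array([[0.7, 0.5],
                     [0.6, 0.6],
                     [0.6, -0.5],
                     [0.7, -0.4]])

theta = np.deg2rad(40.0)  # arbitrary rotation angle chosen for illustration
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

# Loadings expressed on the rotated (still perpendicular) factor axes
rotated = loadings @ rotation

communality_before = (loadings ** 2).sum(axis=1)
communality_after = (rotated ** 2).sum(axis=1)
```

An oblique rotation would use a non-orthogonal transformation instead, so the rotated factors would no longer be uncorrelated; that case is not covered by this sketch.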


Factor Analysis of Satisfaction Items


                             Factor Loadings
Satisfaction with:        1       2       3       4     Communality

Information
   1                    0.87    0.19    0.13    0.22      0.8583
   2                    0.88    0.14    0.15    0.13      0.8334
   3                    0.92    0.09    0.11    0.12      0.8810
   4                    0.65    0.29    0.31    0.15      0.6252

Variety
   5                    0.13    0.82    0.07    0.17      0.7231
   6                    0.17    0.59    0.45    0.14      0.5991
   7                    0.18    0.48    0.32    0.22      0.4136
   8                    0.11    0.75    0.02    0.12      0.5894
   9                    0.17    0.62    0.46    0.12      0.6393
  10                    0.20    0.62    0.47    0.06      0.6489

Closure
  11                    0.17    0.21    0.76    0.11      0.6627
  12                    0.12    0.10    0.71    0.12      0.5429

Pay
  13                    0.17    0.14    0.05    0.51      0.3111
  14                    0.10    0.11    0.15    0.66      0.4802
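The communality column can be reproduced from the loadings: each communality is the sum of the squared loadings in its row. The sketch below uses the values from the table; the variable names are illustrative.

```python
# Factor loadings (factors 1 to 4) for the 14 satisfaction items, from the table
loadings = {
    1: (0.87, 0.19, 0.13, 0.22), 2: (0.88, 0.14, 0.15, 0.13),
    3: (0.92, 0.09, 0.11, 0.12), 4: (0.65, 0.29, 0.31, 0.15),
    5: (0.13, 0.82, 0.07, 0.17), 6: (0.17, 0.59, 0.45, 0.14),
    7: (0.18, 0.48, 0.32, 0.22), 8: (0.11, 0.75, 0.02, 0.12),
    9: (0.17, 0.62, 0.46, 0.12), 10: (0.20, 0.62, 0.47, 0.06),
    11: (0.17, 0.21, 0.76, 0.11), 12: (0.12, 0.10, 0.71, 0.12),
    13: (0.17, 0.14, 0.05, 0.51), 14: (0.10, 0.11, 0.15, 0.66),
}
# Communalities as reported in the table
reported = {1: 0.8583, 2: 0.8334, 3: 0.8810, 4: 0.6252, 5: 0.7231, 6: 0.5991,
            7: 0.4136, 8: 0.5894, 9: 0.6393, 10: 0.6489, 11: 0.6627, 12: 0.5429,
            13: 0.3111, 14: 0.4802}

# Communality = sum of squared loadings across the four factors
computed = {item: sum(b * b for b in row) for item, row in loadings.items()}
max_gap = max(abs(computed[i] - reported[i]) for i in loadings)

# Factor (1 to 4) on which each item loads most heavily
dominant = {item: 1 + max(range(4), key=lambda j: abs(row[j]))
            for item, row in loadings.items()}
```

The dominant factor for each item lines up with the four groupings in the table: Information on factor 1, Variety on factor 2, Closure on factor 3, and Pay on factor 4.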
