
Principal Components Analysis with PASW

Karl L. Wuensch, Dept. of Psychology, East Carolina University

When to Use PCA


You have a set of p continuous variables. You want to repackage their variance into m components. You will usually want m to be < p, but not always.

Components and Variables


Each component is a weighted linear combination of the variables

C_i = W_{i1} X_1 + W_{i2} X_2 + \dots + W_{ip} X_p
Each variable is a weighted linear combination of the components.

X_j = A_{1j} C_1 + A_{2j} C_2 + \dots + A_{mj} C_m
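To make the first equation concrete, here is a minimal numpy sketch (not PASW output); the data and weights are hypothetical stand-ins:

    import numpy as np

    # Component scores as weighted linear combinations of the variables.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 7))   # hypothetical standardized scores: n = 100, p = 7
    W = rng.standard_normal((7, 2))     # hypothetical weights for m = 2 components

    C = X @ W                           # C[i, k] = W[1k]*X[i1] + ... + W[pk]*X[ip]
    print(C.shape)                      # (100, 2): one score per case on each component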

Factors and Variables


In Factor Analysis, we exclude from the solution any variance that is unique, not shared by the variables.

X_j = A_{1j} F_1 + A_{2j} F_2 + \dots + A_{mj} F_m + U_j
U_j is the unique variance for X_j.

Goals of PCA and FA


Data reduction. Discover and summarize the pattern of intercorrelations among variables. Test theory about the latent variables underlying a set of measurement variables. Construct a test instrument. There are many other uses of PCA and FA.

Data Reduction
Ossenkopp and Mazmanian (Physiology and Behavior, 34: 935-941). 19 behavioral and physiological variables. A single criterion variable: physiological response to four hours of cold restraint. Extracted five factors. Used multiple regression to develop a model for predicting the criterion from the five factors.

Exploratory Factor Analysis


Want to discover the pattern of intercorrelations among variables. Wilt et al., 2005 (thesis). Variables are items on the SOIS at ECU. Found two factors: one evaluative, one on the difficulty of the course. Compared FTF students to DE students on structure and means.

Confirmatory Factor Analysis


Have a theory regarding the factor structure for a set of variables. Want to confirm that the theory describes the observed intercorrelations well. Thurstone: intelligence consists of seven independent factors rather than one global factor. Often done with SEM software.

Construct A Test Instrument


Write a large set of items designed to test the constructs of interest. Administer the survey to a sample of persons from the target population. Use FA to help select those items that will be used to measure each of the constructs of interest. Use Cronbach's alpha to check the reliability of the resulting scales.
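For that last reliability check, here is a minimal numpy sketch of Cronbach's alpha (the function name and the `items` array are hypothetical, not a library call); `items` holds the n x k scores on the items retained for one scale:

    import numpy as np

    def cronbach_alpha(items):
        # items: n cases x k items. Alpha = k/(k-1) * (1 - sum of item variances
        # divided by the variance of the total score).
        k = items.shape[1]
        item_variances = items.var(axis=0, ddof=1)
        total_variance = items.sum(axis=1).var(ddof=1)
        return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)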

An Unusual Use of PCA


Poulson, Braithwaite, Brondino, and Wuensch (1997, Journal of Social Behavior and Personality, 12, 743-758).

Simulated jury trial; a seemingly insane defendant killed a man. Criterion variable = recommended verdict:
Guilty; Guilty But Mentally Ill; Not Guilty By Reason of Insanity.

Predictor variables = jurors' scores on 8 scales. Discriminant function analysis. Problem with multicollinearity. Used PCA to extract eight orthogonal components. Predicted recommended verdict from these 8 components. Transformed the results back to the original scales.

A Simple, Contrived Example


Consumers rate importance of seven characteristics of beer.
Low cost, high size of bottle, high alcohol content, reputation of brand, color, aroma, taste.

Download FACTBEER.SAV from http://core.ecu.edu/psyc/wuenschk/SPSS/SPSS-Data.htm. Analyze, Data Reduction, Factor. Scoot the beer variables into the box.

Click Descriptives and then check Initial Solution, Coefficients, KMO and Bartlett's Test of Sphericity, and Anti-image. Click Continue.

Click Extraction and then select Principal Components, Correlation Matrix, Unrotated Factor Solution, Scree Plot, and Eigenvalues Over 1. Click Continue.

Click Rotation. Select Varimax and Rotated Solution. Click Continue.

Click Options. Select Exclude Cases Listwise and Sorted By Size. Click Continue.

Click OK, and PASW completes the Principal Components Analysis.

Checking for Unique Variables 1


Check the correlation matrix. If there are any variables that are not well correlated with some of the others, you might as well delete them.
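As a rough screen outside of PASW, you can scan the correlation matrix with numpy; the data below are random stand-ins for the beer ratings, and the .30 cutoff is an arbitrary threshold, not a PASW default:

    import numpy as np

    rng = np.random.default_rng(1)
    data = rng.standard_normal((220, 7))    # stand-in for the seven beer ratings

    R = np.corrcoef(data, rowvar=False)     # p x p correlation matrix

    # Flag variables whose largest |r| with any other variable is weak.
    largest_r = np.abs(R - np.eye(R.shape[0])).max(axis=1)
    for j, r in enumerate(largest_r):
        if r < .30:
            print(f"variable {j}: max |r| with the others is only {r:.2f}")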

Checking for Unique Variables 2


Correlation Matrix

            cost    size   alcohol reputat  color   aroma   taste
cost       1.00     .832    .767   -.406    .018   -.046   -.064
size        .832   1.00     .904   -.392    .179    .098    .026
alcohol     .767    .904   1.00    -.463    .072    .044    .012
reputat    -.406   -.392   -.463   1.00    -.372   -.443   -.443
color       .018    .179    .072   -.372   1.00     .909    .903
aroma      -.046    .098    .044   -.443    .909   1.00     .870
taste      -.064    .026    .012   -.443    .903    .870   1.00

Checking for Unique Variables 3


Bartlett's test of sphericity tests the null hypothesis that the correlation matrix is an identity matrix, but it does not help identify individual variables that are not well correlated with others.

KMO and Bartlett's Test

Kaiser-Meyer-Olkin Measure of Sampling Adequacy: .665
Bartlett's Test of Sphericity: Approx. Chi-Square = 1637.9, df = 21, Sig. = .000
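The statistic itself is easy to reproduce from the correlation matrix: chi-square = -(n - 1 - (2p + 5)/6) ln|R| on p(p - 1)/2 degrees of freedom. Continuing the numpy sketch above (`R` and `data` carry over):

    import numpy as np
    from scipy import stats

    def bartlett_sphericity(R, n):
        # Tests the null that the population correlation matrix is an identity matrix.
        p = R.shape[0]
        chi_square = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
        df = p * (p - 1) / 2
        return chi_square, df, stats.chi2.sf(chi_square, df)

    chi_square, df, p_value = bartlett_sphericity(R, n=data.shape[0])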

Checking for Unique Variables 4


For each variable, check the R² between it and the remaining variables. PASW reports these as the initial communalities when you do a principal axis factor analysis. Delete any variable with a low R².
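Continuing the sketch, these squared multiple correlations fall out of the inverse of the correlation matrix: R²_j = 1 - 1/[R⁻¹]_jj.

    import numpy as np

    # Initial communalities: the R^2 of each variable with all of the others.
    smc = 1 - 1 / np.diag(np.linalg.inv(R))
    print(smc)    # consider deleting any variable with a low value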

Checking for Unique Correlations


Look at the partial correlations. Pairs of variables with large partial correlations share variance with one another but not with the remaining variables; this is problematic. Kaiser's MSA will tell you, for each variable, how much of this problem exists. The smaller the MSA, the greater the problem.

Checking for Unique Correlations 2


An MSA of .9 is marvelous; one of .5 is miserable. Variables with small MSAs should be deleted, or additional variables added that will share variance with the troublesome variables.
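Kaiser's MSA can be computed directly from the correlation matrix. A minimal sketch, continuing from above (the anti-image correlations it builds are the ones PASW prints in the output shown next):

    import numpy as np

    def msa(R):
        # Anti-image correlations: partial correlations x -1, from the inverse of R.
        Rinv = np.linalg.inv(R)
        d = 1 / np.sqrt(np.diag(Rinv))
        Q = d[:, None] * Rinv * d[None, :]
        np.fill_diagonal(Q, 0.0)
        R0 = R - np.diag(np.diag(R))      # zero out the main diagonal of R
        r2 = (R0 ** 2).sum(axis=0)        # squared correlations, per variable
        q2 = (Q ** 2).sum(axis=0)         # squared partial correlations, per variable
        per_variable = r2 / (r2 + q2)     # one MSA per variable
        overall_kmo = r2.sum() / (r2.sum() + q2.sum())
        return per_variable, overall_kmo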

Checking for Unique Correlations 3


Anti-image Correlation Matrix

            cost     size    alcohol reputat  color    aroma    taste
cost        .779a   -.543     .105    .256     .100     .135    -.105
size       -.543     .550a   -.806   -.109    -.495     .061     .435
alcohol     .105    -.806     .630a   .226     .381    -.060    -.310
reputat     .256    -.109     .226    .763a   -.231     .287     .257
color       .100    -.495     .381   -.231     .590a   -.574    -.693
aroma       .135     .061    -.060    .287    -.574     .801a   -.087
taste      -.105     .435    -.310    .257    -.693    -.087     .676a

a. Measures of Sampling Adequacy (MSA) on the main diagonal. Off-diagonal values are partial correlations x -1.

Extracting Principal Components 1


From p variables we can extract p components. Each of the p eigenvalues represents the amount of standardized variance that has been captured by one component. The first component accounts for the largest possible amount of variance. The second captures as much as possible of what is left over, and so on. Each component is orthogonal to the others.

Extracting Principal Components 2


Each variable has standardized variance = 1, so the total standardized variance in the p variables = p. The sum of the m = p eigenvalues = p: all of the variance is extracted. For each component, the proportion of variance extracted = eigenvalue / p.
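Continuing the numpy sketch, the eigenvalues come straight from the correlation matrix:

    import numpy as np

    eigenvalues, eigenvectors = np.linalg.eigh(R)    # eigh, since R is symmetric
    order = np.argsort(eigenvalues)[::-1]            # sort largest-first
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

    p = R.shape[0]
    print(eigenvalues.sum())      # = p: all of the variance is extracted
    print(eigenvalues / p)        # proportion of variance per component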

Extracting Principal Components 3


For our beer data, here are the eigenvalues and proportions of variance for the seven components:
Initial Eigenvalues

Component   Total   % of Variance   Cumulative %
1           3.313      47.327          47.327
2           2.616      37.369          84.696
3            .575       8.209          92.905
4            .240       3.427          96.332
5            .134       1.921          98.252
6            .09        1.221          99.473
7            .04         .527         100.000

Extraction Method: Principal Component Analysis.

How Many Components to Retain


From p variables we can extract p components, but we probably want fewer than p. Simple rule: keep as many as have eigenvalues ≥ 1. A component with an eigenvalue < 1 captured less than one variable's worth of variance.
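Continuing the sketch, the rule is one line:

    m = int((eigenvalues >= 1).sum())   # components with at least one variable's
                                        # worth of variance; 2 for the beer data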

Visual Aid: Use a Scree Plot. "Scree" is the rubble at the base of a cliff. For our beer data:
[Scree plot: eigenvalue (y-axis, 0.0 to 3.5) plotted against component number (x-axis, 1 to 7).]

Only the first two components have eigenvalues greater than 1. There is a big drop in eigenvalue between component 2 and component 3. Components 3-7 are scree. Try a two-component solution. One should also look at the solutions with one fewer and with one more component.

Loadings, Unrotated and Rotated


Loading matrix = factor pattern matrix = component matrix. Each loading is the Pearson r between one variable and one component. Since the components are orthogonal, each loading is also a weight for predicting the variable from the components. Here are the unrotated loadings for our two-component solution:

Component Matrix(a)

           Component 1   Component 2
COLOR         .760          -.576
AROMA         .736          -.614
REPUTAT      -.735          -.071
TASTE         .710          -.646
COST          .550           .734
ALCOHOL       .632           .699
SIZE          .667           .675

Extraction Method: Principal Component Analysis.
a. 2 components extracted.

All variables load well on the first component, which pits economy and quality against reputation. The second component is more interesting: economy versus quality.
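Outside of PASW, an unrotated loading matrix like this one is just the eigenvectors scaled by the square roots of their eigenvalues. Continuing the numpy sketch:

    import numpy as np

    m = 2                                                # retained components
    A = eigenvectors[:, :m] * np.sqrt(eigenvalues[:m])   # p x m loading matrix
    # A[j, k] is the Pearson r between variable j and component k.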

Rotate these axes so that the two dimensions pass more nearly through the two major clusters (COST, SIZE, ALCOHOL and COLOR, AROMA, TASTE). The number of degrees by which the axes are rotated is the angle psi (ψ). For these data, rotating the axes -40.63 degrees has the desired effect.
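That rotation is just post-multiplication of the loading matrix by a 2 x 2 rotation matrix. A minimal sketch, continuing from above (the sign convention for psi depends on the direction of rotation):

    import numpy as np

    psi = np.deg2rad(-40.63)
    T = np.array([[np.cos(psi), -np.sin(psi)],
                  [np.sin(psi),  np.cos(psi)]])
    A_rotated = A @ T     # loadings on the rotated axes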

Component 1 = Quality versus reputation. Component 2 = Economy (or cheap drunk) versus reputation.
Rotated Component Matrix(a)

           Component 1   Component 2
TASTE         .960          -.028
AROMA         .958           .01
COLOR         .952           .06
SIZE          .07            .947
ALCOHOL       .02            .942
COST         -.061           .916
REPUTAT      -.512          -.533

Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 3 iterations.

Number of Components in the Rotated Solution


Try extracting one fewer component, and try one more component. Which produces the more sensible solution? Error = the difference between the obtained structure and the true structure. Overextraction (too many components) produces less error than underextraction. If there is only one true factor and there are no unique variables, you can get factor splitting.

In this case, the first unrotated factor ≈ the true factor. But rotation splits the factor, producing an imaginary second factor and corrupting the first. One can avoid this problem by including a garbage variable that will be removed prior to the final solution.

Explained Variance
Square the loadings and then sum them across variables. This gives, for each component, the amount of variance it explains. Prior to rotation, these sums are the eigenvalues. Here are the SSL for our data, after rotation:

Total Variance Explained

            Rotation Sums of Squared Loadings
Component   Total   % of Variance   Cumulative %
1           3.017      43.101          43.101
2           2.912      41.595          84.696

Extraction Method: Principal Component Analysis.

After rotation the two components together account for (3.02 + 2.91) / 7 = 85% of the total variance.
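Continuing the numpy sketch, the SSL are the column sums of the squared rotated loadings:

    ssl = (A_rotated ** 2).sum(axis=0)      # one SSL per component
    print(ssl.sum() / A_rotated.shape[0])   # proportion of total variance retained;
                                            # (3.02 + 2.91) / 7 = .85 for the beer data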

If the last component has a small SSL, one should consider dropping it. If SSL = 1, the component has extracted only one variable's worth of variance. If only one variable loads well on a component, the component is not well defined. If only two load well, the component may be reliable, provided the two variables are highly correlated with one another but not with the other variables.

Naming Components
For each component, look at how it is correlated with the variables. Try to name the construct represented by that factor. If you cannot, perhaps you should try a different solution. I have named our components "aesthetic quality" and "cheap drunk."

Communalities
For each variable, sum the squared loadings across components. This gives you the R² for predicting the variable from the components, which is the proportion of the variable's variance that has been extracted by the components.

Here are the communalities for our beer data. "Initial" is with all 7 components; "Extraction" is for our 2-component solution.
Communalities

           Initial   Extraction
COST        1.000       .842
SIZE        1.000       .901
ALCOHOL     1.000       .889
REPUTAT     1.000       .546
COLOR       1.000       .910
AROMA       1.000       .918
TASTE       1.000       .922

Extraction Method: Principal Component Analysis.
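Continuing the sketch, the communalities are the row sums of the squared loadings; they are unchanged by orthogonal rotation, so A and A_rotated give the same values:

    h2 = (A_rotated ** 2).sum(axis=1)   # one communality per variable
    print(h2)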

Orthogonal Rotations
Varimax -- minimizes the complexity of the components by making the large loadings larger and the small loadings smaller within each component. Quartimax -- makes large loadings larger and small loadings smaller within each variable. Equamax -- a compromise between these two.
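For intuition, here is a minimal sketch of this family of rotations using the standard SVD-based iteration; it is not the PASW implementation and omits the Kaiser normalization PASW applies by default. Setting gamma = 1 gives varimax, gamma = 0 quartimax, and gamma = m/2 equamax.

    import numpy as np

    def orthogonal_rotate(A, gamma=1.0, max_iter=100, tol=1e-6):
        # A is the p x m unrotated loading matrix.
        p, m = A.shape
        T = np.eye(m)                 # accumulates the rotation
        criterion = 0.0
        for _ in range(max_iter):
            L = A @ T                 # current rotated loadings
            G = A.T @ (L ** 3 - (gamma / p) * L @ np.diag((L ** 2).sum(axis=0)))
            u, s, vt = np.linalg.svd(G)
            T = u @ vt                # nearest orthogonal matrix to G
            if s.sum() - criterion < tol:
                break                 # criterion has stopped improving
            criterion = s.sum()
        return A @ T

    A_varimax = orthogonal_rotate(A, gamma=1.0)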

Oblique Rotations
Axes drawn through the two clusters in the upper right quadrant would not be perpendicular.

May better fit the data with axes that are not perpendicular, but at the cost of having components that are correlated with one another. More on this later.
