$C_i = W_{i1}X_1 + W_{i2}X_2 + \cdots + W_{ip}X_p$
Each variable is a weighted linear combination of the components.
$X_j = A_{1j}C_1 + A_{2j}C_2 + \cdots + A_{mj}C_m$
$X_j = A_{1j}F_1 + A_{2j}F_2 + \cdots + A_{mj}F_m + U_j$
$U_j$ is the unique variance for $X_j$.
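The component equations above can be checked numerically. The following is a minimal sketch using NumPy on synthetic data (not the beer data). It assumes standardized variables and unit-variance components, and the names `W`, `A`, and `C` simply mirror the symbols in the equations. With all m = p components retained and no unique term, the variables are recovered exactly from the components.

```python
import numpy as np

rng = np.random.default_rng(0)            # synthetic data, for illustration only
X = rng.normal(size=(100, 4))
X[:, 1] += X[:, 0]                        # make some variables correlated
X[:, 3] += X[:, 2]

# Standardize, then decompose the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(Z, rowvar=False)
eigvals, V = np.linalg.eigh(R)            # eigh because R is symmetric
order = np.argsort(eigvals)[::-1]         # largest eigenvalue first
eigvals, V = eigvals[order], V[:, order]

W = V / np.sqrt(eigvals)                  # weights: column i gives W_i1 ... W_ip
A = V * np.sqrt(eigvals)                  # loadings: row j gives A_1j ... A_mj
C = Z @ W                                 # C_i = W_i1 X_1 + ... + W_ip X_p

Z_rebuilt = C @ A.T                       # X_j = A_1j C_1 + ... + A_mj C_m
print(np.allclose(Z, Z_rebuilt))          # variables recovered from components
print(np.round(np.cov(C, rowvar=False), 3))  # components are uncorrelated
```

Scaling the eigenvectors by the square roots of the eigenvalues is what makes the entries of `A` correlations between variables and components, which is how loadings are read later in these slides.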
Data Reduction
Ossenkopp and Mazmanian (Physiology and Behavior, 34: 935-941). 19 behavioral and physiological variables. A single criterion variable: physiological response to four hours of cold-restraint. Extracted five factors. Used multiple regression to develop a model for predicting the criterion from the five factors.
Simulated jury trial: a seemingly insane defendant killed a man. Criterion variable = recommended verdict:
Guilty, Guilty But Mentally Ill, or Not Guilty By Reason of Insanity.
Predictor variables = jurors' scores on 8 scales. Discriminant function analysis. Problem with multicollinearity. Used PCA to extract eight orthogonal components. Predicted recommended verdict from these 8 components. Transformed results back to the original scales.
Download FACTBEER.SAV from http://core.ecu.edu/psyc/wuenschk/SPSS/SPSS-Data.htm and then click Analyze, Data Reduction, Factor. Scoot the beer variables into the box.
Click Descriptives and then check Initial Solution, Coefficients, KMO and Bartlett's Test of Sphericity, and Anti-image. Click Continue.
Click Extraction and then select Principal Components, Correlation Matrix, Unrotated Factor Solution, Scree Plot, and Eigenvalues Over 1. Click Continue.
Click Options. Select Exclude Cases Listwise and Sorted By Size. Click Continue.
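Anti-image Correlation Matrix (table not recovered; one surviving entry: reputat, .665)
a. Measures of Sampling Adequacy (MSA) are on the main diagonal. Off-diagonal elements are the partial correlations multiplied by -1.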
Visual Aid: Use a Scree Plot. Scree is the rubble at the base of a cliff. For our beer data:
[Figure: Scree Plot. Eigenvalue (vertical axis, 0.0 to 3.5) plotted against Component Number (horizontal axis, 1 to 7).]
Only the first two components have eigenvalues greater than 1. Big drop in eigenvalue between component 2 and component 3. Components 3-7 are scree. Try a 2 component solution. Should also look at solution with one fewer and with one more component.
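The eigenvalue-over-1 rule and the scree check can be sketched as follows. This is an illustration on synthetic data, not FACTBEER.SAV: the two latent dimensions, the noise levels, and the seed are all assumptions, chosen only so that seven variables fall into two clusters plus one variable related negatively to both.

```python
import numpy as np

rng = np.random.default_rng(1)             # synthetic data, NOT the beer data
n = 200
f1, f2 = rng.normal(size=(2, n))           # two hypothetical latent dimensions
e = rng.normal(size=(n, 7))                # independent noise for each variable

X = np.column_stack([
    f1 + 0.3 * e[:, 0], f1 + 0.3 * e[:, 1], f1 + 0.3 * e[:, 2],   # cluster 1
    f2 + 0.3 * e[:, 3], f2 + 0.3 * e[:, 4], f2 + 0.3 * e[:, 5],   # cluster 2
    -0.5 * f1 - 0.5 * f2 + 0.7 * e[:, 6],  # negatively related to both clusters
])

R = np.corrcoef(X, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]   # largest first

n_keep = int(np.sum(eigvals > 1.0))        # eigenvalue-over-1 rule
drops = eigvals[:-1] - eigvals[1:]         # successive drops, as read off a scree plot

print("eigenvalues:", np.round(eigvals, 3))
print("components with eigenvalue > 1:", n_keep)
print("largest drop follows component", int(np.argmax(drops)) + 1)
```

Because the analysis is on the correlation matrix, the eigenvalues sum to the number of variables, so an eigenvalue of 1 corresponds to one variable's worth of variance.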
Component Matrix (a) (variable labels not recovered)

        Component
        1        2
      .760    -.576
      .736    -.614
     -.735    -.071
      .710    -.646
      .550     .734
      .632     .699
      .667     .675

a. 2 components extracted.
All variables load well on first component, economy and quality vs. reputation. Second component is more interesting, economy versus quality.
Rotate these axes so that the two dimensions pass more nearly through the two major clusters (COST, SIZE, ALCH and COLOR, AROMA, TASTE). The number of degrees by which I rotate the axes is the angle PSI. For these data, rotating the axes -40.63 degrees has the desired effect.
Component 1 = Quality versus reputation. Component 2 = Economy (or cheap drunk) versus reputation.
Rotated Component Matrix (a) (variable labels not recovered)

        Component
        1        2
      .960    -.028
      .958     .01
      .952     .06
      .07      .947
      .02      .942
     -.061     .916
     -.512    -.533

Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization.
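As a check on the rotation described above, the sketch below applies a rigid rotation of -40.63 degrees to the unrotated loadings. The sign convention in `rotate` is an assumption, chosen because it reproduces the reported rotated loadings to within rounding. The rows are in the order of the unrotated matrix; the rotated matrix is sorted by size, so its rows appear in a different order.

```python
import math

# Unrotated loadings (component 1, component 2), in the row order of the
# unrotated component matrix.
unrotated = [
    (0.760, -0.576),
    (0.736, -0.614),
    (-0.735, -0.071),
    (0.710, -0.646),
    (0.550, 0.734),
    (0.632, 0.699),
    (0.667, 0.675),
]

psi = math.radians(-40.63)                 # rotation angle PSI from the slide
c, s = math.cos(psi), math.sin(psi)

def rotate(a1, a2):
    """Loadings of one variable on the rotated axes (sign convention assumed)."""
    return (a1 * c + a2 * s, -a1 * s + a2 * c)

rotated = [rotate(a1, a2) for a1, a2 in unrotated]

for (a1, a2), (r1, r2) in zip(unrotated, rotated):
    print(f"({a1:6.3f}, {a2:6.3f}) -> ({r1:6.3f}, {r2:6.3f})")
```

Each variable's squared distance from the origin is unchanged by the rotation, which is why the rotation redistributes variance between the two components without changing the total explained.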
In this case, the first unrotated factor is the true factor. But rotation splits the factor, producing an imaginary second factor and corrupting the first. Can avoid this problem by including a garbage variable that will be removed prior to the final solution.
Explained Variance
Square the loadings and then sum them across variables to get, for each component, the amount of variance explained. Prior to rotation, these sums are the eigenvalues. Here are the sums of squared loadings (SSL) for our data, after rotation:
Rotation Sums of Squared Loadings

Component    Total    % of Variance    Cumulative %
1            3.017    43.101           43.101
2            2.912    41.595           84.696
After rotation the two components together account for (3.02 + 2.91) / 7 = 85% of the total variance.
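The arithmetic above can be reproduced from the rotated loadings. In this small sketch the loadings are the rounded values from the rotated component matrix (entries printed by SPSS as 1.E-02 and the like are entered as .01, .06, .07, and .02), so the sums agree with the SPSS totals only to about three decimals.

```python
# Rotated loadings (component 1, component 2), rounded as printed by SPSS.
rotated = [
    (0.960, -0.028),
    (0.958, 0.01),
    (0.952, 0.06),
    (0.07, 0.947),
    (0.02, 0.942),
    (-0.061, 0.916),
    (-0.512, -0.533),
]

n_vars = len(rotated)   # each standardized variable contributes 1 unit of variance

# Sum of squared loadings (SSL): square each loading, sum down each column.
ssl = [sum(row[k] ** 2 for row in rotated) for k in range(2)]
pct = [100 * value / n_vars for value in ssl]
pct_total = sum(pct)

for k in range(2):
    print(f"Component {k + 1}: SSL = {ssl[k]:.3f}, % of variance = {pct[k]:.1f}")
print(f"Cumulative % = {pct_total:.1f}")
```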
If the last component has a small SSL, one should consider dropping it. If SSL = 1, the component has extracted one variable's worth of variance. If only one variable loads well on a component, the component is not well defined. If only two load well, it may be reliable if the two variables are highly correlated with one another but not with other variables.
Naming Components
For each component, look at how it is correlated with the variables. Try to name the construct represented by that factor. If you cannot, perhaps you should try a different solution. I have named our components aesthetic quality and cheap drunk.
Communalities
For each variable, sum the squared loadings across components. This gives you the $R^2$ for predicting the variable from the components, which is the proportion of the variable's variance that has been extracted by the components.
Here are the communalities for our beer data. Initial is with all 7 components, Extraction is for our 2 component solution.
[Table: Communalities (not recovered)]
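The extraction communalities for the two-component solution can be approximated directly from the loadings. The sketch below uses the rounded unrotated loadings shown earlier, in the row order of that matrix, so the results are computed approximations rather than the values printed in the SPSS table.

```python
# Unrotated loadings (component 1, component 2), in the row order of the
# unrotated component matrix.
unrotated = [
    (0.760, -0.576),
    (0.736, -0.614),
    (-0.735, -0.071),
    (0.710, -0.646),
    (0.550, 0.734),
    (0.632, 0.699),
    (0.667, 0.675),
]

# Communality: sum of squared loadings across components, one per variable.
communalities = [a1 ** 2 + a2 ** 2 for a1, a2 in unrotated]
total = sum(communalities)

for row, h2 in enumerate(communalities, start=1):
    print(f"row {row}: communality = {h2:.3f}")
print(f"sum = {total:.3f} ({100 * total / len(unrotated):.1f}% of total variance)")
```

Summing the squared loadings by row gives communalities, while summing by column gives the SSL, so the two sets of sums share the same grand total. An orthogonal rotation leaves each communality unchanged.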
Orthogonal Rotations
Varimax -- minimizes the complexity of the components by making the large loadings larger and the small loadings smaller within each component. Quartimax -- makes large loadings larger and small loadings smaller within each variable. Equamax -- a compromise between these two.
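For readers who want to see what a varimax rotation does numerically, here is a minimal sketch of a commonly used SVD-based iteration, applied to the unrotated beer loadings. It is not SPSS's implementation. In particular it omits Kaiser normalization, so the result need not match the SPSS output exactly, and the columns may come out in a different order or with flipped signs. The function names `varimax` and `simplicity` are hypothetical helpers introduced for this illustration.

```python
import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-8):
    """Orthogonal rotation from the orthomax family (gamma=1: varimax,
    gamma=0: quartimax). Returns the rotated loadings and the rotation matrix."""
    p, k = loadings.shape
    T = np.eye(k)
    prev = 0.0
    for _ in range(max_iter):
        L = loadings @ T
        target = L ** 3 - (gamma / p) * L @ np.diag((L ** 2).sum(axis=0))
        u, sv, vt = np.linalg.svd(loadings.T @ target)
        T = u @ vt                         # nearest orthogonal matrix
        if sv.sum() < prev * (1 + tol):    # stop when the criterion stalls
            break
        prev = sv.sum()
    return loadings @ T, T

def simplicity(L):
    """Varimax criterion: variance of squared loadings within each component."""
    return float(np.var(L ** 2, axis=0).sum())

A = np.array([
    [0.760, -0.576], [0.736, -0.614], [-0.735, -0.071], [0.710, -0.646],
    [0.550, 0.734], [0.632, 0.699], [0.667, 0.675],
])

A_rot, T = varimax(A)
print(np.round(A_rot, 3))
print("criterion before:", round(simplicity(A), 4))
print("criterion after: ", round(simplicity(A_rot), 4))
```

The criterion printed here is the "large loadings larger, small loadings smaller within each component" idea expressed as a number: the more polarized the squared loadings in a column, the larger its variance.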
Oblique Rotations
Axes drawn through the two clusters in the upper right quadrant would not be perpendicular.
May better fit the data with axes that are not perpendicular, but at the cost of having components that are correlated with one another. More on this later.