
Subject: Advanced Quantitative Methods

Assignment # 01

Factor Analysis
Using SPSS Statistics 21

Submitted to:

Dr. Syed Asim

Submitted by:

Mohammad Zainullah

Dated: June 27, 2015

Qurtuba University, Peshawar, KPK


Contents

Introduction
FA Equation
Sample Size
Data Screening
Dataset (wiscsem.sav)
Utilising SPSS
Variable View
Data View
Further Steps
Analyze > Data Reduction > Factor
Extraction
Rotation
The Output
Correlation Matrix
KMO and Bartlett's Test
Communalities
Total Variance Explained
Scree Plot
Factor Matrix
Rotated Factor Matrix
Revised Output
Total Variance Explained
Scree Plot
Rotated Factor Matrix
Factor Scores as two new variables
Conclusion
References

Factor Analysis

Introduction
Factor analysis (FA) identifies "invisible" factors representing the hidden organization or
"organizing principle" of whatever is being measured with a number of observable measures or
scales (Navarro, 2006). In the illustrative example we have "Verbal IQ" and "Performance IQ" as
the hidden organization or factors, while 11 variables serve as the observable measures or scales.
Practitioners may use FA for a variety of purposes, such as reducing a large number of items from a
questionnaire or survey instrument to a smaller number of components, uncovering latent
dimensions underlying a data set, or examining which items have the strongest association with a
given factor (DiStefano, Zhu, & Mindrila, 2009). Once a researcher has identified the number of
factors or components underlying a data set, he may wish to use the information about the factors
in subsequent analyses (Gorsuch, 1983).
Factor analysis is thus a method of data reduction. Data reduction is achieved by seeking underlying
unobservable (latent) variables that are reflected in the observed (manifest) variables. There are
many different methods for conducting a factor analysis: principal axis factoring, maximum
likelihood, generalized least squares, and unweighted least squares. Similarly, many different types of
rotation can be used after the initial extraction of factors, including orthogonal rotations (e.g.,
varimax and equimax), which require the factors to be uncorrelated, and oblique rotations (e.g.,
promax), which allow the factors to be correlated with one another. Different factor analysis
methods may lead to different results when analyzing the same data set.
I have conducted factor analysis using the maximum likelihood extraction method and varimax
rotation.

FA Equation:
FA is a multivariate, variable-focused dimensionality-reduction technique: it represents the
original variables X1, X2, X3, …, Xn in terms of a smaller number m of underlying factors F1, F2,
F3, …, Fm, where m << n. The underlying factors are latent, hidden, unobservable variables.
Unlike Principal Component Analysis (PCA), FA is based on a proper statistical model, and the ith
original variable Xi is given by

Xi − µi = li1F1 + li2F2 + … + limFm + εi

lim = loading of the mth factor on the ith variable, i.e., the influence of Fm on Xi.
Loadings can be positive or negative (values range from −1 to +1).
εi = error term for the ith variable
A further processed form of the equation, taking the variance of the standardized variable Xi, is

Var(Xi) = li1² + li2² + … + lim² + Ψi = Σj lij² + Ψi

Σj lij² is the communality of the ith variable, also noted hi²; it represents the part of the variance
contributed by the common factors, while Ψi is the unique (error) variance. The communality is
like R²: the higher the communality value, the better.
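To make the decomposition concrete, here is a minimal Python sketch (the loadings are hypothetical, not taken from the analysis below) that computes a variable's communality and uniqueness from its loadings:

```python
import numpy as np

# Hypothetical loadings of one standardized variable Xi on m = 2 factors.
loadings = np.array([0.72, -0.34])

communality = np.sum(loadings ** 2)  # hi^2 = sum of squared loadings
uniqueness = 1.0 - communality       # Psi_i, since Var(Xi) = 1 when Xi is standardized

print(f"hi^2 = {communality:.3f}, Psi_i = {uniqueness:.3f}")
# hi^2 = 0.634, Psi_i = 0.366
```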

Sample Size
Field (2005) reviews many suggestions about the sample size necessary for factor analysis and
concludes that it depends on many things. In general, over 300 cases is probably adequate, but
communalities after extraction should probably be above 0.5 (Field, 2005). Tabachnick and Fidell
(2001, p. 588) cite Comrey and Lee (1992) regarding sample size: 50 cases is very poor, 100 is
poor, 200 is fair, 300 is good, 500 is very good, and 1000 or more is excellent. As a rule of thumb,
a minimum of 10 observations per variable is necessary to avoid computational difficulties; with
the 11 variables used here, that implies at least 110 cases.

Data Screening
It is important to look at the correlations between variables first. If our test questions measure the
same underlying dimension(s), we would expect them to correlate with each other within
reasonable limits. Variables represent questions, so if we find variables that do not correlate with
any other variables, or correlate with only very few, we should consider excluding them before the
factor analysis is conducted. The correlations between variables can be determined using a
correlation matrix of all variables. The opposite problem arises when variables correlate too highly:
we have to avoid extreme multicollinearity and perfect correlation. A scripted version of this
screening is sketched below.
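A minimal sketch, assuming the pandas package (with pyreadstat installed for reading .sav files) and the variable labels tabulated in the next section:

```python
import pandas as pd

# Load the SPSS file (pandas.read_spss requires the pyreadstat package).
df = pd.read_spss("wiscsem.sav")
items = ["info", "comp", "arith", "simil", "vocab", "digit",
         "pictcomp", "parang", "block", "object", "coding"]

corr = df[items].corr()
print(corr.round(3))

# Flag variable pairs at risk of extreme multicollinearity (|r| >= 0.9).
high = (corr.abs() >= 0.9) & (corr.abs() < 1.0)
print("Problematic pairs:",
      [(a, b) for a in items for b in items if a < b and high.loc[a, b]])
```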

Dataset (wiscsem.sav)
The following example demonstrates factor analysis (FA) of 11 subtests of the Wechsler Intelligence
Scale for Children (WISC). The model assesses the relationship between the indicators of IQ and
two potential underlying constructs or factors representing IQ: the Verbal IQ and the
Performance IQ.
The Wechsler Intelligence Scale (WIS) is a test designed to measure intelligence in adults and older
adolescents. It is currently in its fourth edition (WAIS-IV), released in 2008 by Pearson. A revised
form of the test, the WAIS-R, was released in 1981 and consisted of six verbal and five
performance subtests. The verbal tests are: “Information, Comprehension, Arithmetic, Digit Span,
Similarities, and Vocabulary” and the Performance subtests are: “Picture Arrangement, Picture
Completion, Block Design, Object Assembly, and Digit Symbol”, which are used as variables in the
factor analysis to follow. The question was whether we can reproduce the Verbal vs. Nonverbal
distinction, with the appropriate subtests grouping into each category (Verbal IQ, Performance IQ),
using factor analysis. In the illustrative factor analysis, a "Verbal IQ" factor and a "Performance
IQ" factor were indeed obtained as the two final extracted factors.

The dataset “wiscsem.sav” incorporating subscale scores for the Wechsler Intelligence Scale for
Children is downloaded from the website given as under:
http://psych.colorado.edu/~carey/Courses/PSYC7291/ClassDataSets.htm

The 11 variables and the corresponding labels are tabulated as under:

Wechsler Intelligence Scale (Revised Form)

Variable                 Label      Factor Group
Information              info       Verbal IQ
Comprehension            comp       Verbal IQ
Arithmetic               arith      Verbal IQ
Similarities             simil      Verbal IQ
Vocabulary               vocab      Verbal IQ
Digit Span               digit      Verbal IQ
Picture Completion       pictcomp   Performance IQ
Paragraph Arrangement    parang     Performance IQ
Block Design             block      Performance IQ
Object Assembly          object     Performance IQ
Coding                   coding     Performance IQ

Utilising SPSS
After placing the dataset file "wiscsem.sav" in the folder C:\Documents and
Settings\Administrator\Desktop\FA June 23 Important\WAIS-R, I opened the file within the
SPSS 21 environment. The Variable View, the Data View and the subsequent steps are shown below:

Variable View

Data View

Further Steps
To conduct FA after starting SPSS 21, I selected the "Analyze" menu and then chose
"Data Reduction", since FA is intended to reduce the complexity in a set of data, and then chose
"Factor" for FA; that is, Analyze > Data Reduction > Factor, as shown in the figure given
below:

Next we need to select an "extraction method" and a "rotation method." Hit the "Extraction"
button to specify the extraction method.

Extraction Selection:

In this dialog box, I left the box labeled "Un-rotated factor solution" in its default setting, and
checked the "Scree plot" box to obtain a scree diagram, which is one of the visual ways to decide
how many factors to extract.

In the "Extract" section, the default setting is to use the Kaiser stopping criterion (i.e., retain all
factors with eigenvalues greater than 1) to decide how many factors to extract. We can opt for
factors having a higher eigenvalue by setting the value in the specified field. Alternatively, if we
already know the number of factors to extract, we can put that number into the box.
After clicking Continue, the main dialog will be in focus again for rotation selection.

Rotation Selection:

Clicking the "Rotation" button leads us to choose a "rotation method" for our factor analysis. A
rotation method makes the factors as distinct from each other as possible, and thus helps to interpret
the factors by loading each variable primarily on one of them. We have to decide whether we want
an "orthogonal" solution (e.g., Varimax, Quartimax, Equimax), in which the factors are not
correlated with each other, or an "oblique" solution (e.g., Direct Oblimin, Promax), in which the
factors are allowed to correlate with one another. I have used the Varimax method for factor rotation.

The "Rotated solution" checkbox is also checked, so that we obtain the factor loadings for each
individual variable in our dataset and can make up names for the different factors. A scripted
parallel to all of these choices is sketched below.
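For readers who prefer a scriptable route, the following is a rough Python parallel to these dialog choices, using the open-source factor_analyzer and pandas packages (an assumption on my part; the analysis in this report was done entirely in the SPSS 21 GUI). Its maximum likelihood estimates and varimax rotation should be close to, though not necessarily identical to, SPSS's output.

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer

df = pd.read_spss("wiscsem.sav")
items = ["info", "comp", "arith", "simil", "vocab", "digit",
         "pictcomp", "parang", "block", "object", "coding"]

# Maximum likelihood extraction with varimax rotation; 3 factors,
# mirroring the Kaiser criterion result reported below.
fa = FactorAnalyzer(n_factors=3, method="ml", rotation="varimax")
fa.fit(df[items])

print(pd.DataFrame(fa.loadings_, index=items))  # rotated factor loadings
print(fa.get_communalities())                   # communality of each variable
```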
Hit "Continue" in the sub-dialog, and then "OK" in the main dialog, to see the output:

SPSS Output:

[DataSet1] C:\Documents and Settings\Administrator\Desktop\WISE\wiscsem.sav

Correlation Matrix

(Column headers abbreviated to the variable labels.)

                        info   comp  arith  simil  vocab  digit  pictcomp parang  block object coding
Information            1.000   .467   .494   .513   .625   .345    .230    .202   .229   .185   .007
Comprehension           .467  1.000   .392   .510   .531   .236    .407    .187   .369   .322   .061
Arithmetic              .494   .392  1.000   .369   .387   .269    .155    .227   .272   .043   .090
Similarities            .513   .510   .369  1.000   .538   .260    .369    .298   .261   .269  -.041
Vocabulary              .625   .531   .387   .538  1.000   .294    .285    .132   .297   .185   .100
Digit Span              .345   .236   .269   .260   .294  1.000    .075    .148   .073   .035   .173
Picture Completion      .230   .407   .155   .369   .285   .075   1.000    .249   .382   .363  -.072
Paragraph Arrangement   .202   .187   .227   .298   .132   .148    .249   1.000   .351   .253   .038
Block Design            .229   .369   .272   .261   .297   .073    .382    .351  1.000   .399   .107
Object Assembly         .185   .322   .043   .269   .185   .035    .363    .253   .399  1.000   .053
Coding                  .007   .061   .090  -.041   .100   .173   -.072    .038   .107   .053  1.000

As noted earlier, it is important to look at the correlations between variables first, which can be
done using a correlation matrix of all variables. Collinearity is a problem if any variable's
correlation with another is 0.9 or higher; we also have to avoid extreme multicollinearity and
perfect correlation. Looking at the correlation matrix, I did not find any such problems.

KMO and Bartlett's Test

Kaiser-Meyer-Olkin Measure of Sampling Adequacy       .828

Bartlett's Test of Sphericity   Approx. Chi-Square    502.886
                                df                     55
                                Sig.                   .000

Kaiser-Meyer-Olkin Measure of Sampling Adequacy - The KMO measure varies between 0 and 1,
and values closer to 1 are better. A value of 0.6 is a suggested minimum; we have a KMO value of
0.828, which is close to 1 and reflects data adequacy for factor analysis, so we can go ahead with
the analysis. Kaiser (1974) recommends accepting values greater than 0.5. Furthermore, values
between 0.5 and 0.7 are mediocre, values between 0.7 and 0.8 are good, values between 0.8 and 0.9
are great, and values above 0.9 are superb (Hutcheson & Sofroniou, 1999).

Bartlett's Test of Sphericity - It tests the null hypothesis that the correlation matrix is an identity
matrix. The p-value (.000, i.e., < .001) is significant, so we can reject the null hypothesis that the
correlation matrix is an identity matrix and proceed to conduct factor analysis.
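Both diagnostics can also be reproduced outside SPSS. A minimal sketch, assuming the factor_analyzer package and the df and items objects from the earlier sketches (the figures should be close to SPSS's, though implementations can differ slightly):

```python
from factor_analyzer.factor_analyzer import (calculate_bartlett_sphericity,
                                             calculate_kmo)

chi_square, p_value = calculate_bartlett_sphericity(df[items])
kmo_per_item, kmo_total = calculate_kmo(df[items])

print(f"Bartlett chi-square = {chi_square:.3f}, p = {p_value:.4g}")
print(f"Overall KMO = {kmo_total:.3f}")  # should be near SPSS's .828
```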

Communalities

                         Initial   Extraction
Information              .514      .637
Comprehension            .448      .506
Arithmetic               .336      .401
Similarities             .451      .545
Vocabulary               .515      .585
Digit Span               .180      .204
Picture Completion       .299      .444
Paragraph Arrangement    .210      .196
Block Design             .336      .666
Object Assembly          .268      .339
Coding                   .087      .087
Extraction Method: Maximum Likelihood.

Communalities - These are estimates of that part of the variability in each variable that is shared
with the others.

Initial - The individual communalities tell how well the model is working for the individual
variables, and the total communality gives an overall assessment of performance. Communalities
less than 0.5 (inadequate) may be due to sample sizes well below 300 cases (Field, 2005).

Extraction - The values in this column indicate the proportion of each variable's variance that can
be explained by the retained factors (F1, F2, F3). Variables with high values are well represented in
the common factor space, while variables with low values are not well represented. They are the
reproduced variances from the factors that we have extracted. We can find these values on the
diagonal of the reproduced correlation matrix.

The communality for the ith variable is computed by taking the sum of the squared loadings for
that variable, i.e., hi² = Σj lij², as below.

For example, to compute the communality for the original variable "Information", we square its
factor loadings (from the Factor Matrix) and then add them up:

(0.719)² + (−0.340)² + (0.065)² = 0.637, and so on for the other variables
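The same arithmetic as a quick Python check (loadings copied from the Factor Matrix shown below):

```python
import numpy as np

info_loadings = np.array([0.719, -0.340, 0.065])  # Information's loadings on F1-F3
print(round(np.sum(info_loadings ** 2), 3))        # 0.637, matching the table
```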

We can think of these values as multiple R² values for regression models predicting the variables
of interest from the 3 factors. In other words, if we regress the original variable "Information" on
the three common factors, we obtain R² = 0.637, indicating that about 64% of the variation in
"Information" is explained by the factor model. The results in the table given above suggest that
the factor analysis does well at explaining the variation in the "Information, Comprehension,
Similarities, and Vocabulary" variables.
So, one assessment of how well this model is doing can be obtained from the communalities:
values close to one indicate that the model explains most of the variation for those variables. In
this case, the model does better for some variables than for others. It explains "Block Design" best,
followed by variables such as "Information, Comprehension, Similarities and Vocabulary".
However, for variables such as "Digit Span" and "Paragraph Arrangement" the model does not do
a good job, explaining only about one fifth of the variation.

Total Variance Explained

Factor   Initial Eigenvalues           Extraction Sums of            Rotation Sums of
                                       Squared Loadings              Squared Loadings
         Total   % of Var   Cum %     Total   % of Var   Cum %      Total   % of Var   Cum %
1        3.829   34.806     34.806    3.330   30.271     30.271     2.399   21.811     21.811
2        1.442   13.109     47.915     .876    7.959     38.231     1.800   16.365     38.176
3        1.116   10.147     58.062     .404    3.669     41.900      .410    3.724     41.900
4         .890    8.092     66.153
5         .768    6.985     73.138
6         .633    5.753     78.891
7         .595    5.412     84.303
8         .522    4.749     89.051
9         .471    4.281     93.332
10        .419    3.806     97.138
11        .315    2.862    100.000
Extraction Method: Maximum Likelihood.


Factor - The initial number of factors is the same as the number of variables (11) used in the factor
analysis. However, not all 11 factors will be retained; with the help of Kaiser's rule or the scree
plot, the important factors will be extracted and retained.
Initial Eigenvalues - Eigenvalues are the variances of the factors. Each variable has a variance of
1, as the variables are standardized, and the total variance is equal to the number of variables used
in the analysis, which is 11.
Total - This column contains the eigenvalues. The first factor will always account for the most
variance and hence have the highest eigenvalue, each successive factor will account for less and less
variance.
% of Variance - This column contains the percent of total variance accounted for by each factor.
So 34.806% of the total variance is accounted for by Factor 1 (3.829/11 = 34.806%), 13.109% by
Factor 2, and 10.147% by Factor 3.
Cumulative % - This column contains the cumulative percentage of variance accounted for by the
current and all preceding factors. For example, the third row shows a value of 58.062. This means
that the first three factors together account for 58.062% of the total variance.
Extraction Sums of Squared Loadings - This section has one row for each retained factor. The
values are based on the common variance rather than the total variance.
Rotation Sums of Squared Loadings - This section represents the distribution of the variance after
the Varimax rotation. Varimax rotation tries to maximize the variance of each of the factors, so the
total amount of variance accounted for is redistributed over the three extracted factors.

Scree Plot:

The scree plot is the graph of eigenvalues against the factor numbers. In the scree plot, the slope of
the curve seems to level out after two factors, whereas Kaiser's rule (eigenvalues > 1) guides us to
retain 3 factors. From the second factor on, the line is almost flat, meaning that each successive
factor accounts for a smaller and smaller amount of the total variance, so retaining two factors is
recommended (Cattell, 1966). The same plot can be reproduced with the sketch below.
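A minimal sketch, assuming matplotlib and the df and items objects from the earlier sketches:

```python
import numpy as np
import matplotlib.pyplot as plt

# Eigenvalues of the correlation matrix, sorted from largest to smallest.
eigvals = np.sort(np.linalg.eigvalsh(df[items].corr()))[::-1]

plt.plot(range(1, len(eigvals) + 1), eigvals, "o-")
plt.axhline(1.0, linestyle="--")  # Kaiser's rule: eigenvalue > 1
plt.xlabel("Factor number")
plt.ylabel("Eigenvalue")
plt.title("Scree plot")
plt.show()

print("Factors by Kaiser's rule:", int(np.sum(eigvals > 1)))  # 3 here
```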

Factor Matrix

                         Factor 1   Factor 2   Factor 3
Information              .719      -.340       .065
Comprehension            .703       .005      -.107
Arithmetic               .552      -.180       .252
Similarities             .696      -.125      -.212
Vocabulary               .727      -.238       .005
Digit Span               .354      -.239       .146
Picture Completion       .504       .308      -.309
Paragraph Arrangement    .371       .234       .055
Block Design             .561       .549       .225
Object Assembly          .406       .382      -.167
Coding                   .075       .025       .285
Extraction Method: Maximum Likelihood.
3 factors extracted. 11 iterations required.

Factor Matrix - This table contains the un-rotated factor loadings, i.e., the correlations between
variables and factors, showing how the variables are weighted on, or load on, each extracted
factor. Because these are correlations, possible values range from -1 to +1. The columns under the
heading "Factor" are the un-rotated factors that have been extracted.

Rotated Factor Matrix

                         Factor 1   Factor 2   Factor 3
Information              .779       .156       .073
Comprehension            .551       .449      -.032
Arithmetic               .556       .140       .269
Similarities             .620       .366      -.160
Vocabulary               .721       .252       .035
Digit Span               .431      -.003       .134
Picture Completion       .202       .605      -.194
Paragraph Arrangement    .154       .392       .135
Block Design             .118       .713       .379
Object Assembly          .084       .573      -.050
Coding                   .054       .005       .290
Extraction Method: Maximum Likelihood.
Rotation Method: Varimax with Kaiser Normalization.
Rotation converged in 5 iterations.

The Rotated Factor Matrix shows us the factor loadings (the correlations between variables and
factors, indicating how the variables are weighted on each factor) for each variable, highlighting
the factor that each variable loads most strongly on (high positive loadings). Based on these factor
loadings, I have labelled the factors as follows:
1. The first 6 variables load highly and positively on Factor 1, which can be termed "Verbal IQ";
these are Information, Comprehension, Arithmetic, Similarities, Vocabulary and Digit Span.
2. The variables "Picture Completion, Paragraph Arrangement, Block Design, Object Assembly"
load strongly and positively on Factor 2, which can be termed "Performance IQ".
3. The variable "Coding" loads positively on Factor 3. Factor 3 is probably "Freedom from
Distraction", because these are concentration-intensive tasks. But the loadings on Factor 3 are all
less than 0.3, implying that the factor is no longer meaningful, so I prefer a 2-factor solution and
re-conduct the factor analysis with a pre-set 2-factor solution. A mechanical version of this
labelling logic is sketched below.
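A sketch, assuming the fitted fa object and items list from the earlier sketch, using the same 0.3 cutoff as above:

```python
import pandas as pd

loadings = pd.DataFrame(fa.loadings_, index=items,
                        columns=["F1", "F2", "F3"])

# Assign each variable to its strongest factor; below 0.3, call it unassigned.
for var, row in loadings.iterrows():
    if row.abs().max() < 0.3:
        print(f"{var}: no meaningful loading (all |l| < 0.3)")
    else:
        best = row.abs().idxmax()
        print(f"{var}: loads mainly on {best} ({row[best]:.3f})")
```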

Revised Output
It was important to know whether I could differentiate "verbal" from "nonverbal" tasks. I got a
3-factor solution based on Kaiser's rule (eigenvalues > 1), with the variables "Digit Span" and
"Coding" loading weakly (though positively) on Factor 3, creating some confusion, so I manually
forced SPSS to extract two factors, F1 and F2.
To achieve the pre-set number of factors, I went back to the main dialog and then to the
"Extraction" sub-dialog. Under "Extract," I inserted "Number of factors = 2" and clicked Continue
and then OK.
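In the scripted parallel from earlier, forcing a pre-set number of factors amounts to changing one argument (a sketch under the same assumptions):

```python
# Re-fit with the number of factors fixed at 2 instead of relying on
# the eigenvalue > 1 rule.
fa2 = FactorAnalyzer(n_factors=2, method="ml", rotation="varimax")
fa2.fit(df[items])
print(pd.DataFrame(fa2.loadings_, index=items, columns=["F1", "F2"]))
```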

The revised output, with a two-factor solution is given as under:

Total Variance Explained

Factor   Initial Eigenvalues           Rotation Sums of Squared Loadings
         Total   % of Var   Cum %     Total   % of Var   Cum %
1        3.829   34.806     34.806    2.355   21.409     21.409
2        1.442   13.109     47.915    1.765   16.050     37.458
3        1.116   10.147     58.062
4         .890    8.092     66.153
5         .768    6.985     73.138
6         .633    5.753     78.891
7         .595    5.412     84.303
8         .522    4.749     89.051
9         .471    4.281     93.332
10        .419    3.806     97.138
11        .315    2.862    100.000
Extraction Method: Maximum Likelihood.

The revised output has two extracted factors, which together account for 37.458% of the total
variability in the variables.

Scree Plot:

It is the same as before, along with the description given above.

Rotated Factor Matrix

                         Factor 1   Factor 2
Information              .783       .172
Comprehension            .534       .471
Arithmetic               .560       .153
Similarities             .584       .386
Vocabulary               .727       .255
Digit Span               .430       .022
Picture Completion       .176       .601
Paragraph Arrangement    .146       .407
Block Design             .168       .614
Object Assembly          .056       .610
Coding                   .069       .020
Extraction Method: Maximum Likelihood.
Rotation Method: Varimax with Kaiser Normalization.
Rotation converged in 3 iterations.

In the Rotated Factor Matrix we have the revised factor loadings (correlations between variables
and factors, indicating how the variables load on each factor). The variable "Coding" does not load
strongly on either of the extracted factors 1 or 2, but the two factors of "Verbal" and "Performance"
IQ have relatively high positive factor loadings and have thus emerged more strongly. These
factors can be used as variables for further analysis.

Factor Scores as two new variables:

Factor scores FAC1_1 and FAC2_1 (saved in SPSS via the "Scores" button by checking "Save as
variables" with the Regression method) are composite variables which provide information about
each case's placement on the factors. Once a researcher has used FA and has identified the number
of factors or components underlying a data set, he/she may wish to use the information about the
factors in subsequent analyses (Gorsuch, 1983). To use FA information in follow-up studies, the
researcher must create scores to represent each individual's placement on the factor(s) identified
from the FA. These factor scores may then be used to investigate the research questions of interest
(DiStefano, Zhu & Mindrila, 2009).
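In the Python parallel, regression-style factor scores can be computed with transform() and appended to the data as two new columns. This is a sketch: the column names FAC1_1 and FAC2_1 simply mimic SPSS's naming, and factor_analyzer's scores need not match SPSS's exactly.

```python
# Factor scores: one row per case, one column per retained factor.
scores = fa2.transform(df[items])
df["FAC1_1"], df["FAC2_1"] = scores[:, 0], scores[:, 1]
print(df[["FAC1_1", "FAC2_1"]].head())
```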

Conclusion:

We have 11 observable variables that identify two hidden factors, F1 and F2. The factor loadings
on hidden Factor 1 across the first six variables are 0.783, 0.534, 0.560, 0.584, 0.727, and 0.430.
These factor loadings indicate that observable measures 1 through 6 can be used to "describe"
hidden Factor 1; in other words, Factor 1 has characteristics very similar to what observable
measures 1 through 6 measure. Observable variables 7 through 11 are not useful for describing
hidden Factor 1 because their factor loadings on it are too small (below .50).
Similarly, the factor loadings on hidden Factor 2 across variables 7 through 10 are .601, .407,
.614, and .610. These factor loadings indicate that observable measures 7 through 10 can be used
to "describe" hidden Factor 2; in other words, Factor 2 has characteristics very similar to what
observable measures 7 through 10 measure. Observable variables 1 through 6 and 11 are not useful
for describing hidden Factor 2 because their factor loadings on hidden Factor 2 are too small
(below .50).

Factor analysis has thus identified "invisible" factors F1 and F2, which represent the hidden
organization or "organizing principle" of Verbal IQ and Performance IQ, measured with a number
of observable measures or scales (Navarro, 2006). The factor scores indicate how each "hidden"
factor (F1 and F2) is associated with the "observable" variables used in our analysis.

References:
Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research,
1(2), 245-276.
Comrey, A. L., & Lee, H. B. (1992). A first course in factor analysis (2nd ed.). Hillsdale, NJ:
Lawrence Erlbaum Associates.
DiStefano, C., Zhu, M., & Mindrila, D. (2009). Understanding and using factor scores:
Considerations for the applied researcher. Practical Assessment, Research & Evaluation, 14(20),
1-11.
Field, A. (2005). Discovering statistics using SPSS (2nd ed.). London: Sage.
Gorsuch, R. (1983). Factor analysis. Hillsdale, NJ: L. Erlbaum Associates.
Hutcheson, G. D., & Sofroniou, N. (1999). The multivariate social scientist: Introductory statistics
using generalized linear models. London: Sage.
Kaiser, H. F. (1974). An index of factorial simplicity. Psychometrika, 39(1), 31-36.
Tabachnick, B. G., & Fidell, L. S. (2001). Using multivariate statistics (4th ed.). Needham
Heights, MA: Allyn & Bacon.