
Factor Analysis

Basics
• Used for Data Reduction & Summarization
• Reduction to manageable level
• Interval or ratio scale
• Interdependence technique: makes no
distinction between independent & dependent
variables.
• Multivariate analysis
Assumptions
• Normality
• Factorability of correlation matrix – there
should be sufficient correlations in the data
matrix of variables to justify the application of
factor analysis.
Where can we use it?
• Market segmentation
• To determine the brand attributes
• Identify the characteristics of price-sensitive
consumers.
• Understand the media consumption habits of
the target market.
Premises
• Similar to multiple regression analysis, i.e. each
variable is expressed as a linear combination of
underlying factors.
• Factor model:
X_i = A_{i1} F_1 + A_{i2} F_2 + A_{i3} F_3 + ... + A_{im} F_m + V_i U_i
X_i = ith standardized variable
A_{ij} = standardized multiple regression coefficient of variable i on common factor j
F_j = common factor j
V_i = standardized regression coefficient of variable i on unique factor i
U_i = the unique factor for variable i
m = number of common factors

• The unique factors are uncorrelated with each
other and with the common factors.
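Common factors
• The common factors themselves can be expressed as
linear combinations of the observed variables:
F_i = W_{i1} X_1 + W_{i2} X_2 + W_{i3} X_3 + ... + W_{ik} X_k
F_i = estimate of ith factor
W_{ij} = weight or factor score coefficient of variable j on factor i
k = number of variables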

• It is possible to select weights or factor score
coefficients so that the first factor explains the
largest portion of the total variance.
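One way to see how such weights can be found is power iteration on the correlation matrix: the weight vector it converges to is the first principal component, and the matching eigenvalue is the variance that component explains. This is only a minimal sketch. The 3-variable correlation matrix and the function name `first_component` are assumptions for illustration, not part of the toothpaste example.

```python
import math

def first_component(corr, iterations=200):
    """Power iteration: return (eigenvalue, unit weight vector) for the
    dominant eigenvector of a symmetric correlation matrix."""
    p = len(corr)
    w = [1.0 / math.sqrt(p)] * p  # arbitrary unit-length starting weights
    for _ in range(iterations):
        v = [sum(corr[i][j] * w[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in v))
        w = [x / norm for x in v]
    # Rayleigh quotient w'Rw: variance explained by this weighted combination
    eigenvalue = sum(w[i] * corr[i][j] * w[j]
                     for i in range(p) for j in range(p))
    return eigenvalue, w

# Hypothetical correlation matrix for three standardized variables
R = [[1.0, 0.8, 0.1],
     [0.8, 1.0, 0.1],
     [0.1, 0.1, 1.0]]

eigenvalue, weights = first_component(R)
share = eigenvalue / len(R)  # total variance of p standardized variables is p
```

The two strongly correlated variables receive the largest weights, so the first component mostly summarizes what they share.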
Process
1. Formulate the problem
2. Construct the correlation matrix
3. Determine the method of factor analysis
4. Determine the number of factors
5. Rotate the factors
6. Interpret the factors
7. Determine the model fit


Formulate the Problem
• Includes several tasks.
• The objective of the factor analysis should be
identified.
• The variables should be specified properly.
• Variables: interval or ratio scale
• Sample size: about 10 times as many observations as there are variables.
Problem
• The researcher wants to determine the underlying benefits
consumers seek from the purchase of a
toothpaste.
• Sample: 30 respondents
• 7-point scale used (1 – strongly disagree, 7 –
strongly agree)
Questions
V1 It is important to buy a toothpaste that prevents cavities.
V2 I like a toothpaste that gives shiny teeth.
V3 A toothpaste should strengthen your gums.
V4 I prefer a toothpaste that freshens breath.
V5 Prevention of tooth decay is not an important benefit offered by a toothpaste.
V6 The most important consideration in buying a toothpaste is attractive teeth.
Factor analysis on SPSS
Construct the correlation matrix
• The analytical process is based on a matrix of
correlations between the variables.
• For factor analysis, the variables must be
correlated.
• If correlations between all the variables are
small, factor analysis may not be appropriate.
• The correlation table shows relatively high correlations among
V1 (prevention of cavities), V3 (strong gums), and V5
(prevention of tooth decay).
• Likewise, there are relatively high correlations among V2 (shiny
teeth), V4 (fresh breath), and V6 (attractive teeth).
• The expectation is that these variables will correlate with
the same set of factors.
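A correlation matrix of this kind can be computed directly from the raw ratings. The sketch below uses made-up ratings from five respondents (not the survey data) for three of the statements. The helper names `pearson` and `correlation_matrix` are illustrative.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Nested dict of pairwise correlations, keyed by variable name."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}

# Hypothetical 7-point ratings from five respondents (NOT the survey data)
ratings = {
    "V1": [7, 1, 6, 4, 1],
    "V3": [6, 3, 4, 4, 2],
    "V5": [1, 7, 2, 4, 7],  # reverse-worded item: here exactly 8 - V1
}

R = correlation_matrix(ratings)
```

Because V5 is worded in the opposite direction to V1, its correlation with V1 is strongly negative. It is the size of the correlation, not its sign, that signals a shared factor.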
Testing Appropriateness & Adequacy
BARTLETT’S TEST OF SPHERICITY
• H0: the variables are uncorrelated in the population, i.e. the population
correlation matrix is an identity matrix.
• If this hypothesis is not rejected, the appropriateness of factor
analysis should be questioned.

KAISER-MEYER-OLKIN (KMO) MEASURE OF SAMPLING ADEQUACY
• This index compares the magnitudes of the observed correlation
coefficients with the magnitudes of the partial correlation coefficients.
• Small values of KMO suggest that factor analysis may not be appropriate.
• A KMO value greater than 0.5 is desirable.
• The null hypothesis of Bartlett’s test is
rejected. The approx. chi-square is 106.309
with 15 degrees of freedom, which is
significant at the 0.05 level.
• The value of KMO is 0.659 (>0.5).
• Thus, factor analysis may be considered an
appropriate technique for analyzing the
correlation matrix.
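Both checks can be computed from the correlation matrix alone. The sketch below uses the standard textbook formulas: Bartlett's chi-square is -[(n - 1) - (2p + 5)/6] ln|R| with p(p - 1)/2 degrees of freedom, and KMO is the ratio of squared correlations to squared correlations plus squared partial correlations. The 3-variable matrix is hypothetical, and the p-value lookup is omitted. For six variables the formula gives 15 degrees of freedom, as in the output quoted above.

```python
import math

def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting: (inverse, determinant)."""
    n = len(matrix)
    a = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        scale = a[col][col]
        a[col] = [x / scale for x in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a], det

def bartlett_sphericity(corr, n_obs):
    """Chi-square statistic and degrees of freedom for H0: R is an identity."""
    p = len(corr)
    _, det = invert(corr)
    chi_square = -((n_obs - 1) - (2 * p + 5) / 6) * math.log(det)
    return chi_square, p * (p - 1) // 2

def kmo(corr):
    """Overall KMO measure, using off-diagonal elements only."""
    p = len(corr)
    inv, _ = invert(corr)
    r2 = q2 = 0.0
    for i in range(p):
        for j in range(p):
            if i != j:
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                r2 += corr[i][j] ** 2
                q2 += partial ** 2
    return r2 / (r2 + q2)

# Hypothetical 3-variable correlation matrix from 30 respondents
R = [[1.0, 0.5, 0.5],
     [0.5, 1.0, 0.5],
     [0.5, 0.5, 1.0]]
chi_square, df = bartlett_sphericity(R, n_obs=30)
adequacy = kmo(R)
```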
Determine the method
• The approach used to derive the weights or factor score
coefficients differentiates the various methods of factor
analysis.
• Two basic approaches, plus other methods:
1. Principal Component Analysis: Total variance in the data is
considered. It is recommended when the primary concern
is to determine the minimum number of factors that will
account for maximum variance in the data for use in
subsequent multivariate analysis.
2. Common Factor Analysis: Factors are estimated based
only on the common variance. This method is appropriate
when the primary concern is to identify the underlying
dimensions and the common variance is of interest.
3. Other approaches: unweighted least squares,
generalized least squares, maximum likelihood, image
factoring, and the alpha method.
• Communality is the amount of variance a
variable shares with all the other variables
being considered. This is also the proportion
of variance explained by the common factors.
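When the common factors are uncorrelated, the communality of a standardized variable is the sum of its squared loadings, and the remainder of its unit variance is unique. A minimal sketch with hypothetical loadings (the numbers are not from the SPSS output):

```python
def communalities(loadings):
    """Sum of squared loadings per variable (assumes uncorrelated factors)."""
    return {var: sum(a * a for a in row) for var, row in loadings.items()}

# Hypothetical loadings of three variables on two common factors
loadings = {
    "V1": [0.9, 0.1],
    "V2": [0.1, 0.8],
    "V3": [0.6, 0.6],
}

h2 = communalities(loadings)
# Remaining share of each standardized variable's variance is unique
uniqueness = {var: 1.0 - value for var, value in h2.items()}
```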
• The Eigenvalue column shows the eigenvalues. The eigenvalue for a
factor indicates the total variance attributed to that
factor.
• The total variance accounted for by all six factors is 6.00,
which equals the number of variables because the variables are standardized.
• Factor 1 accounts for a variance of 2.719, which is
(2.719/6) = 45.32% of the total variance, and so on.
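The percentage calculation can be written as a one-line helper. The eigenvalue 2.719 and the count of six variables come from the text. The function name is illustrative.

```python
def percent_of_variance(eigenvalue, n_variables):
    """Share of total variance for one factor. With standardized variables
    the total variance equals the number of variables."""
    return 100.0 * eigenvalue / n_variables

# Eigenvalue of Factor 1 and number of variables, as quoted in the text
pct_factor_1 = percent_of_variance(2.719, 6)
```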
• Several considerations are involved in
determining the number of factors.
Determine the number of factors
• It is possible to compute as many principal
components as there are variables, but in doing
so, no parsimony is gained.
• Small is good, but how many?
1. A priori determination.
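2. Based on eigenvalues (ignore factors with eigenvalues < 1)
3. Based on scree plot (look for the point where the steep slope levels off)
4. Based on percentage of variance (retained factors should account for at least 60% of the variance)
5. Based on split-half reliability (retain only factors with high correspondence across the two halves)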
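The eigenvalue and percentage-of-variance rules in the list above are easy to apply mechanically. The sketch below does so for a hypothetical set of six eigenvalues. The values are invented for illustration and only chosen to sum to 6.

```python
def factors_by_eigenvalue(eigenvalues, cutoff=1.0):
    """Count factors whose eigenvalue exceeds the cutoff."""
    return sum(1 for ev in eigenvalues if ev > cutoff)

def factors_by_cumulative_variance(eigenvalues, target=0.60):
    """Smallest number of leading factors whose cumulative share of the
    total variance reaches the target."""
    total = sum(eigenvalues)
    running = 0.0
    for count, ev in enumerate(sorted(eigenvalues, reverse=True), start=1):
        running += ev
        if running / total >= target:
            return count
    return len(eigenvalues)

# Hypothetical eigenvalues for six standardized variables (they sum to 6)
eigenvalues = [2.7, 2.2, 0.4, 0.3, 0.25, 0.15]

n_by_eigenvalue = factors_by_eigenvalue(eigenvalues)
n_by_variance = factors_by_cumulative_variance(eigenvalues)
```

The two rules need not agree in general. When they disagree, the scree plot and interpretability of the factors usually guide the final choice.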
Rotate Factors
• An important output from factor analysis is the factor
matrix.
• A factor matrix contains the coefficients used to express
the standardized variables in terms of the factors.
• These coefficients, the factor loadings, represent the
correlations between the factors and the variables.
• Before rotation, it is called the unrotated factor matrix.
• Although the unrotated factor matrix indicates
the relationship between the factors and
individual variables, it seldom results in
factors that can be interpreted, because the
factors are correlated with many variables.
• In rotating the factors, we expect each factor
to have non-zero, or significant, loadings or
coefficients for only some of the variables.
• Rotation does not affect the communalities
or the percentage of total variance
explained.
• The percentage of variance accounted for by each individual factor does change.
• The most commonly used rotation method is the VARIMAX
procedure.
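The effect of an orthogonal rotation can be checked numerically for two factors. The sketch below does not reproduce the iterative varimax algorithm used by statistical packages. As a simplifying assumption it searches a grid of rotation angles for the one that maximizes the raw varimax criterion (the variance of the squared loadings, summed over factors). The unrotated loadings are hypothetical.

```python
import math

def rotate(loadings, theta):
    """Orthogonal rotation of a two-factor loading matrix by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [(a * c + b * s, -a * s + b * c) for a, b in loadings]

def varimax_criterion(loadings):
    """Sum over the two factors of the variance of the squared loadings."""
    p = len(loadings)
    total = 0.0
    for k in range(2):
        squares = [row[k] ** 2 for row in loadings]
        mean = sum(squares) / p
        total += sum((x - mean) ** 2 for x in squares) / p
    return total

def best_rotation(loadings, steps=900):
    """Grid search over 0..90 degrees for the criterion-maximizing angle."""
    angles = [math.pi / 2 * i / steps for i in range(steps + 1)]
    return max(angles, key=lambda t: varimax_criterion(rotate(loadings, t)))

# Hypothetical unrotated loadings for six variables on two factors
unrotated = [(0.7, 0.6), (0.6, 0.6), (0.7, 0.5),
             (0.6, -0.6), (0.7, -0.6), (0.6, -0.5)]
theta = best_rotation(unrotated)
rotated = rotate(unrotated, theta)

# Communalities (row sums of squares) are unchanged by rotation ...
communality_before = [a * a + b * b for a, b in unrotated]
communality_after = [a * a + b * b for a, b in rotated]
# ... but the variance attributed to each factor is redistributed
variance_before = [sum(r[k] ** 2 for r in unrotated) for k in range(2)]
variance_after = [sum(r[k] ** 2 for r in rotated) for k in range(2)]
```

After rotation each variable loads mainly on one factor, which is what makes the rotated matrix easier to interpret.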
Interpret Factors
Factor 1 has high coefficients
for V1 (prevention of cavities)
and V3 (strong gums), and a
negative coefficient for V5
(prevention of tooth decay is
not important).
Thus, Factor 1 may be labeled Health
Benefits.

Factor 2 is highly related to variables V2 (shiny
teeth), V4 (fresh breath), and V6 (attractive
teeth). Factor 2 may be labeled Social Benefits.
Factor Loading Plot
• The factor loading plot confirms this
interpretation. Variables V1, V3, and V5 are at
the ends of the horizontal axis (factor 1), with V5 at
the end opposite to V1 and V3.
• Variables V2, V4, and V6 are at the end of the
vertical axis (factor 2).
• Thus, consumers appear to seek two major
benefits from a toothpaste:
Health Benefits & Social Benefits
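The grouping step used in this interpretation can be sketched as assigning each variable to the factor on which its absolute loading is largest. The rotated loadings below are hypothetical stand-ins, not the SPSS output, and `dominant_factor` is an illustrative name.

```python
def dominant_factor(loadings):
    """Map each factor number to the variables that load most heavily on it
    (largest absolute loading)."""
    groups = {}
    for var, row in loadings.items():
        factor = max(range(len(row)), key=lambda k: abs(row[k])) + 1
        groups.setdefault(factor, []).append(var)
    return groups

# Hypothetical rotated loadings (factor 1, factor 2); V5 is reverse-worded
rotated = {
    "V1": (0.95, -0.02), "V2": (-0.05, 0.85), "V3": (0.93, -0.15),
    "V4": (-0.09, 0.86), "V5": (-0.93, -0.08), "V6": (0.08, 0.88),
}

groups = dominant_factor(rotated)
```

Naming the factors still requires judgment about what the grouped statements have in common.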
Calculate Factor Scores
• Factor analysis has its own stand-alone value.
• If the goal of factor analysis is to reduce the original variables to a
smaller set of composite variables for subsequent multivariate
analysis, factor scores can be computed for each respondent.
• The factor scores can then be used instead of the
original variables in the subsequent multivariate
analysis.
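A factor score is the weighted sum of a respondent's standardized variable values, using the factor score coefficients as weights. The sketch below assumes hypothetical ratings and hypothetical coefficients. It standardizes with the population standard deviation, which is a simplifying assumption and may differ from what a statistical package does.

```python
import math

def standardize(values):
    """Convert raw ratings to z-scores (population standard deviation)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def factor_scores(columns, weights):
    """One score per respondent: weighted sum of standardized values."""
    z = [standardize(col) for col in columns]
    n = len(columns[0])
    return [sum(w * z[j][r] for j, w in enumerate(weights))
            for r in range(n)]

# Hypothetical ratings (one list per variable, four respondents) and
# hypothetical factor score coefficients for a single factor
columns = [[7, 1, 6, 4],   # V1
           [6, 3, 4, 3],   # V3
           [1, 7, 2, 6]]   # V5 (reverse-worded)
weights = [0.36, 0.35, -0.34]

scores = factor_scores(columns, weights)
```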
Select Surrogate Variables
• Used when the original variables are preferred over factors in subsequent analysis.
• By examining the factor matrix, one could select
for each factor the variable with the highest
loading on that factor.
• That variable could then be used as a surrogate
variable.
• This process works well if one factor loading for a
variable is clearly higher than all other factor
loadings.
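Selecting surrogates amounts to picking, for each factor, the variable with the largest loading in absolute value. A minimal sketch with a hypothetical rotated factor matrix:

```python
def surrogate_variables(loadings):
    """For each factor, the variable with the highest absolute loading."""
    n_factors = len(next(iter(loadings.values())))
    return {k + 1: max(loadings, key=lambda var: abs(loadings[var][k]))
            for k in range(n_factors)}

# Hypothetical rotated loadings (factor 1, factor 2)
rotated = {
    "V1": (0.95, -0.02), "V2": (-0.05, 0.85), "V3": (0.93, -0.15),
    "V4": (-0.09, 0.86), "V5": (-0.93, -0.08), "V6": (0.08, 0.88),
}

surrogates = surrogate_variables(rotated)
```

With loadings as close as 0.95 and 0.93 on the same factor, the choice between candidates is not clear-cut and may rest on theory or measurement quality.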
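Determine the model fit
• Residuals are the differences between the observed correlations
and the correlations reproduced from the factor matrix.
• If there are many large residuals, the factor
model does not provide a good fit.
• Here only 5 residuals are larger than
0.05, indicating an acceptable model fit.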
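The residual check can be sketched as follows, assuming uncorrelated factors so that the reproduced correlation between two variables is the sum of the products of their loadings. The observed correlations and loadings are hypothetical. The 0.05 threshold is the one used above.

```python
def reproduced_correlation(loadings, i, j):
    """Correlation implied by the factor model (uncorrelated factors)."""
    return sum(a * b for a, b in zip(loadings[i], loadings[j]))

def large_residuals(observed, loadings, threshold=0.05):
    """Off-diagonal pairs whose |observed - reproduced| exceeds threshold."""
    flagged = []
    p = len(observed)
    for i in range(p):
        for j in range(i + 1, p):
            residual = observed[i][j] - reproduced_correlation(loadings, i, j)
            if abs(residual) > threshold:
                flagged.append((i, j, round(residual, 3)))
    return flagged

# Hypothetical observed correlations and loadings for three variables
observed = [[1.00, 0.75, 0.10],
            [0.75, 1.00, 0.02],
            [0.10, 0.02, 1.00]]
loadings = [(0.9, 0.1), (0.8, 0.0), (0.1, 0.9)]

flagged = large_residuals(observed, loadings)
```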
Problem Formulation (Garment Company Example)
• The company wishes to assess the changing attitudes of its customers towards
a well-established product, in light of the presence of many competitors
in the market.
• The company has taken a list of 25 loyal customers and
administered a questionnaire to them. The questionnaire
consists of seven statements, each measured on a 9-point
rating scale with 1 as strongly disagree and 9 as strongly agree.
• The seven statements used in the survey are
as follows:
Seven statements
• X1: Price is a very important factor in purchasing.
• X2: For marginal difference in price, quality cannot be
compromised.
• X3: Quality is OK, but competitor’s price of the same product
cannot be ignored.
• X4: Quality products are having a high degree of durability.
• X5: With limited income, one can afford to spend only small
portion for cloth purchase.
• X6: In the present world of materialism and commercialization,
people are evaluated on the basis of good appearance.
• X7: By paying more if we can get good quality, why not to go for
it.
Perhaps the most
valuable result of all
education is the ability
to make yourself do the
thing you have to do,
when it ought to be
done, whether you like
it or not.
Thank you.
