
Multivariate Data Analysis Technique

Factor Analysis
Presented by:
Gurvinder kaur

Under the Guidance of:


Dr. Divya Malhan

Factor Analysis
Most popular interdependence technique.
Basic function is to identify groups of variables that
are related
Main purposes in Marketing & HR:
to identify underlying constructs in the data
Using Common Factor Analysis
to reduce the number of variables to a more
manageable subset
Using Principal Components Analysis

Marketing Research

Two Most Common Procedures


Common factor analysis
Used when we want to uncover the underlying
dimensions surrounding the original
variables (1st purpose)

Principal components analysis
Used when the objective is to reduce a large
set of variables into a smaller number of
factors (2nd purpose)
Example: Factor analysis of 14 variables to find how similar variables
group together, reducing those 14 variables to 4 factors (the
reduced variables are known as factors)


Factor, Loading, Eigen value


Factor: A variable (or construct) that is not
directly observable but needs to be inferred
from the input variables
Factor Loading: A correlation coefficient
showing the importance of each variable in
defining the factor
Eigenvalue: Represents the variance in the
original variables that is explained by a factor
(standardized so that the average variable has an
eigenvalue of 1.0)


Statistics Associated with Factor Analysis


Correlation matrix. A correlation matrix is a lower triangle matrix showing
the simple correlations, r, between all possible pairs of variables included in
the analysis. The diagonal elements, which are all 1, are usually omitted.
Communality. Communality is the amount of variance a variable shares
with all the other variables being considered. This is also the proportion of
variance explained by the common factors.
Low communalities indicate that the variable is independent and cannot be
combined with other variables.
Scree Plot: Plot of eigenvalues against the number of factors in order of
extraction.
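As a minimal sketch of how such a lower-triangle correlation matrix could be built, the snippet below computes the simple correlation r for every pair of variables. The `responses` data and the helper names `pearson_r` and `lower_triangle` are hypothetical and made up for illustration; they are not the survey data analysed in this presentation.

```python
from math import sqrt

def pearson_r(x, y):
    """Simple (Pearson) correlation between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def lower_triangle(data):
    """Return {(row, col): r} for every pair below the diagonal."""
    names = list(data)
    return {
        (names[i], names[j]): pearson_r(data[names[i]], data[names[j]])
        for i in range(len(names))
        for j in range(i)
    }

# Hypothetical 7-point ratings from five respondents (illustration only).
responses = {
    "V1": [7, 1, 6, 4, 2],
    "V2": [3, 3, 2, 5, 6],
    "V3": [6, 2, 7, 4, 1],
}

for (row, col), r in lower_triangle(responses).items():
    print(f"r({row}, {col}) = {r:.3f}")
```

The diagonal is skipped on purpose, since every variable correlates perfectly with itself.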

Factor Rotation
Factor analysis can generate several solutions
(loadings & factor scores) for any data set
each solution is called a factor rotation
each time the factors are rotated the pattern of
loadings changes
geometrically, rotation simply means that the axes
are rotated

Varimax rotation: each factor tends to load
high on some variables and low on others, which
makes factor interpretation easier
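To illustrate the geometric idea, here is a rough sketch for the two-factor case only. It turns the axes through a range of angles and keeps the angle that maximizes a raw varimax criterion (the variance of the squared loadings within each factor). The loadings are the unrotated component loadings from the SPSS output in this presentation; the function names and the brute-force search are assumptions made for illustration. Statistical packages use iterative algorithms and often normalize the loadings first, so their output need not match this sketch exactly.

```python
from math import cos, sin, radians

# Unrotated loadings (Component 1, Component 2) for the six toothpaste
# variables, copied from the SPSS component matrix in this presentation.
unrotated = [
    (0.928, 0.253), (-0.301, 0.795), (0.936, 0.131),
    (-0.342, 0.789), (-0.869, -0.351), (-0.177, 0.871),
]

def rotate(loadings, degrees):
    """Rotate the two factor axes by the given angle."""
    c, s = cos(radians(degrees)), sin(radians(degrees))
    return [(x * c + y * s, -x * s + y * c) for x, y in loadings]

def varimax_criterion(loadings):
    """Sum over both factors of the variance of the squared loadings."""
    p = len(loadings)
    total = 0.0
    for j in range(2):
        squared = [row[j] ** 2 for row in loadings]
        mean = sum(squared) / p
        total += sum((v - mean) ** 2 for v in squared) / p
    return total

# Brute-force search in 0.1 degree steps; the criterion repeats every 90 degrees.
best_angle = max((a / 10 for a in range(900)),
                 key=lambda a: varimax_criterion(rotate(unrotated, a)))
rotated = rotate(unrotated, best_angle)

print(f"best angle: {best_angle:.1f} degrees")
for (x, y), (rx, ry) in zip(unrotated, rotated):
    # x**2 + y**2 (the communality) is unchanged by the rotation.
    print(f"{x:6.3f} {y:6.3f}  ->  {rx:6.3f} {ry:6.3f}")
```

Because the rotation is orthogonal, only the pattern of loadings changes; the variance each variable shares with the two factors stays the same.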


Statistics Associated with Factor Analysis


Bartlett's test of sphericity:
Used to examine the hypothesis that
the variables are uncorrelated in the population.
In other words, the population
correlation matrix is an identity matrix: each
variable correlates perfectly with itself (r = 1) but has no correlation with
the other variables (r = 0).
Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy:
Index used to examine the appropriateness of factor analysis.
High values (between 0.5 and 1.0) indicate that factor analysis is appropriate.
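A minimal sketch of the Bartlett statistic follows, using the standard formula chi-square = -[(n - 1) - (2p + 5)/6] ln|R| with p(p - 1)/2 degrees of freedom, where R is the sample correlation matrix, p the number of variables and n the sample size. The 3 x 3 matrix and the function names are hypothetical, chosen only for illustration; the p-value and the KMO index are not computed here.

```python
from math import log

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def bartlett_sphericity(corr, n_obs):
    """Return (chi-square, degrees of freedom) for Bartlett's test."""
    p = len(corr)
    chi_square = -((n_obs - 1) - (2 * p + 5) / 6) * log(determinant(corr))
    df = p * (p - 1) // 2
    return chi_square, df

# Hypothetical correlation matrix for three variables (illustration only).
corr = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]
chi_square, df = bartlett_sphericity(corr, n_obs=30)
print(f"chi-square = {chi_square:.3f}, df = {df}")
```

For an identity matrix the determinant is 1, so the statistic is 0 and the hypothesis of uncorrelated variables cannot be rejected. With six variables the degrees of freedom are 6 x 5 / 2 = 15.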

Factor Solution: Orthogonal or Oblique

Factor solutions: A factor solution is simply the set of
factors that result from a factor analysis experiment.
Factor solutions fall into two major types:
Orthogonal factor solutions yield factors that are
statistically independent and can be used with other
statistical procedures that must satisfy assumptions of
statistical independence.
Oblique factor solutions are solutions for which factors
are correlated. They are usually a better reflection of the
underlying reality which the researcher seeks to describe.

General Steps to FA
Step 1: Selecting and measuring a set of variables in a given domain
Step 2: Data screening in order to prepare the correlation matrix
Step 3: Factor extraction
Step 4: Factor rotation to increase interpretability
Step 5: Interpretation
Further steps: Validation and reliability of the measures

Conducting Factor Analysis
Objective: to determine the benefits consumers seek from
the purchase of a toothpaste.
A sample of 30 respondents was interviewed
and asked to indicate their degree of
agreement with the following statements using
a 7-point scale
(1 = Strongly Disagree and 7 = Strongly Agree)

Lifestyle statements constructed for identifying
consumer opinions
V1: It is important to buy a toothpaste that prevents
cavities.
V2: I like a toothpaste that gives shiny teeth.
V3: A toothpaste should strengthen your gums.
V4: I prefer a toothpaste that freshens breath.
V5: Prevention of tooth decay is not an important
benefit offered by a toothpaste brand.
V6: The most important consideration in buying a
toothpaste is attractive teeth.

Correlation Matrix
Factor Analysis of Perceptual Variables

Variables   V1        V2        V3        V4        V5        V6
V1          1
V2          -0.53     1
V3          0.873     -0.155    1
V4          -0.086    0.572     -0.248    1
V5          -0.858    0.02      -0.778    -0.007    1
V6          0.004     0.64      -0.018    0.64      -0.136    1

Rotated Factor Matrix

Variables     Factor 1    Factor 2    h2
V1            0.962       -0.027      0.926173
V2            -0.057      0.848       0.722353
V3            0.934       -0.146      0.893672
V4            -0.098      0.845       0.723629
V5            -0.933      -0.084      0.877545
V6            0.083       0.885       0.790114
Eigenvalue    2.688031    2.245455

Factors representing Variables


Factor 1 has high coefficients for variables
V1 (prevention of cavities) and V3 (strong gums), and a high negative
coefficient for V5 (prevention of tooth decay is not important). Thus,
this factor may be labeled the health benefit factor.

Factor 2 has high coefficients for variables
V2 (shiny teeth), V4 (fresh breath), and V6 (attractive
teeth). Thus, this factor may be labeled the social benefit
factor.

Communality. Communality is the amount of variance a variable shares with
all the other variables being considered. This is also the proportion of
variance explained by the common factors.
h2 = (0.962)^2 + (-0.027)^2 = 0.926173, and so on.
Low communalities indicate that the variable is independent and cannot be
combined with other variables.
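The communality calculation can be sketched in a few lines. The loadings are taken from the Rotated Factor Matrix in this presentation; the dictionary layout and the function name `communality` are illustrative assumptions.

```python
# Rotated loadings (Factor 1, Factor 2) from the Rotated Factor Matrix.
rotated_loadings = {
    "V1": (0.962, -0.027),
    "V2": (-0.057, 0.848),
    "V3": (0.934, -0.146),
    "V4": (-0.098, 0.845),
    "V5": (-0.933, -0.084),
    "V6": (0.083, 0.885),
}

def communality(loadings):
    """h2: sum of the squared loadings of one variable across all factors."""
    return sum(value ** 2 for value in loadings)

communalities = {var: communality(row) for var, row in rotated_loadings.items()}

for var, h2 in communalities.items():
    print(f"{var}: h2 = {h2:.6f}")
```

Each value is a row-wise sum of squares, so a variable that loads weakly on every factor ends up with a low communality.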

Eigenvalue. The eigenvalue represents the total variance
explained by each factor. It is also used to determine the number
of factors to retain in the analysis: if a factor has an
eigenvalue > 1, the factor is retained for further interpretation.
Eigenvalue for Factor 1 = (0.962)^2 + (-0.057)^2 + ... + (0.083)^2 = 2.688
Factor loadings. Factor loadings are simple correlations
between the variables and the factors. These correlation
coefficients indicate the importance of each variable
in defining the factor. Loadings range from -1.0 to +1.0.
Factor 1 has a high correlation with V1, V3 and V5
and a low correlation with V2, V4 and V6.
Total Variance: percentage of variance explained, i.e. the share of the total original
variance of the 6 variables that is explained by the 2 factors.

C.V. = (sum of all h2 / number of variables) x 100
     = ((0.926173 + ... + 0.790114) / 6) x 100
     = (4.933486 / 6) x 100 = 82.22477%

Factor Variance:
Amount of variation explained by each
factor.
Factor 1: ((0.962)^2 + (-0.057)^2 + ... + (0.083)^2) / 6 = 2.688 / 6 = 0.448005, i.e. about 44.8%
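The eigenvalue, factor variance and total variance calculations can be sketched together as column-wise sums of squared loadings. The loadings come from the Rotated Factor Matrix in this presentation; the variable names are illustrative assumptions.

```python
# Rotated loadings (Factor 1, Factor 2) for V1..V6 from the Rotated Factor Matrix.
rotated = [
    (0.962, -0.027), (-0.057, 0.848), (0.934, -0.146),
    (-0.098, 0.845), (-0.933, -0.084), (0.083, 0.885),
]
n_variables = len(rotated)
n_factors = 2

# Eigenvalue of a factor: sum of its squared loadings over all variables.
eigenvalues = [sum(row[j] ** 2 for row in rotated) for j in range(n_factors)]

# Factor variance: share of the total variance (one unit per variable).
proportions = [ev / n_variables for ev in eigenvalues]

# Total variance explained by the retained factors, in percent.
total_pct = 100 * sum(proportions)

for j, (ev, prop) in enumerate(zip(eigenvalues, proportions), start=1):
    print(f"Factor {j}: eigenvalue = {ev:.6f}, variance = {100 * prop:.4f}%")
print(f"Total variance explained = {total_pct:.5f}%")
```

Summing the eigenvalues gives the same total as summing the communalities, because both add up every squared loading exactly once.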

Interpretation of Factor Analysis
Consumers appear to seek two major kinds of
benefits from a toothpaste: health benefits and
social benefits.
The two factors explain about 82.22477% of the total
variance of the 6 variables. The
remaining 17.7752% of variance indicates that
there are some other variables also influencing
consumer preferences.

Factor Analysis in SPSS: Results


KMO and Bartlett's Test

Kaiser-Meyer-Olkin Measure of Sampling Adequacy          .660
Bartlett's Test of Sphericity    Approx. Chi-Square      111.314
                                 df                      15
                                 Sig.                    .000

Here,
Null Hypothesis:
H0: The population correlation matrix is an identity matrix.

Bartlett's test of sphericity rejects this hypothesis:
the Chi-Square value is 111.314 with 15 degrees of freedom (df), which is
significant at the 0.05 level.

The value of KMO is 0.660, which is > 0.5. Therefore factor analysis is
an appropriate technique for analysing the correlation matrix.

Total Variance Explained

             Initial Eigenvalues                          Extraction Sums of Squared Loadings
Component    Total        % of Variance   Cumulative %    Total     % of Variance   Cumulative %
1            2.731        45.520          45.520          2.731     45.520          45.520
2            2.218        36.969          82.488          2.218     36.969          82.488
3            .442         7.360           89.848
4            .341         5.688           95.536
5            .183         3.044           98.580
6            8.521E-02    1.420           100.000

Extraction Method: Principal Component Analysis.
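As a small sketch of how this table can be read, the snippet below applies the eigenvalue > 1 rule to the initial eigenvalues and recomputes the percentage columns. The eigenvalues are the rounded totals from the table, so the recomputed percentages may differ from the SPSS figures in the last decimals; the variable names are illustrative assumptions.

```python
# Initial eigenvalues for components 1..6, as printed in the table (rounded).
initial_eigenvalues = [2.731, 2.218, 0.442, 0.341, 0.183, 8.521e-02]
n_variables = len(initial_eigenvalues)

# Eigenvalue > 1 rule: keep components that explain more than one variable's worth.
retained = [ev for ev in initial_eigenvalues if ev > 1.0]

# Recompute "% of Variance" and "Cumulative %" from the eigenvalues.
rows = []
cumulative = 0.0
for component, ev in enumerate(initial_eigenvalues, start=1):
    pct = 100 * ev / n_variables
    cumulative += pct
    rows.append((component, pct, cumulative))
    print(f"Component {component}: {pct:7.3f}%  cumulative {cumulative:7.3f}%")

print(f"Components retained: {len(retained)}")
```

With standardized variables the eigenvalues sum to the number of variables, which is why each percentage is simply the eigenvalue divided by 6.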


Component Matrix (a)

              Component
              1         2
VAR00001      .928      .253
VAR00002      -.301     .795
VAR00003      .936      .131
VAR00004      -.342     .789
VAR00005      -.869     -.351
VAR00006      -.177     .871

Extraction Method: Principal Component Analysis.
a. 2 components extracted.

THANK YOU

FA vs. PCA conceptually

FA produces factors; PCA produces
components
Factors cause variables; components
are aggregates of the variables

Conceptual FA and PCA

[Figure: two diagrams, one labeled FA and one labeled PCA, each with indicators I1, I2 and I3.]
