
Welcome to PowerPoint slides for

Chapter 12

Factor Analysis for Data Reduction


Marketing Research Text and Cases by Rajendra Nargundkar

Slide 1

Introduction

1. Factor Analysis is a set of techniques used for understanding variables by grouping them into factors consisting of similar variables.
2. It can also be used to confirm whether a hypothesized set of variables groups into a factor or not.
3. It is most useful when a large number of variables needs to be reduced to a smaller set of factors that contain most of the variance of the original variables.
4. Generally, Factor Analysis is done in two stages: Extraction of Factors, and Rotation of the Solution obtained in the first stage.
5. Factor Analysis is best performed with interval or ratio-scaled variables.

Slide 2

Application Areas/Example

1. In marketing research, a common application of Factor Analysis is to understand the underlying motives of consumers who buy a product category or a brand.
2. The worked example in the chapter helps clarify the use of Factor Analysis in marketing research.
3. In this example, we assume that a two-wheeler manufacturer is interested in determining which variables his potential customers think about when they consider his product.
4. Let us assume that twenty two-wheeler owners were surveyed by this manufacturer (or by a marketing research company on his behalf). They were asked to indicate on a seven-point scale (1 = Completely Agree, 7 = Completely Disagree) their agreement or disagreement with a set of ten statements relating to their perceptions and some attributes of the two-wheelers.
5. The objective of the Factor Analysis is to find underlying "factors", fewer than 10 in number, each of which is a linear combination of some of the original 10 variables.

Slide 3

The research design for data collection can be stated as follows: twenty 2-wheeler users were surveyed about their perceptions and image attributes of the vehicles they owned. Ten statements were put to each of them, all answered on a scale of 1 to 7 (1 = completely agree, 7 = completely disagree).

1. I use a 2-wheeler because it is affordable.
2. It gives me a sense of freedom to own a 2-wheeler.
3. Low maintenance cost makes a 2-wheeler very economical in the long run.
4. A 2-wheeler is essentially a man's vehicle.
5. I feel very powerful when I am on my 2-wheeler.
6. Some of my friends who don't have their own vehicle are jealous of me.
7. I feel good whenever I see the ad for a 2-wheeler on T.V., in a magazine or on a hoarding.
8. My vehicle gives me a comfortable ride.
9. I think 2-wheelers are a safe way to travel.
10. Three people should be legally allowed to travel on a 2-wheeler.

Slide 4

The input data, containing the responses of twenty respondents to the 10 statements, are in Appendix 1 in the form of a 20-row by 10-column matrix. They are reproduced below with one row per question and one column per respondent (S. No.).

                S. No.
Question No.    1   2   3   4   5   6   7   8   9  10
      1         1   2   2   5   1   3   2   4   2   1
      2         4   3   2   1   2   2   2   4   3   4
      3         1   2   2   4   2   3   5   3   2   2
      4         6   4   1   2   5   3   1   4   6   2
      5         5   3   2   2   4   3   2   4   5   1
      6         6   3   1   2   4   3   1   5   6   2
      7         5   3   1   2   4   3   2   3   5   1
      8         2   5   7   3   1   6   4   2   1   4
      9         3   5   6   2   1   5   4   3   4   4
     10         2   2   2   3   2   3   5   3   1   1

Table contd on next slide...

Slide 4 contd
                S. No.
Question No.   11  12  13  14  15  16  17  18  19  20
      1         1   1   3   2   2   5   1   2   3   4
      2         5   6   1   2   5   6   4   3   3   3
      3         1   1   4   2   1   3   2   1   2   2
      4         3   1   4   2   3   2   2   1   3   7
      5         2   1   4   2   2   1   1   2   4   6
      6         3   1   3   2   3   3   2   2   3   6
      7         2   1   3   2   2   2   1   2   4   6
      8         2   1   6   1   2   5   1   3   3   2
      9         2   2   5   3   1   5   1   2   3   3
     10         1   2   3   2   6   4   3   2   3   6
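Factor Analysis starts from the correlations among the variables, so a useful first look at the input data is the correlation matrix. The following is a minimal Python sketch (standard library only), assuming the responses are keyed in question by question in the order of the table above. The names `responses`, `pearson` and `corr` are illustrative and do not come from the chapter.

```python
from math import sqrt

# responses[q] holds the 20 answers (S. No. 1 to 20) to question q + 1,
# copied from the data table.
responses = [
    [1, 2, 2, 5, 1, 3, 2, 4, 2, 1, 1, 1, 3, 2, 2, 5, 1, 2, 3, 4],
    [4, 3, 2, 1, 2, 2, 2, 4, 3, 4, 5, 6, 1, 2, 5, 6, 4, 3, 3, 3],
    [1, 2, 2, 4, 2, 3, 5, 3, 2, 2, 1, 1, 4, 2, 1, 3, 2, 1, 2, 2],
    [6, 4, 1, 2, 5, 3, 1, 4, 6, 2, 3, 1, 4, 2, 3, 2, 2, 1, 3, 7],
    [5, 3, 2, 2, 4, 3, 2, 4, 5, 1, 2, 1, 4, 2, 2, 1, 1, 2, 4, 6],
    [6, 3, 1, 2, 4, 3, 1, 5, 6, 2, 3, 1, 3, 2, 3, 3, 2, 2, 3, 6],
    [5, 3, 1, 2, 4, 3, 2, 3, 5, 1, 2, 1, 3, 2, 2, 2, 1, 2, 4, 6],
    [2, 5, 7, 3, 1, 6, 4, 2, 1, 4, 2, 1, 6, 1, 2, 5, 1, 3, 3, 2],
    [3, 5, 6, 2, 1, 5, 4, 3, 4, 4, 2, 2, 5, 3, 1, 5, 1, 2, 3, 3],
    [2, 2, 2, 3, 2, 3, 5, 3, 1, 1, 1, 2, 3, 2, 6, 4, 3, 2, 3, 6],
]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# 10 x 10 correlation matrix of the questions.
corr = [[pearson(a, b) for b in responses] for a in responses]

# Print the block for questions 4 to 7 (list indices 3 to 6).
for i in range(3, 7):
    print(" ".join(f"{corr[i][j]:6.3f}" for j in range(3, 7)))
```

Questions 4 and 5, for instance, correlate at about 0.90 in these data, which is the kind of pattern that leads the analysis to group them into one factor.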

Slide 5

The data are subjected to Factor Analysis in two stages (although there are two stages, both outputs can be requested at the same time, at least in SPSS, by the process described in the SPSS Commands Appendix to the chapter).

1. In stage 1, we request the software package used (SPSS, Statistica, etc.) to EXTRACT factors with an eigenvalue of 1 or higher. The method requested is PRINCIPAL COMPONENTS. This gives us the output in Figs. 2 and 3.

Fig. 2: Factor Matrix (Unrotated)

Variable    Factor 1   Factor 2   Factor 3
VAR00001     .17581     .66967     .49301
VAR00002       ...     -.60774     .25369
VAR00003       ...      .81955     .21827
VAR00004     .96647    -.03627    -.09745
VAR00005     .95098     .16594    -.13593
VAR00006     .95184    -.08442    -.02522
VAR00007     .97128     .09591    -.04636
VAR00008       ...      .77498    -.03757
VAR00009       ...      .73502    -.48213
VAR00010     .16143     .31862    -.81356

Slide 6

Interpretation of the Output

1. The first step in interpreting the output is to look at the factors extracted, their eigenvalues and the cumulative percentage of variance (Fig. 3, reproduced below).

Fig. 3: Final Statistics

Variable    Communality  *  Factor  Eigenvalue  Pct of Var  Cum Pct
VAR00001      .72243     *    1      3.88282       38.8       38.8
VAR00002      .45214     *    2      2.77701       27.8       66.6
VAR00003      .73056     *    3      1.37475       13.7       80.3
VAR00004      .94488     *
VAR00005      .95038     *
VAR00006      .91376     *
VAR00007      .95474     *
VAR00008      .79869     *
VAR00009      .77745     *
VAR00010      .78946     *

Slide 6 contd...

1. We note that three factors have been extracted, based on our criterion that only factors with eigenvalues of 1 or more should be extracted. We see from the Cum. Pct. (Cumulative Percentage of Variance Explained) column in Fig. 3 that the three factors extracted together account for 80.3 percent of the total variance (the information contained in the original ten variables). This is a pretty good bargain, because we are able to economise on the number of variables (from 10 we have reduced them to 3 underlying factors), while we lose only about 20 percent of the information content (80 percent is retained by the 3 factors extracted out of the 10 original variables).
2. This represents a reasonably good solution for our problem.
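The percentages in Fig. 3 can be recomputed from the eigenvalues. The sketch below assumes, as is standard when factors are extracted from ten standardized variables, that the total variance equals the number of variables (10), so each factor's share is its eigenvalue divided by 10. The function name `variance_explained` is illustrative.

```python
# Eigenvalues of the three retained factors, as reported in Fig. 3.
eigenvalues = [3.88282, 2.77701, 1.37475]
n_variables = 10  # assumed total variance: one unit per standardized variable


def variance_explained(eigenvalues, n_variables):
    """Return (factor, eigenvalue, pct of variance, cumulative pct) rows."""
    rows = []
    cumulative = 0.0
    for factor, eigenvalue in enumerate(eigenvalues, start=1):
        pct = 100.0 * eigenvalue / n_variables
        cumulative += pct
        rows.append((factor, eigenvalue, round(pct, 1), round(cumulative, 1)))
    return rows


# Extraction criterion used in the slides: keep eigenvalues of 1 or more.
retained = [ev for ev in eigenvalues if ev >= 1.0]

table = variance_explained(retained, n_variables)
for factor, eigenvalue, pct, cum_pct in table:
    print(f"Factor {factor}: eigenvalue {eigenvalue:.5f}, {pct}% of variance, {cum_pct}% cumulative")
```

The last row gives the 80.3 percent quoted above; the remaining 19.7 percent is the information given up by dropping the other seven components.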

Slide 7

1. Now, we try to interpret what these 3 extracted factors represent. This we can accomplish by looking at Figs. 4 and 2, the rotated and unrotated factor matrices.

Fig. 4: Rotated Factor Matrix

Variable    Factor 1   Factor 2   Factor 3
VAR00001     .13402     .34749     .76402
VAR00002    -.18143    -.64300    -.07596
VAR00003    -.10944     .62985     .56742
VAR00004     .96986    -.06383    -.01338
VAR00005     .96455     .13362     .04660
VAR00006     .94544    -.13868     .02600
VAR00007     .97214     .02862     .09411
VAR00008    -.26169     .85203     .06517
VAR00009     .00891     .87772    -.08347
VAR00010     .07209    -.10990     .87874

Slide 7 contd...
1. Looking at Fig. 4, the rotated factor matrix, we notice that variables 4, 5, 6 and 7 have loadings of 0.96986, 0.96455, 0.94544 and 0.97214 on factor 1 (we look down the Factor 1 column in Fig. 4 for high loadings close to 1.00). This suggests that factor 1 is a combination of these four original variables. Fig. 2 also suggests a similar grouping. Therefore, there is no problem interpreting factor 1 as a combination of "a man's vehicle" (statement in variable 4), "feeling of power" (variable 5), "others are jealous of me" (variable 6) and "feel good when I see my 2-wheeler ads" (variable 7).
2. At this point, the researcher's task is to find a suitable phrase which captures the essence of the original variables that form the underlying concept or factor. In this case, factor 1 could be named "male ego", "machismo", "pride of ownership" or something similar. With the same mathematical output, the interpretations of different researchers may differ.

Slide 8

1. Now we will attempt to interpret factor 2. We look in Fig. 4, down the column for Factor 2, and find that variables 8 and 9 have high loadings of 0.85203 and 0.87772, respectively. This indicates that factor 2 is a combination of these two variables.
2. But if we look at Fig. 2, the unrotated factor matrix, a slightly different picture emerges. Here, variable 3 also has a high loading on factor 2, along with variables 8 and 9. It is left to the researcher which interpretation to use, as there are no hard and fast rules. Assuming we decide to use all three variables, the related statements are low maintenance, comfort and safety (from statements 3, 8 and 9). We may combine these variables into a factor called "utility" or "functional features", or any other similar word or phrase which captures the essence of these three statements/variables.

Slide 8 contd...

3. For interpreting factor 3, we look at the column labelled Factor 3 in Fig. 4 and find that variables 1 and 10 load high on factor 3. According to the unrotated factor matrix of Fig. 2, only variable 10 loads high on factor 3. Supposing we stick to Fig. 4, the combination of affordability and the cost saving from 3 people legally riding on a 2-wheeler gives the impression that factor 3 could be "economy" or "low cost".
4. We have now completed the interpretation of the 3 factors with eigenvalues of 1 or more. We will now look at some additional issues which may be of importance in using factor analysis.
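The reading of Fig. 4 described in Slides 7 and 8 can be sketched in Python: each variable is assigned to the factor on which its absolute rotated loading is largest, provided that loading reaches a cutoff. The cutoff of 0.7 and the names `rotated` and `group_by_factor` are assumptions made for illustration; the slides do not prescribe a fixed threshold.

```python
CUTOFF = 0.7  # assumed threshold for a "high" loading; not fixed by the slides

# Rotated factor matrix from Fig. 4: (Factor 1, Factor 2, Factor 3).
rotated = {
    "VAR00001": (0.13402, 0.34749, 0.76402),
    "VAR00002": (-0.18143, -0.64300, -0.07596),
    "VAR00003": (-0.10944, 0.62985, 0.56742),
    "VAR00004": (0.96986, -0.06383, -0.01338),
    "VAR00005": (0.96455, 0.13362, 0.04660),
    "VAR00006": (0.94544, -0.13868, 0.02600),
    "VAR00007": (0.97214, 0.02862, 0.09411),
    "VAR00008": (-0.26169, 0.85203, 0.06517),
    "VAR00009": (0.00891, 0.87772, -0.08347),
    "VAR00010": (0.07209, -0.10990, 0.87874),
}


def group_by_factor(loadings, cutoff):
    """Assign each variable to its highest-|loading| factor if it meets the cutoff."""
    groups = {}
    unassigned = []
    for variable, row in loadings.items():
        best = max(range(len(row)), key=lambda j: abs(row[j]))
        if abs(row[best]) >= cutoff:
            groups.setdefault(best + 1, []).append(variable)
        else:
            unassigned.append(variable)
    return groups, unassigned


groups, unassigned = group_by_factor(rotated, CUTOFF)
for factor in sorted(groups):
    print(f"Factor {factor}: {groups[factor]}")
print("Below cutoff:", unassigned)
```

With this cutoff the grouping matches the interpretation in the slides: variables 4 to 7 on factor 1, variables 8 and 9 on factor 2, and variables 1 and 10 on factor 3. Variables 2 and 3 stay unassigned; lowering the cutoff to 0.6 would place both on factor 2 (variable 2 with a negative loading), which shows how much the result depends on the researcher's choice.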

Slide 9

Additional Issues in Interpreting Solutions

1. We must guard against the possibility that a variable may load highly on more than one factor. Strictly speaking, a variable should load close to 1.00 on one and only one factor, and load close to 0 on the other factors. If this is not the case, it indicates that either the sample of respondents has more than one opinion about the variable, or that the question/variable may be unclear in its phrasing.
2. The other issue important in the practical use of factor analysis is the question of what should be considered a high loading and what should not. Here, unfortunately, there is no clear-cut guideline, and many a time we must look at relative values in the factor matrix. Sometimes 0.7 may be treated as a high value, while sometimes 0.9 could be the cutoff for high values.
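A check for variables that load on more than one factor can be sketched as follows. The cutoff of 0.5 for a "sizeable" loading and the name `cross_loaded` are assumptions for illustration only; three rows of Fig. 4 are used as sample input.

```python
def cross_loaded(loadings, cutoff):
    """Return variables whose absolute loading reaches the cutoff on two or more factors."""
    flagged = {}
    for variable, row in loadings.items():
        hits = [j + 1 for j, value in enumerate(row) if abs(value) >= cutoff]
        if len(hits) > 1:
            flagged[variable] = hits
    return flagged


# Three rows of the rotated factor matrix (Fig. 4): (Factor 1, Factor 2, Factor 3).
sample = {
    "VAR00001": (0.13402, 0.34749, 0.76402),
    "VAR00003": (-0.10944, 0.62985, 0.56742),
    "VAR00004": (0.96986, -0.06383, -0.01338),
}

flagged = cross_loaded(sample, 0.5)
print(flagged)
```

Under this assumed cutoff, variable 3 (low maintenance cost) is flagged on factors 2 and 3, which is consistent with the difficulty of placing it noted in the interpretation of factor 2.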

Slide 9 contd...

Additional Issues (Contd.)

1. The proportion of variance in any one of the original variables which is captured by the extracted factors is known as Communality. For example, Fig. 3 tells us that after 3 factors were extracted and retained, the communality is 0.72243 for variable 1, 0.45214 for variable 2 and so on (from the column labelled Communality in Fig. 3). This means that 0.72243, or 72.24 percent, of the variance (information content) of variable 1 is being captured by our 3 extracted factors together. Variable 2 exhibits a low communality value of 0.45214. This implies that only 45.214 percent of the variance in variable 2 is captured by our extracted factors. This may also partially explain why variable 2 does not appear in our final interpretation of the factors (in the earlier section). It is possible that variable 2 is an independent variable which does not combine well with any other variable, and therefore should be investigated separately. "Freedom" could be a different concept in the minds of our target audience.
2. As a final comment, it is again the author's recommendation that we use the rotated factor matrix (rather than the unrotated factor matrix) for interpreting factors, particularly when we use the principal components method for extraction of factors in stage 1.
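The communalities in Fig. 3 can be cross-checked against the loadings: a variable's communality is the sum of its squared loadings on the retained factors. The sketch below uses the rotated loadings of Fig. 4 and compares the result with the values reported in Fig. 3; the names `rotated`, `reported` and `communality` are illustrative.

```python
# Rotated factor matrix from Fig. 4: (Factor 1, Factor 2, Factor 3).
rotated = {
    "VAR00001": (0.13402, 0.34749, 0.76402),
    "VAR00002": (-0.18143, -0.64300, -0.07596),
    "VAR00003": (-0.10944, 0.62985, 0.56742),
    "VAR00004": (0.96986, -0.06383, -0.01338),
    "VAR00005": (0.96455, 0.13362, 0.04660),
    "VAR00006": (0.94544, -0.13868, 0.02600),
    "VAR00007": (0.97214, 0.02862, 0.09411),
    "VAR00008": (-0.26169, 0.85203, 0.06517),
    "VAR00009": (0.00891, 0.87772, -0.08347),
    "VAR00010": (0.07209, -0.10990, 0.87874),
}

# Communalities as reported in Fig. 3.
reported = {
    "VAR00001": 0.72243, "VAR00002": 0.45214, "VAR00003": 0.73056,
    "VAR00004": 0.94488, "VAR00005": 0.95038, "VAR00006": 0.91376,
    "VAR00007": 0.95474, "VAR00008": 0.79869, "VAR00009": 0.77745,
    "VAR00010": 0.78946,
}


def communality(row):
    """Sum of squared loadings of one variable across the retained factors."""
    return sum(value * value for value in row)


computed = {variable: communality(row) for variable, row in rotated.items()}
for variable in rotated:
    print(f"{variable}: computed {computed[variable]:.5f}, reported {reported[variable]:.5f}")
```

The computed and reported values agree to about three decimal places, the small differences being rounding in the published loadings. For variable 2 the three squared loadings add up to only about 0.45, which is the low communality discussed above.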
