The dependent variable has more than two values. The amount spent on a family vacation can be high, medium or low, so this is a three-group discriminant analysis.
The question of interest is whether households that spend high, medium or low amounts on their vacations can be differentiated in terms of:
Annual family income
Attitude towards travel
Importance attached to family vacation
Household size
Age of the head of the household
Group Means
Amount    Income    Travel    Vacation    Hsize    Age
1         38.57     4.5       4.7         3.1      50.3
2         50.11     4         4.2         3.4      49.5
3         64.97     6.1       5.9         4.2      56
Total     51.22     4.87      4.93        3.57     51.93
Group Standard Deviations
Amount    Income    Travel    Vacation    Hsize    Age
1         5.3       1.72      1.89        1.2      8.1
2         6         2.36      2.49        1.51     9.25
3         8.61      1.2       1.66        1.14     7.6
Total     12.8      1.98      2.1         1.33     8.57
The group means indicate that income separates the 3 groups more widely than any other variable. There is some differentiation on travel and vacation, with group 3 being fairly high on both. Groups 1 and 2 are very close on household size and age. Age has a large standard deviation relative to the separation between the groups.
Pooled within-groups correlation matrix
            Income    Travel    Vacation    Hsize       Age
Income      1
Travel      0.0512    1
Vacation    0.3068    0.036     1
Hsize       0.3805    0.005     0.2208      1
Age         -0.209    -0.34     -0.01326    -0.02512    1
There is some correlation between Hsize and Income, and between Vacation and Income. Age has some negative correlation with Travel. These correlations are not very high, however, so multicollinearity is not a concern.
Wilks' Lambda and Univariate F ratio with 2 and 27 degrees of freedom
Variable    Wilks' Lambda
Income      0.26
Travel      0.79
Vacation    0.88
Hsize       0.87
Age         0.88
The univariate F ratios indicate that, when the predictors are considered individually, only income and travel are significant in differentiating between the three groups.
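The link between the Wilks' Lambda values and the univariate F ratios can be sketched as follows. For a single predictor, Lambda is the ratio of within-groups to total sum of squares, so F = ((1 - Lambda) / (G - 1)) / (Lambda / (n - G)). The pairing of each Lambda with a predictor name assumes the same predictor order used elsewhere in these slides, n = 30 follows from the 2 and 27 degrees of freedom, and 3.35 is the approximate tabulated critical value of F(2, 27) at the 0.05 level.

```python
# Sketch: recover univariate F ratios from the rounded Wilks' Lambda values.
# Assumption: Lambdas are listed in the order Income, Travel, Vacation, Hsize, Age.

def f_from_wilks(lam, n_groups, n_cases):
    """Univariate F ratio implied by Wilks' Lambda for one predictor."""
    df_between = n_groups - 1        # 2 for three groups
    df_within = n_cases - n_groups   # 27 for 30 cases
    return ((1 - lam) / df_between) / (lam / df_within)

wilks = {"Income": 0.26, "Travel": 0.79, "Vacation": 0.88, "Hsize": 0.87, "Age": 0.88}
F_CRIT_05 = 3.35  # approximate tabulated F(0.05; 2, 27)

f_ratios = {name: f_from_wilks(lam, 3, 30) for name, lam in wilks.items()}
significant = [name for name, f in f_ratios.items() if f > F_CRIT_05]

for name, f in f_ratios.items():
    flag = "significant" if f > F_CRIT_05 else "not significant"
    print(f"{name:9s} F = {f:6.2f}  ({flag} at 0.05)")
```

Because the Lambdas are rounded to two decimals, the F ratios are approximate, but the conclusion (only income and travel exceed the critical value) does not depend on the rounding.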
In multiple discriminant analysis, if there are G groups, G-1 discriminant functions can be estimated, provided the number of predictors is at least G-1.
In general, with G groups and k predictors, it is possible to estimate up to the smaller of G-1 or k discriminant functions.
The first function has the highest ratio of between-groups to within-groups sum of squares.
The second function, uncorrelated with the first, has the second highest ratio, and so on.
Not all of the functions need be statistically significant.
Canonical Discriminant Functions
Function    Eigenvalue    Percent of Variance    Cumulative Percent    Canonical Correlation
1           3.82          93.93                  93.93                 0.89
2           0.25          6.07                   100                   0.45
Since there are G=3 groups and k=5 predictor variables, the number of discriminant functions is min(G-1, k) = min(2, 5) = 2.
The eigenvalue associated with the first function is 3.82, and this function accounts for 93.93% of the explained variance. Because its eigenvalue is so much larger than that of the second function (0.25), function 1 is likely to be the superior discriminator.
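The columns of the canonical discriminant functions table are linked by two standard identities, sketched below: each function's percent of variance is its eigenvalue divided by the sum of the eigenvalues, and its canonical correlation is sqrt(eigenvalue / (1 + eigenvalue)). The inputs are the rounded eigenvalues from the table, so the percentages differ slightly from the reported 93.93 and 6.07.

```python
import math

# Rounded eigenvalues from the canonical discriminant functions table.
eigenvalues = [3.82, 0.25]

# Number of functions that can be estimated: min(G - 1, k).
n_groups, n_predictors = 3, 5
n_functions = min(n_groups - 1, n_predictors)

total = sum(eigenvalues)
pct_variance = [100 * ev / total for ev in eigenvalues]
canon_corr = [math.sqrt(ev / (1 + ev)) for ev in eigenvalues]

for i in range(n_functions):
    print(f"Function {i + 1}: {pct_variance[i]:.2f}% of variance, "
          f"canonical correlation {canon_corr[i]:.2f}")
```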
After Function    DF    Sig.
0                 10    0
1                 4     0.24
The row "After Function 0" tests the significance of the two functions together, whereas the row "After Function 1" tests function 2 alone, after function 1 has been removed.
Thus, the two functions together significantly differentiate between the three groups. However, when the first function is removed, the second function is not significant at the 0.05 level. Therefore, the second function does not contribute significantly to the group differences
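The degrees of freedom and significance levels in this test can be approximated from the eigenvalues alone. The sketch below assumes the test is Bartlett's chi-square approximation (the slides do not name it): after removing the first k functions, Wilks' Lambda is the product of 1 / (1 + eigenvalue) over the remaining functions, chi-square = -(n - 1 - (p + G) / 2) ln(Lambda), and df = (p - k)(G - k - 1). It uses n = 30 cases, p = 5 predictors, G = 3 groups and the rounded eigenvalues, so the results are approximate.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form holds only for even df."""
    if df % 2 != 0:
        raise ValueError("closed form requires even degrees of freedom")
    half = x / 2
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

n, p, G = 30, 5, 3
eigenvalues = [3.82, 0.25]  # rounded values from the slides

results = []
for k in range(len(eigenvalues)):
    lam = 1.0
    for ev in eigenvalues[k:]:       # functions remaining after removing the first k
        lam /= (1 + ev)
    chi2 = -(n - 1 - (p + G) / 2) * math.log(lam)
    df = (p - k) * (G - k - 1)
    results.append({"after": k, "lambda": lam, "chi2": chi2,
                    "df": df, "sig": chi2_sf_even_df(chi2, df)})

for r in results:
    print(f"After function {r['after']}: Lambda = {r['lambda']:.3f}, "
          f"chi-square = {r['chi2']:.2f}, df = {r['df']}, sig = {r['sig']:.4f}")
```

The degrees of freedom come out as 10 and 4, and the second significance level falls close to the reported 0.24, which is consistent with the conclusion that function 2 is not significant once function 1 is removed.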
Standardised Canonical Discriminant Function Coefficients
            Func1       Func2
Income      1.0474      -0.42076
Travel      0.33991     0.76851
Vacation    -0.14198    0.53354
Hsize       -0.16317    0.12932
Age         0.49474     0.52447

Pooled within-groups correlations
            Func1       Func2
Income      0.85556     -0.27833
Hsize       0.19319     0.07749
Vacation    0.21935     0.58829
Travel      0.14899     0.45362
Age         0.16576     0.34079
The standardised coefficients show a large coefficient for income on func1, whereas travel, vacation and age have large coefficients on func2.
Similarly, the correlation matrix shows that income and hsize correlate more highly with func1 than with func2.
Vacation, travel and age correlate more highly with func2 than with func1.
Group Centroids
Group    Func1       Func2
1        -2.041      0.41847
2        -0.40479    -0.65867
3        2.44578     0.2402
Group 3 has the highest value on function 1. Since function 1 is primarily associated with income and hsize, group 3 consists of households with higher income and larger household size. Group 1 is highest on function 2 and group 2 is lowest, so function 2 mainly separates these two groups. Since function 2 is primarily associated with travel, vacation and age, group 1 will be higher than group 2 on these variables.
Unstandardised Canonical Discriminant Function Coefficients
            Func1        Func2
Income      0.15427      -0.06197
Travel      0.18680      0.42234
Vacation    -0.06952     0.26127
Hsize       -0.12653     0.10028
Age         0.05928      0.06284
Constant    -11.09442    -3.79160
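As a consistency check on the unstandardised coefficients, the sketch below evaluates both discriminant functions at each group's mean vector. Because the functions are linear, the scores at the group means should reproduce the group centroids, up to rounding in the reported means. All numbers are taken from the tables in these slides; the variable and function names are illustrative only.

```python
# Sketch: discriminant scores at the group means should match the group centroids.
PREDICTORS = ["Income", "Travel", "Vacation", "Hsize", "Age"]

# Unstandardised coefficients and constants for the two functions.
coef = {
    "Func1": ([0.15427, 0.18680, -0.06952, -0.12653, 0.05928], -11.09442),
    "Func2": ([-0.06197, 0.42234, 0.26127, 0.10028, 0.06284], -3.79160),
}

# Group means in the same predictor order.
group_means = {
    1: [38.57, 4.5, 4.7, 3.1, 50.3],
    2: [50.11, 4.0, 4.2, 3.4, 49.5],
    3: [64.97, 6.1, 5.9, 4.2, 56.0],
}

# Reported centroids (Func1, Func2) for comparison.
centroids = {1: (-2.041, 0.41847), 2: (-0.40479, -0.65867), 3: (2.44578, 0.2402)}

def score(x, func):
    """Discriminant score: constant plus the weighted sum of predictor values."""
    weights, constant = coef[func]
    return constant + sum(w * v for w, v in zip(weights, x))

scores = {g: (score(m, "Func1"), score(m, "Func2")) for g, m in group_means.items()}

for g in sorted(scores):
    s1, s2 = scores[g]
    c1, c2 = centroids[g]
    print(f"Group {g}: Func1 {s1:8.4f} (centroid {c1}), Func2 {s2:8.4f} (centroid {c2})")
```

The same `score` function can be applied to an individual household's values, after which the household is typically assigned to the group whose centroid is nearest.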
Classification Results

Analysis sample (hit ratio 86.70%)
Actual group    Predicted group 3 (count)    Total    Predicted group 3 (%)
1               0                            10       0
2               0                            10       0
3               8                            10       80
Total           8                            30

Holdout sample (hit ratio 75%)
Actual group    Predicted group 3 (count)    Total    Predicted group 3 (%)
1               0                            4        0
2               1                            4        25
3               3                            4        75
Total           4                            12
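There are three groups of equal size, so by chance alone one would expect a hit ratio of 1/3 = 33.3%. Both the analysis-sample hit ratio (86.7%) and the holdout hit ratio (75%) are a large improvement over chance, which supports the validity of the discriminant analysis.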
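The chance benchmark can be written out explicitly. The sketch below uses the proportional chance criterion (the sum of squared group proportions), which reduces to 1/3 when the three groups are of equal size, and compares it with the two reported hit ratios. The implied counts of correctly classified cases are derived from the hit ratios and sample sizes, not read from the classification table.

```python
# Sketch: compare the reported hit ratios with the proportional chance criterion.

def proportional_chance(group_sizes):
    """Expected hit ratio from chance alone: sum of squared group proportions."""
    total = sum(group_sizes)
    return sum((n / total) ** 2 for n in group_sizes)

samples = {
    "analysis": {"sizes": [10, 10, 10], "hit_ratio": 0.867},
    "holdout": {"sizes": [4, 4, 4], "hit_ratio": 0.75},
}

for name, s in samples.items():
    s["chance"] = proportional_chance(s["sizes"])
    s["implied_correct"] = round(s["hit_ratio"] * sum(s["sizes"]))
    print(f"{name}: hit ratio {s['hit_ratio']:.1%}, chance {s['chance']:.1%}, "
          f"about {s['implied_correct']} of {sum(s['sizes'])} cases correct")
```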
Example1
A recent survey asked business people about their concerns over hiring and retaining employees during the current harsh economic environment.
If an organisation wants to retain its employees, it must learn why some people leave their jobs and why others stay and are satisfied with their jobs.
Discriminant analysis was used to determine what factors explained the differences between salespeople who left a large computer manufacturing company and those who stayed.
Example2
Independent variables were:
Company rating
Job security
Seven job satisfaction dimensions
Four role-conflict dimensions
Four role-ambiguity dimensions
Nine measures of sales performance
The dependent variable was dichotomous: those who stayed and those who left.
The canonical correlation, an index of discrimination (R = 0.4572), was significant (p = 0.0180).
The results indicated that the variables discriminated between those who left and those who stayed.
Discriminant Analysis Results
      Variable              Coefficient    Standardised Coefficient    Structure Correlation
1     Work                  0.0903         0.391                       0.5446
2     Promotion             0.0288         0.1515                      0.5044
3     Job Security          0.1567         0.1384                      0.4958
4     Customer Relations    0.0086         0.1751                      0.4906
5     Company Rating        0.4059         0.324                       0.4824

Variables 6 to 26, in order: 6 Working with others, 7 Overall performance, 8 Time-territory management, 9 Sales produced, 10 Presentation skill, 11 Technical Information, 12 Pay-benefits, 13 Quota achieved, 14 Management, 15 Information collection, 16 Family, 17 Sales manager, 18 Coworker, 19 Customer, 20 Family, 21 Job, 22 Job, 23 Customer, 24 Sales manager, 25 Sales manager, 26 Customer
Characteristic profile1
In the example, based on the structure correlations, Promotion was identified as the second most important variable. However, based on the standardised discriminant function coefficients, Promotion is not the second most important variable.
The anomaly arises because of multicollinearity.
In such cases, develop a characteristic profile for each group by describing each group in terms of the group means for the predictor variables.
Characteristic profile2
                    Promotion    Company Rating
Those who stayed    4.5          4
Those who left      2.3          3.83
Overall             3.42         3.92
Clearly, promotion discriminates between the two groups better than company rating does: the group means differ by 2.2 on promotion but by only 0.17 on company rating. Those who stayed with the company are satisfied with the promotions.
Analyse>Classify>Discriminant
1. Select Analyse from the SPSS menu bar.
2. Click Classify and then Discriminant.
3. Move the criterion variable into the Grouping Variable box: Taken Vacation in the 1st example; Amt spent on vacation in the 2nd example.
4. Enter the group codes. 1st example: 1 = taken vacation in the last 2 years, 2 = the rest. 2nd example: 1 = low spenders, 2 = medium spenders, 3 = high spenders.
5. Move Income, Travel, Vacation, Hsize and Age into the Independents box.
6. Select Enter Independents Together (default option).
7. Click Statistics. In the pop-up window, in the Descriptives box check Means and Univariate ANOVAs. In the Matrices box check Within-Groups Correlation. Click Continue.
8. Click Classify. In the Display box check Summary Table. In the Use Covariance Matrix box check Within-Groups. Click Continue.
9. Click OK.
Classroom Problem1
Data on Nike was obtained from 45 respondents. Which of the independent variables discriminate between the 2 types of users of Nike?
Dependent variable
Independent variables
Gender
1 = Female, 2 = Male
All these are measured on a 7-point scale, where 1 = very unfavorable and 7 = very favorable
Classroom Problem2
Data on Nike was obtained from 45 respondents. Which of the independent variables discriminate between the 3 types of users of Nike?
Dependent variable
Independent variables
Gender
1 = Female, 2 = Male
All these are measured on a 7-point scale, where 1 = very unfavorable and 7 = very favorable
Thank you