
MKT 355e

Multivariate Analysis
Assignment 2 Group Based Assignment
January 2015 Presentation
SUBMITTED BY:
TAN KWEE HUAT Z1210897
NEO SIOK TIN B1173511
LEE GI SIANG B1370676
TUTORIAL GROUP: T01

Table of Contents
Question 1a
Question 1b
Question 1c
Question 2
Question 3
Question 4
Question 5
Reference

Question 1a
(i) Recoded Payment

Mode of payment (D1) is recoded by combining Credit Card and Debit Card into a single category, Cards, with a coded value of 2. Check and Other are similarly combined into a single category, Others, with a coded value of 3. Cash keeps its coded value of 1.
Initial and Recoded Value for Mode of Payment (D1)

Payment Method   Coded Value   Recoded Value
Cash             1             1
Credit Card      2             2
Debit Card       3             2
Check            4             3
Other            5             3

Summary Table for Recoded Payment as generated by JMP

Payment Method   Coded Value   N-Rows
Cash             1             1277
Cards            2             140
Others           3             33

(ii) Recoded Income

Family annual household income (D6) is recoded by combining "Under $25,000" and "$25,000 but under $50,000" into one category with a coded value of 1. "$50,000 but under $75,000" and "$75,000 but under $100,000" are similarly combined, with a coded value of 2. Finally, "$100,000 but under $150,000" is recoded to 3, while "$150,000 but under $200,000" and "$200,000 or more" are recoded to 4.

Initial Value for Family Annual Household Income (D6)

Income Level                    Coded Value
Under $25,000                   1
$25,000 but under $50,000       2
$50,000 but under $75,000       3
$75,000 but under $100,000      4
$100,000 but under $150,000     5
$150,000 but under $200,000     6
$200,000 or more                7

Recoded Value for Family Annual Household Income (D6)

Income Level                    Coded Value
Below $50,000                   1
$50,000 but under $100,000      2
$100,000 but under $150,000     3
$150,000 or more                4

Summary Table for Recoded Income as generated by JMP

Income Level                    Coded Value   N-Rows
Below $50,000                   1             636
$50,000 but under $100,000      2             531
$100,000 but under $150,000     3             108
$150,000 or more                4             175
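
The two recodes described above can be sketched as simple lookup tables. This is an illustrative Python sketch, not the JMP procedure used in the assignment; the names PAYMENT_RECODE, INCOME_RECODE and recode are assumptions, and the sample responses are made up for demonstration.

```python
# Lookup tables taken from the recode tables above (original code -> recoded value).
PAYMENT_RECODE = {1: 1, 2: 2, 3: 2, 4: 3, 5: 3}
PAYMENT_LABELS = {1: "Cash", 2: "Cards", 3: "Others"}

INCOME_RECODE = {1: 1, 2: 1, 3: 2, 4: 2, 5: 3, 6: 4, 7: 4}
INCOME_LABELS = {
    1: "Below $50,000",
    2: "$50,000 but under $100,000",
    3: "$100,000 but under $150,000",
    4: "$150,000 or more",
}


def recode(values, mapping):
    """Return a recoded copy of `values`; codes not in `mapping` become None."""
    return [mapping.get(v) for v in values]


# Hypothetical responses for illustration only (not the survey data).
sample_d1 = [1, 2, 3, 4, 5, 1]
sample_d6 = [1, 2, 3, 4, 5, 6, 7]

recoded_d1 = recode(sample_d1, PAYMENT_RECODE)
recoded_d6 = recode(sample_d6, INCOME_RECODE)

print([PAYMENT_LABELS[c] for c in recoded_d1])
print([INCOME_LABELS[c] for c in recoded_d6])
```

Returning None for an unmapped code keeps missing or invalid responses visible instead of silently assigning them to a category.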

Question 1b
Distributions
Household Size

Normal (2.93103, 1.49651)

Summary Statistics
Mean              2.9310345
Std Dev           1.4965116
Std Err Mean      0.0393003
Upper 95% Mean    3.0081261
Lower 95% Mean    2.8539429
N                 1450

Fitted Normal
Parameter Estimates
Type         Parameter   Estimate    Lower 95%   Upper 95%
Location     μ           2.9310345   2.8539429   3.0081261
Dispersion   σ           1.4965116   1.4439586   1.5530635
-2log (Likelihood) = 5283.01841640364

Goodness-of-Fit Test
Shapiro-Wilk W Test
W          Prob<W
0.890802   <.0001*
Note: Ho = the data is from the Normal distribution. Small p-values reject Ho.
The above analysis presents the distribution of the new variable Household Size, obtained by combining the responses in D3A, together with a fitted normal distribution and a goodness-of-fit test run in JMP.
As shown in the summary statistics, the mean is 2.931 and the standard deviation is 1.497. The 95% confidence interval for the mean runs from 2.854 to 3.008. The sample size is 1450, which explains the software's automatic selection of the Shapiro-Wilk W test for normality, since the sample size is below 2000. The p-value is reported as <0.0001. At a significance level of 0.05, the null hypothesis that the data come from a normal distribution is rejected because the p-value is below 0.05. We therefore conclude that Household Size is not normally distributed.
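
The standard error and confidence interval quoted above can be reproduced from the mean, standard deviation and sample size. The sketch below is only a check of the arithmetic; it uses the normal quantile (about 1.96) as an assumption, whereas JMP presumably uses the t quantile, so the bounds agree only to about three decimal places.

```python
import math
from statistics import NormalDist

# Values copied from the JMP summary statistics above.
mean, std_dev, n = 2.9310345, 1.4965116, 1450

# Standard error of the mean: s / sqrt(n).
std_err = std_dev / math.sqrt(n)

# Two-sided 95% interval using the normal quantile (an approximation;
# with n = 1450 the t quantile is almost identical).
z = NormalDist().inv_cdf(0.975)
lower = mean - z * std_err
upper = mean + z * std_err

print(f"Std Err Mean = {std_err:.7f}")
print(f"95% CI = ({lower:.4f}, {upper:.4f})")
```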

Question 1c
Contingency Analysis of Frequency of Visits (Q3A_39) By Recoded Income
Mosaic Plot

Contingency Table
Recoded Income by Frequency of Visits (Q3A_39)
Each cell shows Count, Total %, Col % and Row %.
Frequency of visits: (A) Within the past 4 weeks; (B) More than 4 weeks to within the past 3 months; (C) More than 3 months ago; (D) Never

Recoded Income                 Statistic   (A)     (B)     (C)     (D)     Total
Below $50,000                  Count       264     150     140     6       560
                               Total %     20.75   11.79   11.01   0.47    44.03
                               Col %       41.77   47.62   44.87   46.15
                               Row %       47.14   26.79   25.00   1.07
$50,000 but under $100,000     Count       249     103     107     4       463
                               Total %     19.58   8.10    8.41    0.31    36.40
                               Col %       39.40   32.70   34.29   30.77
                               Row %       53.78   22.25   23.11   0.86
$100,000 but under $150,000    Count       48      25      17      2       92
                               Total %     3.77    1.97    1.34    0.16    7.23
                               Col %       7.59    7.94    5.45    15.38
                               Row %       52.17   27.17   18.48   2.17
$150,000 or more               Count       71      37      48      1       157
                               Total %     5.58    2.91    3.77    0.08    12.34
                               Col %       11.23   11.75   15.38   7.69
                               Row %       45.22   23.57   30.57   0.64
Total                          Count       632     315     312     13      1272
                               Total %     49.69   24.76   24.53   1.02

Tests
N      DF   -LogLike     RSquare (U)
1272   9    5.4280272    0.0039

Test               ChiSquare   Prob>ChiSq
Likelihood Ratio   10.856      0.2857
Pearson            11.142      0.2661

The above analysis presents a two-way cross-tabulation between the frequency of visits to Wendy's (Q3A_39) and recoded income. The JMP output shows that the most common visit frequency is "Within the past 4 weeks", reported by 49.69% of respondents. Within this group, those earning below $50,000 and those earning $50,000 but under $100,000 form the bulk of visitors, contributing 20.75% and 19.58% of the total sample respectively. Nonetheless, the bivariate association between the two variables is not significant at α = 0.05, since the Pearson chi-square p-value is 0.2661, which is greater than 0.05. As a result, the null hypothesis of no association cannot be rejected, and we conclude that there is no significant relationship between visit frequency and income.
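
As a cross-check on the JMP output, the Pearson chi-square statistic can be recomputed from the observed counts in the income contingency table. The function name pearson_chi_square is an assumption made for this sketch. Because the standard library has no chi-square distribution, the statistic is compared with the tabulated 0.05 critical value for 9 degrees of freedom (16.919) rather than converted to a p-value.

```python
def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Observed counts: rows are recoded income levels, columns are visit frequencies.
observed_counts = [
    [264, 150, 140, 6],   # Below $50,000
    [249, 103, 107, 4],   # $50,000 but under $100,000
    [48, 25, 17, 2],      # $100,000 but under $150,000
    [71, 37, 48, 1],      # $150,000 or more
]

chi_sq, df = pearson_chi_square(observed_counts)
CRITICAL_005_DF9 = 16.919  # chi-square critical value at alpha = 0.05, df = 9

print(f"Pearson chi-square = {chi_sq:.3f}, df = {df}")
print("Reject H0 at 0.05:", chi_sq > CRITICAL_005_DF9)
```

The statistic falls below the critical value, which agrees with the decision not to reject the null hypothesis.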
Contingency Analysis of Frequency of Visits (Q3A_39) By Employment Status (D7)
Mosaic Plot

Contingency Table
Employment Status (D7) By Frequency of Visits (Q3A_39)
Each cell shows Count, Total %, Col % and Row %.
Frequency of visits: (A) Within the past 4 weeks; (B) More than 4 weeks to within the past 3 months; (C) More than 3 months ago; (D) Never

Employment Status       Statistic   (A)     (B)     (C)     (D)     Total
Full-time               Count       401     182     157     5       745
                        Total %     31.53   14.31   12.34   0.39    58.57
                        Col %       63.45   57.78   50.32   38.46
                        Row %       53.83   24.43   21.07   0.67
Part-time               Count       52      39      29      3       123
                        Total %     4.09    3.07    2.28    0.24    9.67
                        Col %       8.23    12.38   9.29    23.08
                        Row %       42.28   31.71   23.58   2.44
Retired                 Count       4       3       2       0       9
                        Total %     0.31    0.24    0.16    0.00    0.71
                        Col %       0.63    0.95    0.64    0.00
                        Row %       44.44   33.33   22.22   0.00
Student                 Count       97      53      68      2       220
                        Total %     7.63    4.17    5.35    0.16    17.30
                        Col %       15.35   16.83   21.79   15.38
                        Row %       44.09   24.09   30.91   0.91
Homemaker               Count       54      28      34      1       117
                        Total %     4.25    2.20    2.67    0.08    9.20
                        Col %       8.54    8.89    10.90   7.69
                        Row %       46.15   23.93   29.06   0.85
Unemployed              Count       21      7       15      2       45
                        Total %     1.65    0.55    1.18    0.16    3.54
                        Col %       3.32    2.22    4.81    15.38
                        Row %       46.67   15.56   33.33   4.44
Prefer not to answer    Count       3       3       7       0       13
                        Total %     0.24    0.24    0.55    0.00    1.02
                        Col %       0.47    0.95    2.24    0.00
                        Row %       23.08   23.08   53.85   0.00
Total                   Count       632     315     312     13      1272
                        Total %     49.69   24.76   24.53   1.02

Tests
N      DF   -LogLike     RSquare (U)
1272   18   15.683112    0.0114

Test               ChiSquare   Prob>ChiSq
Likelihood Ratio   31.366      0.0261*
Pearson            34.971      0.0095*
Warning: 20% of cells have expected count less than 5, ChiSquare suspect.
The above analysis resembles the previous one in that most respondents had visited Wendy's within the past 4 weeks. It also shows that respondents employed full-time make up the largest share of the sample (58.57%), followed by students, who form the second largest group (17.30%). The Pearson chi-square p-value is 0.0095, which is lower than 0.05, so the relationship between the two variables is significant at α = 0.05. We therefore conclude that there is a significant relationship between visit frequency and employment status, although JMP's warning about cells with expected counts below 5 means this result should be read with some caution.
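
The JMP warning about small expected counts can be examined from the row and column totals alone, since the expected count of a cell is (row total × column total) / N. The sketch below is illustrative; the dictionary names are assumptions. By this calculation 11 of the 28 cells have an expected count below 5, concentrated in the "Never" column and the Retired and "Prefer not to answer" rows. Reading the JMP message as a flag that the share exceeds a 20% threshold, rather than as the exact share, is an assumption.

```python
# Marginal totals copied from the employment status contingency table above.
row_totals = {
    "Full-time": 745,
    "Part-time": 123,
    "Retired": 9,
    "Student": 220,
    "Homemaker": 117,
    "Unemployed": 45,
    "Prefer not to answer": 13,
}
col_totals = {
    "Within the past 4 weeks": 632,
    "More than 4 weeks to within the past 3 months": 315,
    "More than 3 months ago": 312,
    "Never": 13,
}
n = sum(row_totals.values())

# Expected count under independence for every cell.
expected = {
    (row, col): r_total * c_total / n
    for row, r_total in row_totals.items()
    for col, c_total in col_totals.items()
}

small_cells = sorted(cell for cell, value in expected.items() if value < 5)
share_small = len(small_cells) / len(expected)

print(f"{len(small_cells)} of {len(expected)} cells have expected count < 5 "
      f"({share_small:.0%})")
for row, col in small_cells:
    print(f"  {row} x {col}: {expected[(row, col)]:.2f}")
```

Sparse categories such as Retired and "Prefer not to answer" could be merged with others before re-running the test, though that choice is outside the analysis reported here.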
To sum up, the findings suggest that the dependent variable, frequency of visits to Wendy's (Q3A_39), is related to employment status (D7) but not to recoded income. Respondents employed full-time form the largest part of the sample (58.57%), followed by students, who account for 17.30%. Viewed in this light, Wendy's may wish to channel the bulk of its marketing efforts towards those in full-time employment. This could take the form of promotional campaigns such as differential pricing based on time of purchase, bundle discounts or loyalty programmes that create value for this group of consumers. Similar marketing attention should be extended to part-timers because, while they constitute only 9.67% of the sample, their requirements are likely to match closely those of full-time employees. To induce greater spending from students, who form the second largest group, Wendy's may wish to employ price segmentation strategies such as student pricing or promotional buddy meals to capture a greater share of the student market. Finally, the restaurant should review its menu with a view to serving healthier food, and tailor its marketing and communication strategy to appeal to the other segments.

Question 2
The objective is to examine the relationship between frequency of visits to Wendy's and level of education (D5) via cross-tabulation, and then to introduce recoded income into the model as a control variable. First, the relationship between frequency of visits (Q8_39) and level of education (D5) is tested. The results are shown below.
Contingency Analysis of Frequency of Visits (Q8_39) By Level of Education (D5)
Mosaic Plot

Contingency Table
Level of Education (D5) By Frequency of Visits (Q8_39)
Each cell shows Count, Total %, Col % and Row %.

Level of Education          Statistic   More Often   About the same   Less Often   Total
Some high school or less    Count       7            9                0            16
                            Total %     0.74         0.95             0.00         1.69
                            Col %       2.51         1.71             0.00
                            Row %       43.75        56.25            0.00
Completed high school       Count       39           75               27           141
                            Total %     4.12         7.92             2.85         14.89
                            Col %       13.98        14.26            19.01
                            Row %       27.66        53.19            19.15
Some college                Count       111          206              55           372
                            Total %     11.72        21.75            5.81         39.28
                            Col %       39.78        39.16            38.73
                            Row %       29.84        55.38            14.78
Completed college           Count       85           181              50           316
                            Total %     8.98         19.11            5.28         33.37
                            Col %       30.47        34.41            35.21
                            Row %       26.90        57.28            15.82
Post graduate               Count       37           55               9            101
                            Total %     3.91         5.81             0.95         10.67
                            Col %       13.26        10.46            6.34
                            Row %       36.63        54.46            8.91
Prefer not to answer        Count       0            0                1            1
                            Total %     0.00         0.00             0.11         0.11
                            Col %       0.00         0.00             0.70
                            Row %       0.00         0.00             100.00
Total                       Count       279          526              142          947
                            Total %     29.46        55.54            14.99

Tests
N     DF   -LogLike     RSquare (U)
947   10   8.4984320    0.0092

Test               ChiSquare   Prob>ChiSq
Likelihood Ratio   16.997      0.0744
Pearson            16.426      0.0881
Warning: 20% of cells have expected count less than 5, ChiSquare suspect.

The contingency table above shows that the most common response is "About the same", chosen by 55.54% of respondents. Within this group, the largest contributions come from those with Some College education (21.75% of the sample) and those who Completed College (19.11%). However, the test results show that the association between frequency of visits and level of education is not statistically significant at α = 0.05, as the Pearson chi-square p-value is 0.0881, which is greater than 0.05. At this juncture, we conclude that there is no significant relationship between frequency of visits and level of education. JMP also warns that some cells have expected counts of less than 5, which suggests that the data are too sparse in places for the chi-square test to be fully reliable. In light of this, a third variable, recoded income, is introduced to clarify the initial association. The results of the Cochran-Mantel-Haenszel test are shown below.
Cochran-Mantel-Haenszel Tests
Stratified by Recoded Income

CMH Test                        ChiSquare   DF   Prob>Chisq
Correlation of Scores           0.1597      1    0.6894
Row Score by Col Categories     11.3743     5    0.0444*
Col Score by Row Categories     0.4417      2    0.8019
General Assoc. of Categories    14.7884     10   0.1400

The Cochran-Mantel-Haenszel test shows that the p-values for both the correlation of scores (0.6894) and the general association of categories (0.1400) are well above 0.05, again indicating a lack of association between frequency of visits and level of education once recoded income is controlled for. The one exception is the row score by column categories test (p = 0.0444), which is marginally significant at the 0.05 level. On balance, the stratified results are in line with the previous test.
In summary, introducing recoded income as a third variable largely reinforces the initial finding of no association between frequency of visits and level of education. The lack of association suggests that Wendy's may wish to consider alternative independent variables, such as household age groups or employment status, which may relate more closely to frequency of visits and so provide greater insight.

Question 3
A two-way ANOVA was conducted to find out whether level of education (D5) and employment status (D7) have any effect on Wendy's restaurant rating (Q9_39). Level of education (D5) was set to ordinal since its categories are ordered, while employment status (D7) remains nominal as its values serve only as labels for identification and classification (Malhotra, 2015).
Actual by Predicted Plot

Summary of Fit
RSquare                       0.041532
RSquare Adj                   0.010141
Root Mean Square Error        1.703514
Mean of Response              7.559662
Observations (or Sum Wgts)    947

Analysis of Variance
Source     DF    Sum of Squares   Mean Square   F Ratio   Prob > F
Model      30    115.1846         3.83949       1.3231    0.1160
Error      916   2658.1945        2.90196
C. Total   946   2773.3791

Effect Tests
Source                                           Nparm   DF   Sum of Squares   F Ratio   Prob > F
level of education (D5)                          5       2    5.321880         0.9169    0.4001    LostDFs
employment status (D7)                           6       3    19.104167        2.1944    0.0872    LostDFs
level of education (D5)*Employment status (D7)   30      19   88.357881        1.6025    0.0490*   LostDFs

Level of Education (D5)

Leverage Plot

Least Squares Means Table
Level   Least Sq Mean   Std Error    Mean
1       NonEstimable    .            7.31250
2       7.2634326       0.37247244   7.58156
3       7.7039966       0.21089534   7.63710
4       NonEstimable    .            7.49684
5       7.2584801       0.33811727   7.50495
6       NonEstimable    .            5.00000

Employment Status (D7)

Leverage Plot

Least Squares Means Table
Level   Least Sq Mean   Std Error    Mean
1       6.9349846       0.30585168   7.51973
2       NonEstimable    .            7.51648
3       NonEstimable    .            8.00000
4       NonEstimable    .            7.60000
5       NonEstimable    .            7.91463
6       NonEstimable    .            7.17857
7       NonEstimable    .            7.50000

Level of Education (D5)*Employment Status (D7)

Leverage Plot

Least Squares Means Table
Level                         Least Sq Mean   Std Error
Some high school or less,1    6.2500000       0.6022831
Some high school or less,2    9.0000000       1.7035137
Some high school or less,3    NonEstimable    .
Some high school or less,4    8.1666667       0.6954566
Some high school or less,5    NonEstimable    .
Some high school or less,6    9.0000000       1.7035137
Some high school or less,7    NonEstimable    .
Completed high school,1       7.6470588       0.1847720
Completed high school,2       7.8181818       0.5136287
Completed high school,3       8.0000000       1.7035137
Completed high school,4       6.8333333       0.4917621
Completed high school,5       7.5454545       0.3631904
Completed high school,6       8.0000000       0.5678379
Completed high school,7       5.0000000       1.7035137
Some College,1                7.5209581       0.1318219
Some College,2                7.5813953       0.2597834
Some College,3                8.2500000       0.8517569
Some College,4                7.7207207       0.1616904
Some College,5                8.0882353       0.2921502
Some College,6                7.1000000       0.5386983
Some College,7                7.6666667       0.9835241
Completed College,1           7.4658635       0.1079558
Completed College,2           7.4642857       0.3219338
Completed College,3           NonEstimable    .
Completed College,4           7.6000000       0.4398454
Completed College,5           8.4444444       0.4015220
Completed College,6           5.8333333       0.6954566
Completed College,7           NonEstimable    .
Post graduate,1               7.7260274       0.1993812
Post graduate,2               6.7500000       0.6022831
Post graduate,3               7.5000000       1.2045661
Post graduate,4               6.3333333       0.6954566
Post graduate,5               7.0000000       0.6022831
Post graduate,6               7.0000000       1.2045661
Post graduate,7               8.5000000       1.2045661
6,1                           5.0000000       1.7035137
6,2                           NonEstimable    .
6,3                           NonEstimable    .
6,4                           NonEstimable    .
6,5                           NonEstimable    .
6,6                           NonEstimable    .
6,7                           NonEstimable    .

Identify the Dependent and Independent Variables


The dependent variable is Wendy's restaurant rating (Q9_39) and the independent variables are level of education (D5) and employment status (D7).
Decomposition of Total Variation
R² is 0.0415, which means that the model using level of education (D5) and employment status (D7) explains 4.15% of the variation in Wendy's restaurant rating. The overall mean of the response is 7.56. The corrected total sum of squares is 2773.38, of which the sums of squares for level of education and employment status are 5.322 and 19.104 respectively. The combined main effect totals 24.426, the sum of the sums of squares for level of education (5.322) and employment status (19.104), with a combined 5 degrees of freedom (2 + 3).

Measurement of Results
The three effects together account for a sum of squares of 112.784, computed by adding the sums of squares for level of education (5.322), employment status (19.104) and the interaction effect (88.358). This differs slightly from the model sum of squares of 115.185 in the analysis of variance table.

Significance of Model
The model F statistic is 1.323 with an associated p-value of 0.1160, which is not significant at the 0.05 level. We therefore cannot reject the null hypothesis, and conclude that the overall model is not statistically significant: taken together, level of education (D5) and employment status (D7) do not have a significant impact on Wendy's restaurant rating.
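
The quantities discussed above follow directly from the analysis of variance table. The short sketch below recomputes the mean squares, the F ratio and R² from the reported sums of squares and degrees of freedom; the variable names are illustrative assumptions, and the p-value is not recomputed because it requires an F distribution that the standard library does not provide.

```python
# Values copied from the Analysis of Variance table above.
ss_model, df_model = 115.1846, 30
ss_error, df_error = 2658.1945, 916
ss_total = 2773.3791

# Mean square = sum of squares / degrees of freedom.
ms_model = ss_model / df_model
ms_error = ss_error / df_error

# F ratio compares explained to unexplained variation per degree of freedom.
f_ratio = ms_model / ms_error

# R-square is the share of total variation explained by the model.
r_square = ss_model / ss_total

print(f"MS model = {ms_model:.5f}, MS error = {ms_error:.5f}")
print(f"F ratio  = {f_ratio:.4f}")
print(f"R-square = {r_square:.6f}")
```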

For the interaction, the F statistic is 1.6025 with a p-value of 0.0490, which is significant at the 0.05 level. This means that the two factors interact significantly with each other.
Individually, the F statistics for the main effects of level of education (D5) and employment status (D7) are 0.9169 and 2.1944, with p-values of 0.4001 and 0.0872 respectively. Neither is significant at the 0.05 level, so we conclude that, taken separately, the two variables do not have a significant influence on the restaurant rating.
Finally, it is worth noting that the last column of the Effect Tests table flags LostDFs for all three effects, which indicates that DF is less than Nparm. This means that not all the parameters associated with each effect are testable (JMP, 2015).
Results Interpretation
The above results show that, separately, neither level of education (D5) nor employment status (D7) has a direct impact on Wendy's restaurant rating. Nonetheless, the two variables do exert some influence on the rating in combination, since their interaction is significant.
To conclude, level of education and employment status yield little insight into Wendy's restaurant rating. Other variables such as Cleanliness, Service Quality, Nutritional Value and Value for Money may have been better choices, as they relate more directly to the dining experience and can better inform the strategies and actions needed at different levels to improve a restaurant's rating. The restaurant may therefore wish to revisit its choice of variables and examine the effect of these alternatives on its rating.

Question 4
A multiple regression analysis was conducted to examine the effect of the ratings on the psychographic statements (q14_1, q14_2, q14_3, q14_4, q14_5, q14_6 and q14_7) and of household size on the frequency of eating fast-food (S3A). The results are as follows.
Response S3A
Summary of Fit
RSquare                       0.01085
RSquare Adj                   0.005358
Root Mean Square Error        6.559442
Mean of Response              7.728276
Observations (or Sum Wgts)    1450

Analysis of Variance
Source     DF     Sum of Squares   Mean Square   F Ratio   Prob > F
Model      8      680.081          85.0101       1.9758    0.0461*
Error      1441   62000.860        43.0263
C. Total   1449   62680.941

Lack Of Fit
Source        DF     Sum of Squares   Mean Square   F Ratio   Prob > F
Lack Of Fit   1104   48881.261        44.2765       1.1373    0.0765
Pure Error    337    13119.599        38.9306
Total Error   1441   62000.860
Max RSq

Parameter Estimates
Term             Estimate    Std Error   t Ratio   Prob>|t|   VIF
Intercept        7.7367977   0.484205    15.98     <.0001*    .
q14_1            -0.007885   0.066239    -0.12     0.9053     8.5483785
q14_2            -0.01696    0.048754    -0.35     0.7280     5.6492182
q14_3            -0.003541   0.031466    -0.11     0.9104     2.9799786
q14_4            0.0394514   0.040536    0.97      0.3306     3.9212688
q14_5            -0.009271   0.003984    -2.33     0.0201*    1.1686583
q14_6            0.0113858   0.024474    0.47      0.6418     2.2913446
q14_7            -0.049935   0.02911     -1.72     0.0865     2.2000752
Household Size   0.1612185   0.124194    1.30      0.1945     1.163316

Formulation of Regression Model


From the parameter estimates, the estimated multiple regression equation is given by:
$\hat{Y} = 7.7367977 - 0.007885X_1 - 0.01696X_2 - 0.003541X_3 + 0.0394514X_4 - 0.009271X_5 + 0.0113858X_6 - 0.049935X_7 + 0.1612185X_8$
Where,
Dependent variable, Ŷ: frequency of eating fast-food (S3A); and
Predictors:
Constant: 7.7367977
X1: q14_1 (I try to stay current on the latest health and nutrition information)
X2: q14_2 (I read nutritional labels on most products I buy)
X3: q14_3 (I am making more of an effort to find out about the nutritional content of the foods I
eat at fast-food restaurants)
X4: q14_4 (I consider the amount of fat in the foods I eat at fast-food restaurants)
X5: q14_5 (I consider the amount of fat in the foods my kids eat at fast-food restaurants)
X6: q14_6 (I have been making an effort to look for fast-food choices that have better nutritional
value than the foods I have chosen in the past)
X7: q14_7 (I am eating at fast-food restaurants less often out of concern for the high fat content
in the foods at fast-food restaurants)
X8: household size
The above regression equation indicates that Ŷ, the frequency of eating fast-food (S3A), is predicted to be 7.737 when all the independent variables (Xk) equal 0.
The coefficients for q14_4, q14_6 and household size are positive, so an increase in any of the three results in a corresponding increase in Ŷ. For example, every 1 unit increase in q14_4 increases the predicted frequency of eating fast-food by 0.0394514, holding all other independent variables constant. Similarly, every 1 unit increase in q14_6 or household size leads to an increase of 0.0113858 or 0.1612185 respectively, holding all other predictors constant.

The reverse is true for q14_1, q14_2, q14_3, q14_5 and q14_7, whose coefficients are negative. A 1 unit increase in each results in a decrease in Ŷ of 0.007885, 0.01696, 0.003541, 0.009271 and 0.049935 respectively, holding all other independent variables constant.
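
The interpretation of the intercept and slopes can be illustrated by evaluating the fitted equation directly. The sketch below uses the coefficients from the parameter estimates table; the function name predict_s3a and the two example respondents are hypothetical and serve only to show that raising one predictor by 1 unit, with the others held constant, changes the prediction by exactly that predictor's coefficient.

```python
# Coefficients copied from the Parameter Estimates table above.
INTERCEPT = 7.7367977
COEFFICIENTS = {
    "q14_1": -0.007885,
    "q14_2": -0.01696,
    "q14_3": -0.003541,
    "q14_4": 0.0394514,
    "q14_5": -0.009271,
    "q14_6": 0.0113858,
    "q14_7": -0.049935,
    "household_size": 0.1612185,
}


def predict_s3a(respondent):
    """Predicted frequency of eating fast-food (S3A) for one respondent."""
    return INTERCEPT + sum(
        coef * respondent[name] for name, coef in COEFFICIENTS.items()
    )


# Hypothetical respondents for illustration (not survey records).
baseline = {name: 0 for name in COEFFICIENTS}
one_more_q14_4 = dict(baseline, q14_4=1)

print(predict_s3a(baseline))  # equals the intercept
print(predict_s3a(one_more_q14_4) - predict_s3a(baseline))  # equals the q14_4 slope
```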
Model Fit
According to the summary of fit above, the model is not a good predictor, as shown by R² = 0.01085 and adjusted R² = 0.005358. R² and adjusted R² indicate the proportion of variability in frequency of eating fast-food (S3A) that is accounted for by the independent variables. Based on the adjusted R², which is the more accurate measure as it adjusts for sample size and the number of independent variables, the independent variables account for only 0.5358% of the variance in the frequency of eating fast-food. The model therefore offers little explanation of the relationship between Ŷ and Xk.
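
The adjustment for sample size and number of predictors follows the standard formula, adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the number of observations and k the number of predictors. The sketch below applies it to the reported R²; the function name is an assumption, and small differences from the JMP figure reflect the rounding of R² in the output.

```python
def adjusted_r_square(r_square, n_obs, n_predictors):
    """Adjusted R-square: penalises R-square for the number of predictors used."""
    return 1 - (1 - r_square) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Full model: R-square = 0.01085, n = 1450 observations, k = 8 predictors.
adj_full = adjusted_r_square(0.01085, 1450, 8)

print(f"Adjusted R-square = {adj_full:.6f}")
print(f"Share of variance explained = {adj_full:.4%}")
```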
Model Test
The F statistic tests the overall significance of the regression model. From the analysis of variance above, its value is 1.9758 with an associated p-value of 0.0461. This is significant at the 0.05 level since 0.0461 < 0.05. Accordingly, we can reject the null hypothesis that all slope coefficients are zero and infer that at least one of the independent variables is a significant predictor of the frequency of eating fast-food (S3A).
Testing of Individual Regression Coefficients
Based on the parameter estimates above, only q14_5, with a p-value of 0.0201, is significant at the α = 0.05 level. In other words, only q14_5 should be retained, as it alone is useful in explaining the variation in the frequency of eating fast-food (S3A).
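
Each t ratio in the parameter estimates is the estimate divided by its standard error. The sketch below recomputes the ratio for q14_5 and approximates the two-sided p-value with the standard normal distribution. That approximation is an assumption: JMP uses the t distribution, but with 1441 error degrees of freedom the two are nearly identical. The function names are illustrative.

```python
from statistics import NormalDist


def t_ratio(estimate, std_error):
    """Test statistic for H0: coefficient = 0."""
    return estimate / std_error


def two_sided_p_normal(t_value):
    """Two-sided p-value using the normal approximation to the t distribution."""
    return 2 * (1 - NormalDist().cdf(abs(t_value)))


# Estimate and standard error for q14_5 from the Parameter Estimates table.
t_q14_5 = t_ratio(-0.009271, 0.003984)
p_q14_5 = two_sided_p_normal(t_q14_5)

print(f"t ratio = {t_q14_5:.2f}, approximate p-value = {p_q14_5:.4f}")
print("Significant at 0.05:", p_q14_5 < 0.05)
```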

Residual by Predicted Plot

Residual Analysis
The residual plot above does not show any trend or pattern that would suggest a relationship between the predicted values and the residuals. This indicates that the variance of the residuals is roughly constant, so the regression results can be relied upon.
Multicollinearity Check
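Based on the VIF statistics in the parameter estimates above, none of the variables has a VIF above 10, a commonly used cut-off. Hence, there is no severe multicollinearity among the independent variables, although q14_1 (8.55) and q14_2 (5.65) have the highest values and are moderately correlated with the other predictors.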
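
A VIF can be translated back into the R² obtained when that predictor is regressed on all the others, since VIF = 1 / (1 − R²). The sketch below applies this to the reported values; the function name and the cut-off constant are assumptions for illustration.

```python
# VIF values copied from the Parameter Estimates table above.
VIF = {
    "q14_1": 8.5483785,
    "q14_2": 5.6492182,
    "q14_3": 2.9799786,
    "q14_4": 3.9212688,
    "q14_5": 1.1686583,
    "q14_6": 2.2913446,
    "q14_7": 2.2000752,
    "Household Size": 1.163316,
}
VIF_CUTOFF = 10  # rule-of-thumb threshold used in the text


def implied_r_square(vif):
    """R-square of a predictor regressed on the other predictors, from its VIF."""
    return 1 - 1 / vif


above_cutoff = [name for name, value in VIF.items() if value > VIF_CUTOFF]
highest = max(VIF, key=VIF.get)

print("Predictors above cut-off:", above_cutoff)
for name, value in VIF.items():
    print(f"{name}: VIF = {value:.2f}, implied R-square = {implied_r_square(value):.3f}")
```

On these figures about 88% of the variation in q14_1 is explained by the other predictors, which is high but still under the cut-off of 10.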
Implications
Based on the above preliminary findings, it appears that the frequency of eating fast-food depends only on how much respondents consider the amount of fat in the foods their kids eat at fast-food restaurants.
Improved Regression Model Using Stepwise Method
From the above analysis, it appears that only q14_5 is useful in explaining the variation in the frequency of eating fast-food. In the following analysis, the stepwise method is used to confirm this finding, since it is designed to select the best subset of predictors that accounts for most of the variation in the dependent variable (Malhotra, 2015).

Response S3A
Regression Plot

Summary of Fit
RSquare                       0.006675
RSquare Adj                   0.005989
Root Mean Square Error        6.557361
Mean of Response              7.728276
Observations (or Sum Wgts)    1450

Analysis of Variance
Source     DF     Sum of Squares   Mean Square   F Ratio   Prob > F
Model      1      418.406          418.406       9.7306    0.0018*
Error      1448   62262.534        42.999
C. Total   1449   62680.941

Lack of Fit
Source        DF     Sum of Squares   Mean Square   F Ratio   Prob > F
Lack Of Fit   4      92.960           23.2400       0.5398    0.7065
Pure Error    1444   62169.575        43.0537
Total Error   1448   62262.534
Max RSq

Parameter Estimates
Term        Estimate    Std Error   t Ratio   Prob>|t|   VIF
Intercept   8.1939532   0.227904    35.95     <.0001*    .
q14_5       -0.011493   0.003684    -3.12     0.0018*    1

Formulation of Regression Model


From the parameter estimates, the revised regression equation is given by:
$\hat{Y} = 8.1939532 - 0.011493X_1$
Where,
Dependent variable, Ŷ: frequency of eating fast-food (S3A); and
Predictors:
Constant: 8.1939532
X1: q14_5 (I consider the amount of fat in the foods my kids eat at fast-food restaurants)

The improved model, based on the stepwise method, retains only q14_5 to explain the frequency of eating fast-food (S3A).
The above regression equation indicates that Ŷ, the frequency of eating fast-food (S3A), is predicted to be 8.194 when X1 is 0.
Further, as the coefficient for q14_5 is negative, any increase in the value of X1 results in a corresponding decrease in Ŷ. Specifically, every 1 unit increase in q14_5 decreases the predicted frequency of eating fast-food by 0.011493.
Model Fit
According to the summary of fit above, the model is not a good predictor, as shown by R² = 0.006675 and adjusted R² = 0.005989. While the adjusted R² is slightly better than in the preceding analysis, it shows that the independent variable accounts for only 0.5989% of the variance in the frequency of eating fast-food. Given this, the model offers little explanation of the relationship between Ŷ and X1.
Model Test
The F statistic tests the overall significance of the regression model. From the analysis of variance above, its value is 9.7306 with an associated p-value of 0.0018. This is significant at the 0.05 level since 0.0018 < 0.05. Accordingly, we can reject the null hypothesis of no association and infer that q14_5 is a significant predictor of the frequency of eating fast-food (S3A).
Testing of Individual Regression Coefficients
Based on the parameter estimates above, q14_5, with a p-value of 0.0018, is significant at the α = 0.05 level. In other words, q14_5 is useful in explaining the variation in the frequency of eating fast-food (S3A).
Residual Analysis
Not applicable
Multicollinearity Check
Not applicable
Conclusion
In summary, out of the 8 independent variables identified, only "I consider the amount of fat in the foods my kids eat at fast-food restaurants" (q14_5) has a significant effect on the dependent variable, frequency of eating fast-food (S3A). In this regard, Wendy's will have to offer healthier food options, paying particular attention to the fat content of the food served to kids dining at its restaurants, if it wishes to encourage these customers to eat there more often.

Question 5
This report presents our findings and highlights possible managerial implications based on data from the survey Wendy's conducted to understand its customers.
Of the 1450 respondents surveyed, 1277 favour paying for their food in cash, 140 use credit or debit cards, and the remaining 33 prefer other modes of payment.
In terms of annual household income, 636 households earn below $50,000, 531 earn $50,000 but under $100,000, 108 earn $100,000 but under $150,000, and the remaining 175 earn $150,000 or more.
Household size does not follow a normal distribution. Nonetheless, the sample mean of 2.93 is estimated precisely, with a 95% confidence interval of 2.854 to 3.008 and a standard error of 0.039.
The two-way cross-tabulation between frequency of visits to Wendy's and recoded income did not suggest any meaningful association. In relation to employment status, however, those employed full-time form the bulk of patrons (58.57%), followed by students, who form the second largest group (17.30%). As such, management should channel its marketing efforts towards these two segments.
The two-way cross-tabulation between frequency of visits to Wendy's and the respondent's level of education, with income introduced as a control, yielded weak associations. Consequently, Wendy's should consider alternative independent variables that relate more closely to the dependent variable and so provide greater insight.
Results from the two-way ANOVA reveal that neither level of education nor employment status has a significant main effect on Wendy's restaurant rating, although their interaction is significant at the 0.05 level. Management should therefore consider other variables which may be more appropriate.
The improved model using the stepwise approach suggests that management should concentrate on reducing the fat content of the food served to kids dining at its restaurants in order to encourage more frequent visits.
Overall, the above analysis offers some valuable insights that management can act upon. Nonetheless, Wendy's should regularly collect fresh data to better understand its customers' behaviour and gain a competitive edge.

Reference
1. JMP. (2015). The Factor Models. Retrieved March 31, 2015, from SAS Institute Inc.:
http://www.jmp.com/support/help/The_Factor_Models.shtml
2. Malhotra, N. K. (2015). Marketing Research-An Applied Orientation, 6th Edition. Singapore:
Pearson Education South Asia Pte Ltd.
