
Regression Problem

MRP Biscuit Company started its operations in Ambala city, Haryana, in 2001. The company
was growing at an annual rate of 20 percent, which was above the industry average. However,
for the last three years, growth has been only 5 to 6 percent. This has been a major
concern for the top management of the company. Mr. P K Malhotra, the Senior
Vice President, Marketing, held a meeting of the senior marketing team to discuss why
the company, which had been doing so well, has slowed down in the last few years.
During the discussion it was suggested by one of the senior managers to identify the factors
which influence the preference for biscuits. It was argued that once these are known, it will help
the company to concentrate on those factors accordingly. Therefore, the company decided to get
a study done from a research agency to identify the various factors that influence the preference
for biscuits. A sample of 40 individuals was chosen randomly from Ambala. Data were
collected on preservation quality, taste, nutrition value and preference on a 7-point
Likert scale, with a higher number indicating a more positive rating.
It was also felt that celebrity endorsement can influence customers' preference for
biscuits. The respondents were therefore also asked to choose a celebrity whom they would like
to see endorsing biscuits. The three celebrities were Hritik Roshan, Govinda and Saif Ali Khan.

Help MRP Biscuit Co. by analyzing the data and interpreting the output for them.

Developed By Saurabh Bhattacharya for QM-II Sec-A

MRP Biscuit Co. Regression Analysis (Without Dummy)


Objective of the Study: To test whether factors like preservation quality, taste and nutrition
influence the preference of customers or individuals for biscuits.
H1: Preservation quality positively affects preference for biscuits.
H2: Taste positively affects preference for biscuits.
H3: Nutrition positively affects preference for biscuits.
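Population Regression Function:
$$\text{Preference} = \beta_0 + \beta_1\,\text{Preservation Quality} + \beta_2\,\text{Taste} + \beta_3\,\text{Nutrition} + \varepsilon$$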

Output Interpretation (after SAS E.G. has been used to run Multiple Linear Regression)
Table 1: ANOVA Table

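The F value in the above ANOVA table tests:

H0: $\beta_1 = \beta_2 = \beta_3 = 0$ (all slope coefficients of the independent variables are zero)

Ha: at least one of the $\beta$s is different from zero.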


The significance of the F statistic (at the 5% level, i.e. .05) implies that the null hypothesis
is rejected in favour of the alternate hypothesis: the independent variables jointly explain
preference, and we can proceed with the regression analysis.
In Table 1, the Model sum of squares measures how much of the variance in the dependent
variable is explained by the independent variables. The Error sum of squares measures the
variance in the dependent variable not accounted for by the regression model.
Calculations related to Table 1:
Sample Size = 40
Total Sum of Square d.f. = (n-1) = (40-1) = 39, where n= sample size
Model Sum of Square d.f. = k = 3, where k = number of independent variables.
Error Sum of Square d.f. = n-k-1 = 40-3-1 = 36
Mean Square Model= (Model Sum of Square / model d.f.) = (108.3747/3) = 36.1249
Mean Square Error = (Error Sum of Square / error d.f.) = (17.6003/ 36) = 0.4889
F value = (Mean Square Model / Mean Square Error) = (36.1249 / 0.4889) = 73.89, with (3, 36) d.f.
R square = (Model Sum of Square / Total Sum of Square)= (108.3747/ 125.975) = 0.8603
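
The arithmetic above can be checked with a short script. This is only a sketch in Python (the handout itself uses SAS E.G.); it takes the two sums of squares, the sample size and the number of independent variables as reported for Table 1 and recomputes everything else.

```python
# Sums of squares and sample details as reported for Table 1.
model_ss = 108.3747
error_ss = 17.6003
n = 40  # sample size
k = 3   # number of independent variables

# Degrees of freedom.
df_model = k
df_error = n - k - 1
df_total = n - 1

# Total sum of squares is the sum of the model and error parts.
total_ss = model_ss + error_ss

# Mean squares, F value and R square.
ms_model = model_ss / df_model
ms_error = error_ss / df_error
f_value = ms_model / ms_error
r_square = model_ss / total_ss

print(f"d.f. (model, error, total) = ({df_model}, {df_error}, {df_total})")
print(f"Mean Square Model = {ms_model:.4f}")
print(f"Mean Square Error = {ms_error:.4f}")
print(f"F value = {f_value:.2f}")
print(f"R square = {r_square:.4f}")
```

Small differences in the last decimal place can arise because the handout divides already rounded mean squares.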
Table 2: Coefficients Table

In the Parameter Estimates table, the parameter estimates (which are the unstandardized betas),
the standardized beta estimates and their significance values are given. In this table the first
objective is to test for multicollinearity. The tolerance and VIF values for all the estimates are
above .10 and below 5 respectively. Multicollinearity is said to be present when the VIF value is
above 10 for an estimate. A VIF value between 5 and 10 indicates moderate collinearity.
Next the beta values are checked for significance. It can be observed that, apart from the beta for
the variable taste, the betas for nutrition and preservation quality are significant (at the 5% level
of significance). Also, the beta coefficients of nutrition and preservation quality are in the positive
direction. Thus hypotheses H1 and H3 are supported, while H2 is not supported. The intercept term
is also significant (at the 5% level of significance). [As a rule of thumb, an absolute t value above
about 2 indicates that a variable is significant at the 5% level; with 36 error d.f. the two-tailed
critical value is about 2.03.]
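
The collinearity screening rule quoted above can be written down as a small helper. This is a sketch only: the cut-offs are the ones stated in the text, tolerance is taken as the reciprocal of VIF, and the predictor names and VIF values in the example are hypothetical and are not taken from Table 2.

```python
def tolerance(vif):
    """Tolerance is the reciprocal of the variance inflation factor."""
    return 1.0 / vif


def collinearity_level(vif):
    """Classify a VIF value using the cut-offs quoted in the text."""
    if vif > 10:
        return "multicollinearity present"
    if vif >= 5:
        return "moderate collinearity"
    return "no serious collinearity"


# Hypothetical VIF values for illustration only (NOT from Table 2).
example_vifs = {"predictor_a": 1.8, "predictor_b": 6.5, "predictor_c": 12.0}

for name, vif in example_vifs.items():
    print(f"{name}: VIF = {vif}, tolerance = {tolerance(vif):.3f}, "
          f"{collinearity_level(vif)}")
```

Because tolerance is 1/VIF, a VIF above 10 is the same condition as a tolerance below .10, which is why the two cut-offs are quoted together.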
Based on the above findings the estimated regression can be written (using standardized beta) as:
$$\widehat{\text{Preference}} = .52234\,\text{Preservation Quality} + .28394\,\text{Nutrition}$$

From the above estimated regression equation it can be said that, on average, a one standard
deviation increase in preservation quality increases preference for biscuits by .52234 standard
deviations, and a one standard deviation increase in nutrition increases preference by .28394
standard deviations, holding the other variable constant. Also, the absolute value of the
preservation quality estimate is larger than that of nutrition, and hence preservation quality is
the major factor influencing preference. Thus MRP Biscuit Company should focus on the
preservation quality factor.
The standard error of a beta coefficient indicates how much the point estimate is likely to vary
from the corresponding population parameter. It measures the amount of sampling error. The
lower the standard error relative to the beta value, the larger the t statistic and the higher the
chances of the beta being found significant.

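Calculations related to Table 2:

$$t = \frac{\hat{\beta}}{SE(\hat{\beta})}$$

Thus, the t statistic of nutrition will be: t = (.29466 / .10285) = 2.86

The 95% confidence interval for nutrition, using the t critical value for 36 error d.f. (about 2.028), will be

= .29466 ± 2.028 (.10285)

= (.086, .503)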
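
The t statistic and confidence interval for nutrition can be reproduced as follows. This sketch assumes a critical value of about 2.028, the two-tailed 5% t value for 36 error d.f.; it is hard-coded rather than looked up from a statistics library, and it is the value that reproduces the reported interval.

```python
# Unstandardized estimate and standard error for nutrition, as quoted above.
beta = 0.29466
std_error = 0.10285

# Assumption: two-tailed 5% critical t value for 36 error d.f. (about 2.028).
t_critical = 2.028

# t statistic: estimate divided by its standard error.
t_stat = beta / std_error

# 95% confidence interval: estimate plus or minus critical value times SE.
margin = t_critical * std_error
ci_lower = beta - margin
ci_upper = beta + margin

print(f"t = {t_stat:.3f}")
print(f"95% CI = ({ci_lower:.3f}, {ci_upper:.3f})")
```

Since the interval does not contain zero, it agrees with the t test: the nutrition coefficient is significant at the 5% level.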

Table 3: Model Summary

The above is a model summary table. The R square (coefficient of determination) indicates
that 86.03% of the variance in the dependent variable preference is explained by the
independent variables. The adjusted R square takes into account the sample size and the number
of independent variables in the regression model. This value is usually lower than the R-square
value as it penalizes the addition of independent variables. The adjusted R-square
is 84.86%.
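
The adjusted R square can be recomputed from the R square, the sample size and the number of independent variables. The sketch below uses the standard adjustment formula, 1 - (1 - R²)(n - 1)/(n - k - 1); the function name is only illustrative.

```python
def adjusted_r_square(r_square, n, k):
    """Adjust R square for sample size n and k independent variables."""
    return 1 - (1 - r_square) * (n - 1) / (n - k - 1)


# R square from the sums of squares reported for Table 1.
r_square = 108.3747 / 125.975

adj_r_square = adjusted_r_square(r_square, n=40, k=3)

print(f"R square = {r_square:.4f}")
print(f"Adjusted R square = {adj_r_square:.4f}")
```

With k held fixed, the gap between R square and adjusted R square shrinks as n grows, which is why the penalty matters most in small samples such as this one.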
The Root Mean Square Error (RMSE) measures the difference between the values predicted by a
model and the values actually observed or collected from respondents or firms. It is mainly used
when different regression models or equations are compared: a regression model with a lower
RMSE is preferred to a regression model with a greater RMSE. We do not use it for comparison
here as only one regression model or equation is being tested.
The coefficient of variation is (RMSE / Dependent Mean)*100 = (0.69921/4.725)*100 = 14.798%.
As a rule of thumb, models with Coeff Var values below 10% are considered to predict accurately.
In the above regression model the Coefficient of Variation is more than 10%. This may be
attributable to the limited sample size used.
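
The RMSE and the coefficient of variation follow directly from figures already reported: the RMSE is the square root of the Mean Square Error, and the coefficient of variation expresses it as a percentage of the dependent mean. A minimal sketch:

```python
import math

# Error sum of squares and error d.f. as reported for Table 1.
error_ss = 17.6003
df_error = 36

# Mean of the dependent variable (preference) as quoted above.
dependent_mean = 4.725

# RMSE is the square root of the Mean Square Error.
rmse = math.sqrt(error_ss / df_error)

# Coefficient of variation: RMSE as a percentage of the dependent mean.
coeff_var = rmse / dependent_mean * 100

print(f"RMSE = {rmse:.5f}")
print(f"Coeff Var = {coeff_var:.3f}%")
```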


Figure 1: To Test Normality Assumption of Regression

The above figure is used to check the normality assumption of regression. The bell curve in the
above figure indicates that the residuals are almost normally distributed. However, the kernel
curve indicates that the residuals are slightly positively skewed.
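
The visual impression of skewness can be backed by a number. The sketch below computes the moment-based sample skewness (third central moment divided by the 1.5 power of the second); a positive value means a longer right tail. The residuals listed are hypothetical and for illustration only, since the actual residuals are not reproduced in this handout.

```python
def sample_skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical residuals for illustration only (NOT from the MRP data).
residuals = [-0.8, -0.5, -0.3, -0.2, 0.0, 0.1, 0.3, 1.4]

skew = sample_skewness(residuals)
print(f"skewness = {skew:.3f}")
print("positively skewed" if skew > 0 else "not positively skewed")
```

A value near zero is consistent with a symmetric, bell-shaped distribution of residuals.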
Figure 2: To Test Homoscedasticity Assumption of Regression

The above figure indicates the presence of heteroscedasticity, as most of the residuals are
not clustered around the centre of the horizontal zero line. The assumption of homoscedasticity
is violated.


Figure 3: To Test for Linearity

The above figure indicates that the residuals do not fall exactly on the straight line. Thus
the linearity assumption is not fully satisfied.
