
Assignment of Quantitative Techniques

Questions for Discussion

1. Discuss the use of "R", "R²" and adjusted R² in data analysis. What are they and
why are they used?
2. Discuss the assumptions of multiple regression in detail. What are they and how are
they tested?

Submitted To:
Sir Hafiz H.M. Arshad
Submitted by:
H.M.Basat Feroz
SP-18-RMS-001
Date of Submission:
23/10/2018

Department of Management Sciences


COMSATS UNIVERSITY, SAHIWAL
1. Discuss the use of "R", "R²" and adjusted R² in data analysis. What are they and why are they used?

These statistics describe how well a regression model fits the data. R is the multiple correlation coefficient between the predictors and the outcome. In effect, R is the correlation between the observed values of the outcome and the values predicted by the model. In the example of advertising budget and album sales, when only advertising budget is used as a predictor, R is simply the correlation between advertising and album sales.

R², by contrast, measures how much of the variability in the outcome is accounted for by the predictors: it is the proportion of variance explained by the model. Therefore, if advertising alone accounts for 33.5% of the variance, we can tell that attractiveness and radio play account for an additional 33%. So the inclusion of the two new predictors has explained quite a large amount of the variation in album sales.

The adjusted R² gives us some idea of how well our model generalizes, and ideally its value should be the same as, or very close to, the value of R². In this example the difference for the final model is small (the difference between the values is .665 − .660 = .005, or 0.5%). This shrinkage means that if the model were derived from the population rather than a sample, it would account for approximately 0.5% less variance in the outcome.
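These three statistics can be computed directly. The sketch below, using made-up data with two hypothetical predictors (standing in for advertising budget and radio play), fits an ordinary least squares model and derives R, R², and the adjusted R² from its residuals:

```python
import numpy as np

# Hypothetical data: two predictors and an outcome (e.g. album sales).
rng = np.random.default_rng(0)
n, k = 50, 2                                   # 50 cases, 2 predictors
X = rng.normal(size=(n, k))
y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

# Fit ordinary least squares with an intercept column.
A = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ beta

# R: correlation between observed and model-predicted outcome values.
R = np.corrcoef(y, y_hat)[0, 1]

# R^2: proportion of variance in the outcome explained by the model.
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

# Adjusted R^2 penalizes extra predictors:
# 1 - (1 - R^2) * (n - 1) / (n - k - 1)
adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

print(R, r2, adj_r2)
```

Note that the adjusted R² is always slightly below R², and the gap (the "shrinkage") grows as more predictors are added relative to the sample size.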

2. Discuss the assumptions of multiple regression in detail. What are they and how are they tested?

Multiple linear regression analysis makes several key assumptions:

- Linear relationship
- Multivariate normality
- No or little multicollinearity
- No auto-correlation
- Homoscedasticity

Multiple linear regression needs at least three variables measured on a metric (interval or ratio) scale. A rule of thumb for sample size is that regression analysis requires at least 20 cases per independent variable; in the simplest case of just two independent variables, that requires n > 40. G*Power can also be used to calculate a more exact, appropriate sample size.
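The rule of thumb above is simple arithmetic, sketched here as a small helper (the function name and the default of 20 cases per predictor are illustrative, not from any standard library):

```python
def min_sample_size(n_predictors, cases_per_predictor=20):
    """Rule-of-thumb minimum n: at least 20 cases per independent variable."""
    return n_predictors * cases_per_predictor

# Simplest case from the text: two independent variables.
print(min_sample_size(2))  # requires n > 40
```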
Firstly, multiple linear regression requires the relationship between the independent and dependent variables to be linear. It is also important to check for outliers, since multiple linear regression is sensitive to outlier effects. The linearity assumption is best tested with scatter plots of each predictor against the outcome, which make cases of little or no linearity easy to spot.
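Alongside visual inspection, the Pearson correlation coefficient gives a quick numerical screen for a linear relationship. The sketch below (with simulated data) contrasts a linear relation with a curved one: the correlation is near 1 for the linear case but near 0 for a symmetric curve, even though a strong nonlinear relation exists, which is why a scatter plot should still be examined:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, size=200)
y_linear = 2 * x + rng.normal(scale=0.5, size=200)   # linear relation
y_curved = x ** 2 + rng.normal(scale=0.5, size=200)  # curved relation

# Pearson r is high for the linear relation, but near zero for the
# symmetric curved one despite a strong underlying relationship.
r_linear = np.corrcoef(x, y_linear)[0, 1]
r_curved = np.corrcoef(x, y_curved)[0, 1]
print(r_linear, r_curved)
```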

Secondly, multiple linear regression analysis requires all variables to be normally distributed. This assumption is best checked with a histogram and a fitted normal curve, or with a Q-Q plot. Normality can also be checked with a goodness-of-fit test, e.g. the Kolmogorov–Smirnov test. When the data are not normally distributed, a non-linear transformation, e.g. a log transformation, might fix the issue; however, it can introduce multicollinearity effects.

Thirdly, multiple linear regression assumes that there is little or no multicollinearity in the data. Multicollinearity occurs when the independent variables are highly correlated with each other rather than independent. A second important independence assumption is that the error terms are uncorrelated; that is, the residuals of the dependent variable should be independent of one another and of the independent variables.
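Multicollinearity is commonly quantified with the variance inflation factor (VIF): each predictor is regressed on all the others, and VIF = 1 / (1 − R²) for that auxiliary regression. A common rule of thumb treats VIF above 10 as serious. The sketch below implements this from scratch on simulated data (the function is illustrative, not a library routine):

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of predictor matrix X.

    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j on all the remaining columns (with an intercept).
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    factors = []
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1.0 - resid.var() / y.var()
        factors.append(1.0 / (1.0 - r2))
    return factors

rng = np.random.default_rng(3)
x1 = rng.normal(size=100)
x2 = rng.normal(size=100)                      # independent of x1
x3 = x1 + rng.normal(scale=0.05, size=100)     # nearly a copy of x1

vif_ok = vif(np.column_stack([x1, x2]))        # low VIFs, near 1
vif_bad = vif(np.column_stack([x1, x2, x3]))   # huge VIFs for x1 and x3
print(vif_ok, vif_bad)
```

When a predictor's VIF is large, the usual remedies are dropping one of the collinear variables or combining them into a single composite.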
