Application of Multiple Regression Models in Business Research
Dr Prabir K. Das
Indian Institute of Foreign Trade

The Question
Reliable Motors, Inc., a manufacturer and
marketer of electric motors, would like to
build a predictive model consisting of
several variables, to predict sales. Past
data on sales and six different variables,
namely, market potential in the territory (in
Rs. Lakh), number of dealers of the
company in the territory, number of sales
persons in the territory,

The Question
index of competitor activity in the territory
on a five-point scale (1 = lowest, 5 = highest
level of activity by competitors), number of
service people in the territory, and number
of existing customers in the territory are
available. It is believed that these variables
along with other variables influence sales.
How to develop a predictive model?

Learning Objectives
To develop a multiple linear regression
model.
To understand the assumptions underlying
the development of a multiple linear
regression model.
To understand the usefulness of residual
analysis.
To note some cautionary comments.

Introduction
Regression analysis is the statistical
methodology for predicting values of one or
more response (dependent) variables from a
collection of predictor (independent) variable
values.
It can also be used for assessing the effects
of the predictor variables on the responses.
The name "regression" in no way reflects
either the importance or the breadth of
application of this methodology.

Simple Linear Regression


It is one of the most widely used techniques
of statistics in Business Research.
Given n observations on the dependent variable
and the corresponding n observations on k
independent variables, it is possible to
develop a linear regression model that
provides a statistical relationship between
the dependent variable and the independent
variable(s).

The Model
The following regression model can be
hypothesized for the population:
$$Y = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_k X_k + u$$
Where,
Y = Dependent variable
X1 to Xk = k independent variables
b0 = a model parameter that represents the
mean value of the dependent variable (Y)
when the value of every independent variable
X1 to Xk is zero (it is also called the Y intercept)

The Model
b1 to bk = k parameters (partial regression or
partial slope coefficients); each measures the
change in the mean value of the dependent
variable associated with a one-unit change in
the value of the corresponding independent
variable, with the other variables held constant.
The parameter b1 gives the direct or net
effect of a unit change in X1 on the mean
value of Y, net of any effect that X2, ..., Xk may
have on mean Y.

The Model
u = an error (disturbance or
uncertainty) term that describes the
effects on Y of all variables other than
the already selected X variables.
The uncertainty/error term is central to
the model.

Assumptions About the Error Term


The mean of the error term is zero.
The variance of the error term is a
constant and is independent of the
values of X - homoskedasticity; if the
variance of the error term is unequal, it
is known as heteroskedasticity.

Assumptions About the Error Term


The values of the independent variable
X are fixed.
Errors are independently and identically
normally distributed.

Estimation of Parameters of
the Model
Parameters of the model are estimated
using the least squares technique.

Least Squares Technique


The least squares principle states that
b0, b1, ..., bk are selected to minimize the
sum of squared residuals.
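
The least squares principle can be illustrated with a short sketch. It assumes NumPy is available; the helper name `fit_ols` and the simulated data (true values b0 = 3, b1 = 2, b2 = -1.5) are made up for illustration and are not part of the original material.

```python
import numpy as np

def fit_ols(X, y):
    """Return least squares estimates [b0, b1, ..., bk]."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so that the intercept b0 is estimated too.
    design = np.column_stack([np.ones(len(y)), X])
    # lstsq minimizes the sum of squared residuals ||y - design @ b||^2.
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Simulated (hypothetical) data: Y = 3 + 2*X1 - 1.5*X2 + u
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))
u = rng.normal(scale=0.5, size=200)          # error term
y = 3.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + u

b = fit_ols(X, y)
print(np.round(b, 2))  # estimates should lie close to 3, 2 and -1.5
```

Because the data contain an error term, the estimates are close to, but not exactly equal to, the values used in the simulation.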

Important properties
The regression line passes through the
point of means.
The residuals have zero covariance
with the sample X values and also with
the predicted Y values.

Important properties
The total variation in Y may be
expressed as the sum of just two
components, the variation explained
by the linear regression and the
variation unexplained by the
regression.
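
These properties can be checked numerically. The sketch below assumes NumPy and uses simulated data with arbitrary coefficients; it fits a model with an intercept and verifies each property up to floating-point error.

```python
import numpy as np

# Simulated (hypothetical) data with three independent variables.
rng = np.random.default_rng(1)
n = 40
X = rng.normal(size=(n, 3))
y = 1.0 + X @ np.array([0.5, -2.0, 1.5]) + rng.normal(size=n)

design = np.column_stack([np.ones(n), X])      # intercept column + X
b, *_ = np.linalg.lstsq(design, y, rcond=None)
y_hat = design @ b                             # predicted Y
resid = y - y_hat                              # residuals

# 1. The fitted surface passes through the point of means.
passes_through_means = np.isclose(b[0] + X.mean(axis=0) @ b[1:], y.mean())

# 2. Residuals have zero covariance with each X and with predicted Y.
cov_with_x = (resid - resid.mean()) @ (X - X.mean(axis=0)) / (n - 1)
cov_with_yhat = np.cov(resid, y_hat)[0, 1]

# 3. Total variation = explained variation + unexplained variation.
total_ss = np.sum((y - y.mean()) ** 2)
reg_ss = np.sum((y_hat - y.mean()) ** 2)
error_ss = np.sum(resid ** 2)

print(passes_through_means, np.allclose(cov_with_x, 0.0, atol=1e-8),
      np.isclose(total_ss, reg_ss + error_ss))
```

All three checks depend on the model containing an intercept term; without b0 they need not hold.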

Multiple R
Multiple R is the correlation coefficient
between the observed Y and the predicted Y.

R2
The measure of a regression model's
ability to predict is called the
coefficient of determination (R2).
It is the ratio of the explained variation
to the total variation.

Coefficient of Determination

R-square = (Reg. SS/Total SS)


Or
R-square = 1 - (Error SS/Total SS)
SS = sum of squares
It is equivalent to the square of Multiple R
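
As a minimal sketch, assuming NumPy and simulated data, the two expressions for R-square and the square of Multiple R can be compared directly. The variable names are illustrative only.

```python
import numpy as np

# Simulated (hypothetical) data with two independent variables.
rng = np.random.default_rng(2)
n = 30
X = rng.normal(size=(n, 2))
y = 4.0 + X @ np.array([1.0, -1.0]) + rng.normal(size=n)

design = np.column_stack([np.ones(n), X])
b, *_ = np.linalg.lstsq(design, y, rcond=None)
y_hat = design @ b

total_ss = np.sum((y - y.mean()) ** 2)     # Total SS
reg_ss = np.sum((y_hat - y.mean()) ** 2)   # Reg. SS
error_ss = np.sum((y - y_hat) ** 2)        # Error SS

r2_from_reg = reg_ss / total_ss            # Reg. SS / Total SS
r2_from_error = 1 - error_ss / total_ss    # 1 - Error SS / Total SS
multiple_r = np.corrcoef(y, y_hat)[0, 1]   # correlation of Y and predicted Y

print(round(r2_from_reg, 4), round(r2_from_error, 4), round(multiple_r ** 2, 4))
```

The three printed values agree up to rounding when the model includes an intercept.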

R2 (Contd.)
Range 0 to 1
Interpretation: in percentage terms, x%
of the total variability present in the
dependent variable is explained by the
regression model.

Adjusted R-square

$$R^2_{adj} = 1 - \frac{\text{ErrorSS} / (n - (k + 1))}{\text{TotalSS} / (n - 1)}$$
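
A small worked example may help. The function below is a sketch of the adjusted R-square computation, taking the error and total sums of squares, the number of observations n, and the number of independent variables k. The function name and the input numbers are hypothetical.

```python
def adjusted_r_square(error_ss, total_ss, n, k):
    """Adjusted R-square: 1 - [ErrorSS / (n - (k + 1))] / [TotalSS / (n - 1)]."""
    if n <= k + 1:
        raise ValueError("need more observations than estimated parameters")
    error_ms = error_ss / (n - (k + 1))   # error SS per error degree of freedom
    total_ms = total_ss / (n - 1)         # total SS per total degree of freedom
    return 1 - error_ms / total_ms

# Hypothetical numbers: ErrorSS = 20, TotalSS = 100, n = 25, k = 4
value = adjusted_r_square(error_ss=20.0, total_ss=100.0, n=25, k=4)
print(round(value, 2))  # 0.76, compared with an unadjusted R-square of 0.80
```

The adjustment penalizes additional independent variables, so the adjusted value never exceeds the unadjusted R-square.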

Cautionary Comments
Prediction using extreme values of the
independent variables (beyond the range
of the observed X values) can be risky.
The linearity assumption may be
appropriate for only a limited range of
the independent variables.
A random sample provides no
information about extreme values of the
independent variables.
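
One simple safeguard, sketched below in plain Python, is to flag any new observation whose X values fall outside the range seen in the sample before predicting for it. The function name and the data are hypothetical.

```python
def outside_observed_range(train_rows, new_row):
    """Return the positions of predictors in new_row that fall outside
    the minimum-to-maximum range observed in train_rows."""
    columns = list(zip(*train_rows))  # one tuple of observed values per predictor
    return [j for j, (col, value) in enumerate(zip(columns, new_row))
            if value < min(col) or value > max(col)]

# Hypothetical sample: (number of dealers, number of sales persons)
sample = [(10, 2), (20, 3), (30, 5)]

print(outside_observed_range(sample, (25, 6)))  # [1]: second predictor too large
print(outside_observed_range(sample, (15, 4)))  # []: within the observed ranges
```

This check looks at each variable separately, so it cannot detect a combination of values that is unusual even though each value lies within its own range.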

Cautionary Comments
The data from the random sample were
obtained under a set of environmental
conditions; if they change, the model
may well be affected.
If the market environment changes, the
model parameters probably will be
affected.

Some Methods of Developing Regression Model Using SPSS

Variable Selection Methods

Enter
A procedure for variable selection in which
all variables in a block are entered in a
single step.

Stepwise
At each step, the independent variable not
in the equation that has the smallest
probability of F is entered, if that
probability is sufficiently small.
Variables already in the regression
equation are removed if their probability of
F becomes sufficiently large. The method
terminates when no more variables are
eligible for inclusion or removal.
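
The stepwise logic can be sketched as follows. This is not SPSS's implementation: it assumes NumPy, and to avoid needing an F distribution it compares partial F values with fixed thresholds instead of the probability of F. At any one step the largest partial F corresponds to the smallest probability of F, because all candidates share the same degrees of freedom. The thresholds 3.84 and 2.71, the function names, and the data are illustrative assumptions.

```python
import numpy as np

def error_ss(X, y, cols):
    """Error sum of squares of the least squares fit on the given columns."""
    design = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    b, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(np.sum((y - design @ b) ** 2))

def partial_f(X, y, cols, j):
    """Partial F of variable j given the other variables in cols."""
    full = error_ss(X, y, cols)
    reduced = error_ss(X, y, [c for c in cols if c != j])
    df_error = len(y) - len(cols) - 1
    return (reduced - full) / (full / df_error)

def stepwise(X, y, f_enter=3.84, f_remove=2.71, max_steps=100):
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    selected = []
    for _ in range(max_steps):
        changed = False
        # Entry: the variable not in the equation with the largest partial F.
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        if candidates:
            scores = {j: partial_f(X, y, selected + [j], j) for j in candidates}
            best = max(scores, key=scores.get)
            if scores[best] > f_enter:
                selected.append(best)
                changed = True
        # Removal: the variable in the equation with the smallest partial F.
        if selected:
            scores = {j: partial_f(X, y, selected, j) for j in selected}
            worst = min(scores, key=scores.get)
            if scores[worst] < f_remove:
                selected.remove(worst)
                changed = True
        if not changed:   # nothing eligible for inclusion or removal
            break
    return sorted(selected)

# Simulated (hypothetical) data: only columns 0 and 2 influence y.
rng = np.random.default_rng(3)
X = rng.normal(size=(100, 4))
y = 2.0 + 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(size=100)
chosen = stepwise(X, y)
print(chosen)  # should include columns 0 and 2
```

Keeping the entry threshold above the removal threshold prevents a variable from being entered and removed in the same step.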

All Possible Regressions


The all-possible-regressions search
procedure computes all possible linear
multiple regression models from the
data, using every non-empty subset of
the variables.
If a data set contains k independent
variables, all possible regressions will
determine 2^k - 1 different models.
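
The count of models can be confirmed by enumerating the subsets. The sketch below uses only the Python standard library; the variable labels are hypothetical names for the six independent variables in the opening question.

```python
from itertools import combinations

def all_subsets(variables):
    """Every non-empty subset of the variables, smallest subsets first."""
    return [subset
            for size in range(1, len(variables) + 1)
            for subset in combinations(variables, size)]

# Hypothetical labels for the six independent variables.
predictors = ["market_potential", "dealers", "sales_persons",
              "competitor_index", "service_people", "customers"]

models = all_subsets(predictors)
print(len(models))  # 63, which equals 2**6 - 1
```

Each subset defines one candidate regression model, so the number of models to fit doubles (plus one) with every additional variable.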

Case Study
In recent years, many US firms have intensified
their efforts to market their products in the Pacific
Rim. Among the major economic powers in that
area are Japan, Hong Kong, and Singapore.
A consortium of US firms that produce raw
materials used in Singapore is interested in
predicting the level of exports from the United
States to Singapore, as well as understanding
the relationship between US exports to
Singapore and certain variables affecting the
economy of that country.

Case Study (Contd.)


Understanding this relationship would allow the
consortium members to time their marketing efforts to
coincide with favourable conditions in the Singapore
economy.
Understanding the relationship would also allow the
exporters to determine whether expansion of exports to
Singapore is feasible.
The Consortium hired the services of Market Research
Inc. to carry out the study.
Market Research Inc. collected monthly data on five
economic variables for the period of January 1989 to
August 1995 from the Monetary Authority of Singapore
(MAS).

Case Study (Contd.)


The variables were US exports to Singapore in
billions of Singapore dollars (the dependent
variable, Exports), money supply figures in
billions of Singapore dollars (variable M1),
minimum Singapore bank lending rate in
percentages (variable Lend), an index of local
prices where the base year is 1974 (variable
Price), and the exchange rate of Singapore
dollars per US dollar (variable Exchange).
(Adapted from Aczel & Sounderpandian, 2006)
How to obtain the best predictive model?

Thank You
