Regression is a statistical technique that establishes a functional relationship between two or more variables in the form of an equation, so that the value of one variable can be estimated from the value of another variable.
Regression Analysis
Simple Linear Regression Model
$$y = \beta_0 + \beta_1 x + \varepsilon$$


Simple Linear Regression Equation
$$y = \beta_0 + \beta_1 x$$

Estimated Simple Linear Regression Equation

$$\hat{y} = b_0 + b_1 x$$




Principle of least squares technique
Case 1:

Graph 1
Observed points  : (4,8); (8,1); (12,6)
Estimated points : (4,6); (8,5); (12,4)

Graph 2
Observed points  : (4,8); (8,1); (12,6)
Estimated points : (4,2); (8,5); (12,8)

Error (graph 1)            Error (graph 2)
8 - 6 = 2                  8 - 2 = 6
1 - 5 = -4                 1 - 5 = -4
6 - 4 = 2                  6 - 8 = -2

Total error = 0            Total error = 0

Absolute error (graph 1)   Absolute error (graph 2)
|8 - 6| = 2                |8 - 2| = 6
|1 - 5| = 4                |1 - 5| = 4
|6 - 4| = 2                |6 - 8| = 2

Total absolute error = 8   Total absolute error = 12
Case 2:

Graph 1
Observed points  : (2,4); (6,7); (10,2)
Estimated points : (2,4); (6,3); (10,2)

Graph 2
Observed points  : (2,4); (6,7); (10,2)
Estimated points : (2,5); (6,4); (10,3)

Absolute error (graph 1)   Absolute error (graph 2)
|4 - 4| = 0                |4 - 5| = 1
|7 - 3| = 4                |7 - 4| = 3
|2 - 2| = 0                |2 - 3| = 1

Total absolute error = 4   Total absolute error = 5

Squared error (graph 1)    Squared error (graph 2)
(4 - 4)^2 = 0              (4 - 5)^2 = 1
(7 - 3)^2 = 16             (7 - 4)^2 = 9
(2 - 2)^2 = 0              (2 - 3)^2 = 1

Sum of squared errors = 16 (Graph 1)

Sum of squared errors = 11 (Graph 2)
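
The comparisons in Case 1 and Case 2 can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name `error_summaries` and the variable names are assumptions, and only the y-values of the points listed above are used.

```python
def error_summaries(observed, estimated):
    """Return (total error, total absolute error, sum of squared errors)."""
    errors = [y - y_hat for y, y_hat in zip(observed, estimated)]
    total = sum(errors)
    total_abs = sum(abs(e) for e in errors)
    total_sq = sum(e * e for e in errors)
    return total, total_abs, total_sq

# y-values of the observed and estimated points, copied from the two cases.
case1_observed = [8, 1, 6]
case1_graph1 = [6, 5, 4]
case1_graph2 = [2, 5, 8]
case2_observed = [4, 7, 2]
case2_graph1 = [4, 3, 2]
case2_graph2 = [5, 4, 3]

case1_g1 = error_summaries(case1_observed, case1_graph1)
case1_g2 = error_summaries(case1_observed, case1_graph2)
case2_g1 = error_summaries(case2_observed, case2_graph1)
case2_g2 = error_summaries(case2_observed, case2_graph2)

for label, result in [("Case 1, graph 1", case1_g1),
                      ("Case 1, graph 2", case1_g2),
                      ("Case 2, graph 1", case2_g1),
                      ("Case 2, graph 2", case2_g2)]:
    print(label, "-> total, absolute, squared:", result)
```

In Case 1 the plain total error is 0 for both lines, so it cannot separate them. In Case 2 the absolute-error total favours graph 1 (4 against 5), while the sum of squared errors favours graph 2 (11 against 16) because squaring penalises the single large miss more heavily.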

Least Squares Method
Least Squares Criterion

$$\min \sum (y_i - \hat{y}_i)^2$$

where:
$y_i$ = observed value of the dependent variable for the ith observation
$\hat{y}_i$ = estimated value of the dependent variable for the ith observation

Slope for the Estimated Regression Equation

$$b_1 = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2}$$

where:
x = value of independent variable for ith observation
y = value of dependent variable for ith observation
n = total number of observations

y-Intercept for the Estimated Regression Equation

$$b_0 = \bar{y} - b_1 \bar{x}$$

where:
$\bar{x}$ = mean value for independent variable
$\bar{y}$ = mean value for dependent variable

Simple Linear Regression

Reed Auto periodically has a special week-long
sale. As part of the advertising campaign Reed
runs one or more television commercials during
the weekend preceding the sale. Data from a
sample of 5 previous sales are shown below.


Number of TV Ads    Number of Cars Sold
       1                    14
       3                    24
       2                    18
       1                    17
       3                    27
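
As a minimal sketch, the least squares slope and intercept can be computed for the Reed Auto sample as follows. The function name `fit_simple_regression` is hypothetical; the formulas in the comments are the standard sum-based expressions for b1 and b0.

```python
def fit_simple_regression(x, y):
    """Return (b0, b1) for the estimated equation y-hat = b0 + b1 * x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    # b1 = [n*Sum(xy) - Sum(x)*Sum(y)] / [n*Sum(x^2) - (Sum(x))^2]
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # b0 = mean(y) - b1 * mean(x)
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1

tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]

b0, b1 = fit_simple_regression(tv_ads, cars_sold)
print(f"y-hat = {b0:.1f} + {b1:.1f} x")  # prints: y-hat = 10.0 + 5.0 x
```

On this sample, each additional TV ad is associated with about 5 more cars sold.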

The HRD manager of a company wants to find a
measure which he can use to fix the monthly
income of persons applying for a job in the
production department. As an experimental
project, he collected data on 7 persons from that
department, recording their years of service and
monthly income (in thousands).

Years of experience   11   7   9   5   8   6  10
Income                10   8   6   5   9   7  11

Find the regression equation of income on
years of service.

What starting income would you recommend for
a person applying for the job after having
served in a similar capacity in another
company for 13 years?

Do you think other factors should be
considered (in addition to the years of
service) in fixing the income? Explain.
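
One possible way to work the first two parts is sketched below. It uses the deviation-from-mean form of the slope, which is algebraically equivalent to the sum-based formula; the variable names are illustrative. Note that 13 years lies outside the observed range of 5 to 11 years, so the prediction is an extrapolation.

```python
years = [11, 7, 9, 5, 8, 6, 10]
income = [10, 8, 6, 5, 9, 7, 11]  # monthly income, in thousands

n = len(years)
mean_x = sum(years) / n
mean_y = sum(income) / n

# Slope = Sum((x - mean_x)(y - mean_y)) / Sum((x - mean_x)^2)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, income))
s_xx = sum((x - mean_x) ** 2 for x in years)
b1 = s_xy / s_xx
b0 = mean_y - b1 * mean_x

# Estimated income for an applicant with 13 years of service.
predicted_income = b0 + b1 * 13

print(f"income-hat = {b0:.2f} + {b1:.2f} * years")
print(f"estimate at 13 years: {predicted_income:.2f} thousand")
```

Under these assumptions the fitted line is income = 2 + 0.75 × years, giving an estimate of 11.75 thousand at 13 years.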
Properties of regression lines and
their coefficients:

1. The correlation coefficient is the
geometric mean of the two
regression coefficients.
2. The sign of the correlation coefficient is
the same as that of the regression
coefficients.
3. Regression coefficients are
independent of a change of origin but
not of a change of scale.
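
These properties can be checked numerically. The sketch below reuses the years-of-service and income figures from the HRD exercise; the helper names `slope` and `correlation` are assumptions made for illustration.

```python
from math import sqrt, copysign

def slope(x, y):
    """Regression coefficient of y on x (deviation form)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    s_xy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    s_xx = sum((a - mx) ** 2 for a in x)
    return s_xy / s_xx

def correlation(x, y):
    """Pearson correlation coefficient between x and y."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    s_xy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    s_xx = sum((a - mx) ** 2 for a in x)
    s_yy = sum((b - my) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)

years = [11, 7, 9, 5, 8, 6, 10]
income = [10, 8, 6, 5, 9, 7, 11]

b_yx = slope(years, income)   # regression coefficient of income on years
b_xy = slope(income, years)   # regression coefficient of years on income
r = correlation(years, income)

# Properties 1 and 2: r is the geometric mean of b_yx and b_xy, with their sign.
geometric_mean = copysign(sqrt(b_yx * b_xy), b_yx)

# Property 3: shifting the origin of x leaves b_yx unchanged; rescaling x does not.
shifted = slope([v + 100 for v in years], income)
scaled = slope([v * 10 for v in years], income)

print(r, geometric_mean, shifted, scaled)
```

For this data both regression coefficients and r equal 0.75; adding 100 to every x leaves the slope at 0.75, while multiplying x by 10 divides it by 10.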
In finance, it is of interest to look at the relationship
between Y, a stock's average return, and X, the
overall market return. The slope coefficient computed
by linear regression is called the stock's beta by
investment analysts. A beta greater than 1 indicates
that the stock is relatively sensitive to changes in the
market; a beta less than 1 indicates that the stock is
relatively insensitive. For the following data, compute
the beta and suggest the market trend.

X (%)   10  12   8  15   9  11   8  10  13  11
Y (%)   11  15   3  18  10  12   6   7  18  13
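
A short sketch of the beta calculation for this data follows, using the sum-based least squares slope. Variable names are illustrative.

```python
market_return = [10, 12, 8, 15, 9, 11, 8, 10, 13, 11]  # X (%)
stock_return = [11, 15, 3, 18, 10, 12, 6, 7, 18, 13]   # Y (%)

n = len(market_return)
sum_x = sum(market_return)
sum_y = sum(stock_return)
sum_xy = sum(x * y for x, y in zip(market_return, stock_return))
sum_x2 = sum(x * x for x in market_return)

# Beta is the least squares slope of Y on X.
beta = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
intercept = sum_y / n - beta * sum_x / n

print(f"beta = {beta:.4f}, intercept = {intercept:.4f}")
```

The slope works out to 919/441, roughly 2.08. Since this is greater than 1, the stock appears relatively sensitive to market movements on this sample.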




Multiple regression Analysis




A linear regression equation with more
than one independent variable is called a
multiple regression model.
The linear regression equation with k independent variables takes the form:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \cdots + \beta_k x_k + \varepsilon$$

where
- $y$ is the value of the dependent variable to be estimated,
- $\beta_0$ is a constant,
- $\beta_1, \beta_2, \ldots, \beta_k$ are the regression coefficients associated with each of the independent variables $x_1, x_2, \ldots, x_k$,
- $\varepsilon$ is the random error due to chance.

Let the fitted linear regression equation be

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k$$

which minimizes the sum of squares of errors

$$\text{SSE} = \sum (y - \hat{y})^2$$

where
- $\hat{y}$ is the estimated value of the dependent variable y,
- $b_1, b_2, b_3, \ldots, b_k$ are the partial regression coefficients, obtained by the principle of least squares technique.




Let us consider the case of two
independent variables and one dependent
variable.

The multiple linear regression model involving two independent variables is:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$$

where
- $y$ is the dependent variable,
- $x_1$ and $x_2$ are the independent variables,
- $\varepsilon$ is the random error due to chance,
- $\beta_0$ is the y-intercept,
- $\beta_1, \beta_2$ are the regression coefficients.

Let the fitted multiple linear regression equation be

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$$

or

$$\hat{y} = b_0 + b_{y1.2}\, x_1 + b_{y2.1}\, x_2$$

where
- $\hat{y}$ is the estimated value of the dependent variable y,
- $x_1, x_2$ are the independent variables,
- $b_0, b_1, b_2$ are the unknown constants, determined by the principle of least squares technique, which minimizes the sum of squares of errors $\text{SSE} = \sum (y - \hat{y})^2$.

By solving the following equations, the values of $b_0$, $b_{y1.2}$, $b_{y2.1}$ can be determined.

$$\sum y = n b_0 + b_{y1.2} \sum x_1 + b_{y2.1} \sum x_2$$

$$\sum x_1 y = b_0 \sum x_1 + b_{y1.2} \sum x_1^2 + b_{y2.1} \sum x_1 x_2$$

$$\sum x_2 y = b_0 \sum x_2 + b_{y1.2} \sum x_1 x_2 + b_{y2.1} \sum x_2^2$$
Let the fitted multiple linear regression equation be

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$$

or

$$\hat{y} = b_0 + b_{y1.2}\, x_1 + b_{y2.1}\, x_2 \qquad (1)$$

$$\bar{y} = b_0 + b_{y1.2}\, \bar{x}_1 + b_{y2.1}\, \bar{x}_2 \qquad (2)$$

(1) - (2):

$$(\hat{y} - \bar{y}) = b_{y1.2}\,(x_1 - \bar{x}_1) + b_{y2.1}\,(x_2 - \bar{x}_2)$$

$$Y = b_{y1.2}\, X_1 + b_{y2.1}\, X_2$$

$$b_{y1.2} = \frac{(\sum X_1 Y)(\sum X_2^2) - (\sum X_2 Y)(\sum X_1 X_2)}{(\sum X_1^2)(\sum X_2^2) - (\sum X_1 X_2)^2}$$

$$b_{y2.1} = \frac{(\sum X_2 Y)(\sum X_1^2) - (\sum X_1 Y)(\sum X_1 X_2)}{(\sum X_1^2)(\sum X_2^2) - (\sum X_1 X_2)^2}$$

where
$X_1 = x_1 - \bar{x}_1$
$X_2 = x_2 - \bar{x}_2$
$Y = y - \bar{y}$





A marketing manager of a company wants
to predict demand for the product. He
strongly believes that demand (Y) is highly
influenced by the annual average price (X1) of
the product (in units) and advertising
expenditure (X2) (Rs in lakh). He has
collected past data to assess the effect of
these factors on demand, given below:

Y 4 6 7 9 13 15
X1 15 12 8 6 4 3
X2 30 24 20 14 10 4
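
A hedged sketch of how the two partial regression coefficients could be computed for this data, using sums of deviations from the means (X1 = x1 - mean, X2 = x2 - mean, Y = y - mean). The function name `fit_two_predictors` is hypothetical.

```python
def fit_two_predictors(x1, x2, y):
    """Return (b0, b_y1.2, b_y2.1) for y-hat = b0 + b_y1.2*x1 + b_y2.1*x2."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Deviations from the means.
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s1y = sum(a * b for a, b in zip(d1, dy))
    s2y = sum(a * b for a, b in zip(d2, dy))
    s11 = sum(a * a for a in d1)
    s22 = sum(a * a for a in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    denom = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / denom
    b2 = (s2y * s11 - s1y * s12) / denom
    # The fitted plane passes through the point of means.
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2

demand = [4, 6, 7, 9, 13, 15]          # Y
price = [15, 12, 8, 6, 4, 3]           # X1
advertising = [30, 24, 20, 14, 10, 4]  # X2

b0, b1, b2 = fit_two_predictors(price, advertising, demand)
print(f"Y-hat = {b0:.4f} + ({b1:.4f}) X1 + ({b2:.4f}) X2")
```

In this sample X1 and X2 are themselves very strongly correlated, so the individual coefficients should be interpreted with caution.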
Ex: Christmas week is a critical period for most ski
resorts. Because many students and adults are
free from other obligations, they are able to
spend several days indulging in their favorite
pastime, skiing. A large proportion of gross
revenue is earned during this period. A ski resort
in Vermont wanted to determine the effect that
weather had on its sales of lift tickets. The
manager of the resort collected data on the
number of lift tickets sold during Christmas week
(y), the total snowfall in inches (x1), and the
average temperature in degrees Fahrenheit (x2)
for the past 10 years. Develop the multiple
regression model.
Tickets Snowfall Temperature
6835 19 11
7870 15 -19
6173 7 36
7979 11 22
7639 19 14
7167 2 -20
8094 21 39
9903 19 27
9788 18 26
9557 20 16
The Federal Reserve is performing a
preliminary study to determine the
relationship between certain economic
indicators and annual percentage change
in the gross national product (GNP). Two
such indicators being examined are the
amount of the federal government's deficit
(in billions of dollars) and the Dow Jones
Industrial Average (the mean value over
the year). Data for 6 years follow:

Change in GNP 2.5 -1.0 4.0 1.0 1.5 3.0
Federal Deficit 100.0 400.0 120.0 200.0 180.0 80.0
Dow Jones 2850 2100 3300 2400 2550 2700
i) Calculate the least squares equation that best
describes the data.

ii) What % change in GNP would be expected in a year
in which the federal deficit was $240 billion and the
mean Dow Jones value was 3000?
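
One way to approach both parts is to build the three normal equations for two independent variables and solve them directly. The sketch below does this with a small Gaussian elimination routine; `solve_linear_system` and `normal_equations` are illustrative helper names, and the solver assumes the system is non-singular.

```python
def solve_linear_system(a, b):
    """Solve a*z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Bring the row with the largest pivot into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - s) / m[i][i]
    return z

def normal_equations(x1, x2, y):
    """Return (lhs, rhs) of the normal equations for y-hat = b0 + b1*x1 + b2*x2."""
    n = len(y)
    s1, s2 = sum(x1), sum(x2)
    s11 = sum(u * u for u in x1)
    s22 = sum(v * v for v in x2)
    s12 = sum(u * v for u, v in zip(x1, x2))
    lhs = [[n, s1, s2], [s1, s11, s12], [s2, s12, s22]]
    rhs = [sum(y),
           sum(u * w for u, w in zip(x1, y)),
           sum(v * w for v, w in zip(x2, y))]
    return lhs, rhs

gnp_change = [2.5, -1.0, 4.0, 1.0, 1.5, 3.0]
deficit = [100.0, 400.0, 120.0, 200.0, 180.0, 80.0]
dow_jones = [2850, 2100, 3300, 2400, 2550, 2700]

lhs, rhs = normal_equations(deficit, dow_jones, gnp_change)
b0, b1, b2 = solve_linear_system(lhs, rhs)

# Part (ii): deficit of $240 billion, mean Dow Jones value of 3000.
predicted = b0 + b1 * 240 + b2 * 3000
print(f"GNP-hat = {b0:.4f} + ({b1:.6f}) deficit + ({b2:.6f}) dow")
print(f"predicted change in GNP: {predicted:.2f}%")
```

With only six observations the fitted equation is a rough guide at best; under these assumptions the predicted change comes out at a little over 2.3%.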


Multiple correlation analysis:

It is a measure of association
between a dependent variable and several
independent variables taken together.

The coefficient of multiple correlation is given
by

$$R_{y.12} = \sqrt{\frac{r_{y1}^2 + r_{y2}^2 - 2\, r_{y1}\, r_{y2}\, r_{12}}{1 - r_{12}^2}}$$

Its value always lies between 0 and 1.
Coefficient of multiple determination:

It is the proportion of the total variation
in the dependent variable y that is
accounted for, or explained, by
the independent variables in the multiple
regression model.

The square of the coefficient of multiple
correlation is called the coefficient of multiple
determination.
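
As an illustration, the coefficient of multiple correlation and its square can be computed from the three pairwise correlation coefficients. The sketch below reuses the demand, price and advertising figures from the marketing manager example; `pearson_r` is an assumed helper name.

```python
from math import sqrt

def pearson_r(a, b):
    """Simple (pairwise) correlation coefficient between a and b."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    s_ab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    s_aa = sum((u - ma) ** 2 for u in a)
    s_bb = sum((v - mb) ** 2 for v in b)
    return s_ab / sqrt(s_aa * s_bb)

y = [4, 6, 7, 9, 13, 15]       # demand
x1 = [15, 12, 8, 6, 4, 3]      # price
x2 = [30, 24, 20, 14, 10, 4]   # advertising expenditure

r_y1 = pearson_r(y, x1)
r_y2 = pearson_r(y, x2)
r_12 = pearson_r(x1, x2)

# Coefficient of multiple determination (R squared) and multiple correlation (R).
r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
multiple_r = sqrt(r_squared)

print(f"R = {multiple_r:.4f}, R^2 = {r_squared:.4f}")
```

For this data R squared is about 0.97, so the two independent variables together account for roughly 97% of the variation in demand in the sample.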
