
SIMPLE LINEAR REGRESSION

Prepared for the presentation at Kuzgun Consulting


by Yeşim OZAN

Istanbul, 12.08.2015

Contents

1. Definition and Purpose of Simple Linear Regression
2. Derivation of Linear Regression Equations
3. Parameter Estimation
4. Hypothesis Test for Parameter Estimation
5. Analysis of Variance Approach to Test the Significance of Regression
6. Confidence Interval on Fitted Values
7. Coefficient of Determination

1. Definition and Purpose of Simple Linear Regression
A technique for determining how one variable of interest is affected by changes in another variable.

Used for three main purposes:

- To describe the linear dependence of one variable on another
- To predict values of one variable from values of another, for which more data are available
- To correct for the linear dependence of one variable on another, in order to clarify other features of its variability
1. Definition and Purpose of Simple Linear Regression (cont'd.)

Regression model:

Y_i = β_0 + β_1 X_i + ε_i

where Y_i is the dependent (response) variable, X_i is the independent (explanatory) variable, and ε_i is the residual (error) term.

1. Definition and Purpose of Simple Linear Regression (cont'd.)

Linear regression determines the best-fit line through a scatter plot of the data, chosen so that the sum of squared residuals (equivalently, the error variance) is minimized. The fit is "best" in precisely that least-squares sense.

2. Derivation of Linear Regression Equations (cont'd.)

Under these assumptions:

- E(ε_i) = 0
- Var(ε_i) = σ²
- ε_1, ε_2, …, ε_n are independent of each other

if E(Y | X = x) = β_0 + β_1 x is a function relating the random variables X and Y, and the observations can be expressed as

Y_i = β_0 + β_1 X_i + ε_i,   i = 1, 2, …, n

then this model is called the linear regression model.
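A standard least-squares sketch of how the regression equations are derived (a reconstruction in the deck's notation, not verbatim from the slides):

```latex
% Minimize the sum of squared errors over (beta_0, beta_1):
S(\beta_0, \beta_1) = \sum_{i=1}^{n} \left( Y_i - \beta_0 - \beta_1 X_i \right)^2

% Setting both partial derivatives to zero yields the normal equations:
\frac{\partial S}{\partial \beta_0} = -2 \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i) = 0, \qquad
\frac{\partial S}{\partial \beta_1} = -2 \sum_{i=1}^{n} X_i (Y_i - \beta_0 - \beta_1 X_i) = 0

% Solving the two equations simultaneously:
\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}, \qquad
\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}
```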

3. Parameter Estimation

With these assumptions:

- ε_i ~ N(0, σ²)
- ε_1, ε_2, …, ε_n are independent of each other

the model parameters are β_0, β_1 ∈ ℝ and σ² ∈ (0, ∞).

Provided that X_i is fixed and Y_i is a random variable for i = 1, 2, …, n,

Y_i ~ N(β_0 + β_1 X_i, σ²)

and the Y_i are independent of each other for i = 1, 2, …, n.



3. Parameter Estimation (cont'd.)

Two methods are used: least squares and maximum likelihood. The maximum likelihood parameter estimators (identical, for β_0 and β_1, to the least-squares estimators derived above) are as follows:

β̂_1 = Σ (X_i − X̄)(Y_i − Ȳ) / Σ (X_i − X̄)²
β̂_0 = Ȳ − β̂_1 X̄

Once β̂_0 and β̂_1 are known, the fitted regression line can be written as:

Ŷ_i = β̂_0 + β̂_1 X_i

and the residuals can be written as:

e_i = Y_i − Ŷ_i
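A minimal numerical sketch of these estimators (the dataset below is made up purely for illustration):

```python
import numpy as np

# Hypothetical sample data, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.2])

x_bar, y_bar = x.mean(), y.mean()

# Least-squares / maximum-likelihood estimates of slope and intercept
beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
beta0_hat = y_bar - beta1_hat * x_bar

# Fitted regression line and residuals
y_hat = beta0_hat + beta1_hat * x
residuals = y - y_hat

print(f"beta0_hat = {beta0_hat:.4f}, beta1_hat = {beta1_hat:.4f}")
```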



4. Hypothesis Test for Parameter Estimation

The hypothesis is:

H_0: β_1 = 0   versus   H_1: β_1 ≠ 0

The test statistic used for this test is:

t_0 = β̂_1 / se(β̂_1),   where se(β̂_1) = √( σ̂² / Σ (X_i − X̄)² ) and σ̂² = Σ e_i² / (n − 2)

The null hypothesis, H_0, is accepted if the calculated value of the test statistic is such that:

|t_0| < t_{α/2, n−2}

A similar procedure can be used to test the hypothesis on the intercept, β_0.
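A sketch of this slope test in code, using scipy's t distribution (the data and significance level are illustrative assumptions, not from the slides):

```python
import numpy as np
from scipy import stats

# Hypothetical data and significance level, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.2])
n, alpha = len(x), 0.05

x_bar = x.mean()
Sxx = np.sum((x - x_bar) ** 2)
beta1_hat = np.sum((x - x_bar) * (y - y.mean())) / Sxx
beta0_hat = y.mean() - beta1_hat * x_bar
residuals = y - (beta0_hat + beta1_hat * x)

# se(beta1_hat) = sqrt(sigma2_hat / Sxx), with sigma2_hat = SSE / (n - 2)
sigma2_hat = np.sum(residuals ** 2) / (n - 2)
t0 = beta1_hat / np.sqrt(sigma2_hat / Sxx)

# Two-sided test of H0: beta1 = 0; accept H0 if |t0| < t_{alpha/2, n-2}
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
print(f"t0 = {t0:.3f}, t_crit = {t_crit:.3f}, reject H0: {abs(t0) > t_crit}")
```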



4. Hypothesis Test for Parameter Estimation (cont'd.)

[Figure: possible scatter plots of Y against X. Plots (a) and (b) represent cases when H_0: β_1 = 0 is not rejected. Plots (c) and (d) represent cases when H_0 is rejected.]

5. Analysis of Variance Approach to Test the Significance of Regression

The analysis of variance (ANOVA) is another method to test for the significance of regression.

A classic ANOVA table:

Source       Degrees of Freedom    Sum of Squares (SS)    Mean Squares (MS)
Regression   1                     SSR                    MSR
Error        n − 2                 SSE                    MSE
Total        n − 1                 SST

Each term included in the above ANOVA table is calculated separately. Such calculations are explained in the following slides.

5.1. Sum of Squares (SS)

SS is the sum of squared deviations of all the observations, Y_i, from their mean, Ȳ. In the context of ANOVA, this quantity is called the total sum of squares (abbreviated SST) because it relates to the total variance of the observations:

SST = Σ (Y_i − Ȳ)²

In a perfect model, the regression model is such that the resulting fitted regression line passes through all of the observations; the deviations of the fitted values, Ŷ_i, from the mean then make up the regression sum of squares (abbreviated SSR). SSR can be calculated using a relationship similar to the one for obtaining SST. Therefore:

SSR = Σ (Ŷ_i − Ȳ)²

5.1. Sum of Squares (SS) (cont'd.)

In a non-perfect model, a certain part of the total variability of the observed data still remains unexplained. This is called the error sum of squares (abbreviated SSE). SSE can be obtained as the sum of squares of the deviations of the observations from their fitted values:

SSE = Σ (Y_i − Ŷ_i)²

The total variability of the observed data (i.e., the total sum of squares, SST) can be written as:

Σ (Y_i − Ȳ)² = Σ (Ŷ_i − Ȳ)² + Σ (Y_i − Ŷ_i)²

The above equation amounts to the following:

SST = SSR + SSE
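A quick numerical check of this partition (same made-up data as in the earlier sketches):

```python
import numpy as np

# Hypothetical data, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.2])

beta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
y_hat = (y.mean() - beta1 * x.mean()) + beta1 * x

sst = np.sum((y - y.mean()) ** 2)      # total sum of squares
ssr = np.sum((y_hat - y.mean()) ** 2)  # regression sum of squares
sse = np.sum((y - y_hat) ** 2)         # error sum of squares

# The partition SST = SSR + SSE holds up to floating-point error
assert np.isclose(sst, ssr + sse)
print(f"SST = {sst:.4f}, SSR + SSE = {ssr + sse:.4f}")
```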



5.1. Sum of Squares (SS) (cont'd.)

[Figure: scatter plots showing the deviations behind the sums of squares used in ANOVA. (a) shows the deviations for SST, (b) the deviations for SSR, and (c) the deviations for SSE.]

5.2. Mean Squares (MS)

Mean squares are obtained by dividing the sum of squares by the respective degrees of freedom. The error mean square, MSE, can be obtained as:

MSE = SSE / (n − 2)

MSE is an estimate of the variance, σ², of the random error term, ε, and can be written as:

σ̂² = MSE = Σ (Y_i − Ŷ_i)² / (n − 2)

Similarly, the regression mean square, MSR, can be obtained by dividing the regression sum of squares by the respective degrees of freedom as follows:

MSR = SSR / 1 = SSR

5.3. F Test

To test the hypothesis H_0: β_1 = 0, the statistic used is based on the F distribution. It can be shown that if the null hypothesis is true, then the statistic:

F_0 = MSR / MSE

follows the F distribution with 1 degree of freedom in the numerator and n − 2 degrees of freedom in the denominator. H_0 is rejected if the calculated statistic, F_0, is such that:

F_0 > F_{α; 1, n−2}

where F_{α; 1, n−2} is the percentile of the F distribution corresponding to a cumulative probability of (1 − α), and α is the significance level.
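A sketch of this F test in code, using scipy's F distribution (illustrative data and significance level, as before):

```python
import numpy as np
from scipy import stats

# Hypothetical data and significance level, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.2])
n, alpha = len(x), 0.05

beta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
y_hat = (y.mean() - beta1 * x.mean()) + beta1 * x

msr = np.sum((y_hat - y.mean()) ** 2) / 1   # MSR = SSR / 1
mse = np.sum((y - y_hat) ** 2) / (n - 2)    # MSE = SSE / (n - 2)
f0 = msr / mse

# Reject H0: beta1 = 0 if F0 exceeds the (1 - alpha) percentile of F(1, n-2)
f_crit = stats.f.ppf(1 - alpha, dfn=1, dfd=n - 2)
print(f"F0 = {f0:.3f}, F_crit = {f_crit:.3f}, reject H0: {f0 > f_crit}")
```

For simple linear regression this F test is equivalent to the two-sided t test on the slope: F_0 = t_0².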

6. Confidence Interval on Fitted Values

A (1 − α) · 100 percent confidence interval on any fitted value, Ŷ_0 at X = X_0, is obtained as follows:

Ŷ_0 ± t_{α/2, n−2} · √( MSE · ( 1/n + (X_0 − X̄)² / Σ (X_i − X̄)² ) )

The width of the confidence interval depends on the value of X_0: it is a minimum at X_0 = X̄ and widens as |X_0 − X̄| increases.
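A sketch of this interval in code (illustrative data; note how the interval is narrowest at the mean of x):

```python
import numpy as np
from scipy import stats

# Hypothetical data and significance level, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.2])
n, alpha = len(x), 0.05

x_bar = x.mean()
Sxx = np.sum((x - x_bar) ** 2)
beta1 = np.sum((x - x_bar) * (y - y.mean())) / Sxx
beta0 = y.mean() - beta1 * x_bar
mse = np.sum((y - (beta0 + beta1 * x)) ** 2) / (n - 2)
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)

def ci_fitted(x0):
    """(1 - alpha)*100 percent confidence interval on the fitted value at x0."""
    y0_hat = beta0 + beta1 * x0
    half_width = t_crit * np.sqrt(mse * (1 / n + (x0 - x_bar) ** 2 / Sxx))
    return y0_hat - half_width, y0_hat + half_width

print(ci_fitted(x_bar))  # narrowest interval, at the mean of x
print(ci_fitted(6.0))    # wider, farther from the mean
```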

7. Coefficient of Determination

The coefficient of determination, R², is a measure of the amount of variability in the data accounted for by the regression model. It is the ratio of the regression sum of squares to the total sum of squares:

R² = SSR / SST = 1 − SSE / SST

R² can take on values between 0 and 1. The value of R² increases as more terms are added to the model, even if the new term does not contribute significantly to the model.
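Computed directly from the sums of squares (illustrative data as before):

```python
import numpy as np

# Hypothetical data, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.2])

beta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
y_hat = (y.mean() - beta1 * x.mean()) + beta1 * x

ssr = np.sum((y_hat - y.mean()) ** 2)
sst = np.sum((y - y.mean()) ** 2)

r_squared = ssr / sst  # equivalently, 1 - SSE / SST
print(f"R^2 = {r_squared:.4f}")
```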

Thank you for your attention.
