
CLASSICAL LINEAR REGRESSION MODEL

INTRODUCTION
The classical linear regression model is a statistical model that describes a
data generation process.
SPECIFICATION
The specification of the classical linear regression model is defined by the
following set of assumptions.
Assumptions
1. The functional form is linear in parameters.
Yt = β1Xt1 + β2Xt2 + … + βkXtk + εt
2. The error term has mean zero.
E(εt) = 0 for t = 1, 2, …, T
3. The error term has constant variance.
Var(εt) = E(εt²) = σ² for t = 1, 2, …, T
4. The errors are uncorrelated.
Cov(εt, εs) = E(εt εs) = 0 for all t ≠ s
5. The error term has a normal distribution.
εt ~ N for t = 1, 2, …, T
6. The error term is uncorrelated with each explanatory variable.
Cov(εt, Xti) = E(εt Xti) = 0 for t = 1, 2, …, T and i = 1, 2, …, K
7. The explanatory variables are nonrandom variables.
Classical Linear Regression Model Concisely Stated
The sample of T multivariate observations (Yt, Xt1, Xt2, …, Xtk) is generated
by a process described as follows.
Yt = β1Xt1 + β2Xt2 + … + βkXtk + εt
εt ~ N(0, σ²)

for t = 1, 2, …, T

or alternatively,
Yt ~ N(β1Xt1 + β2Xt2 + … + βkXtk, σ²) for t = 1, 2, …, T
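As a concrete illustration, the following Python/NumPy sketch simulates one sample from this data generation process. The sample size, the two explanatory variables, and the values chosen for β and σ are illustrative assumptions, not part of the model.

    import numpy as np

    rng = np.random.default_rng(0)

    T = 100                              # number of observations (assumed for illustration)
    beta = np.array([2.0, 0.5, -1.0])    # illustrative parameter values: beta1 (constant), beta2, beta3
    sigma = 1.5                          # illustrative error standard deviation

    # Data matrix X: first column of 1s for the constant term, plus K-1 explanatory variables
    X = np.column_stack([np.ones(T), rng.normal(size=T), rng.normal(size=T)])

    # Classical linear regression DGP: Y_t = beta1*X_t1 + ... + betaK*X_tK + eps_t, eps_t ~ N(0, sigma^2)
    eps = rng.normal(loc=0.0, scale=sigma, size=T)
    y = X @ beta + eps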
Classical Linear Regression Model in Matrix Format
The sample of T multivariate observations (Yt, Xt1, Xt2, …, Xtk) is generated
by a process described by the following system of T equations.

Observation 1
Y1 = β1X11 + β2X12 + … + βkX1k + ε1
Observation 2
Y2 = β1X21 + β2X22 + … + βkX2k + ε2
⋮
Observation T
YT = β1XT1 + β2XT2 + … + βkXTk + εT
Note the following. 1) There is one equation for each multivariate
observation. 2) The parameters are constants, and therefore have the same
value for each multivariate observation. 3) The system of T equations can
be written equivalently in matrix format as follows.
y = Xβ + ε
y is a Tx1 column vector of observations on the dependent variable. X is a
TxK matrix of observations on the K-1 explanatory variables X2, X3, …, Xk. The
first column of the matrix X is a column of 1s representing the constant
(intercept) term. The matrix X is called the data matrix or the design matrix.
β is a Kx1 column vector of parameters β1, β2, …, βk. ε is a Tx1 column vector
of disturbances (errors).
Assumptions in Matrix Format
1. The functional form is linear in parameters.
y = Xβ + ε
2. The mean vector of disturbances is a Tx1 null vector.
E(ε) = 0
3. The disturbances are spherical. (The variance-covariance matrix of the
disturbances is a TxT scalar matrix: constant variance σ² on the diagonal and
zero covariances off the diagonal.)
Cov(ε) = E(εεT) = σ²I
Where superscript T denotes transpose and I is a TxT identity matrix.
4. The disturbance vector has a multivariate normal distribution.
ε ~ N
5. The disturbance vector is uncorrelated with the data matrix.
Cov(ε, X) = 0
6. The data matrix is a nonstochastic matrix.
Classical Linear Regression Model Concisely Stated in Matrix Format
The sample of T multivariate observations (Yt, Xt1, Xt2, …, Xtk) is generated
by a process described as follows.

y = Xβ + ε,  ε ~ N(0, σ²I)

or alternatively,

y ~ N(Xβ, σ²I)
ESTIMATION
For the classical linear regression model, there are K+1 parameters to
estimate: K regression coefficients β1, β2, …, βk, and the error variance
(conditional variance of Y) σ².
Choosing an Estimator for β1, β2, …, βk
To obtain estimates of the parameters, you need to choose an estimator. To
choose an estimator, you choose an estimation procedure. You then apply
the estimation procedure to your statistical model. This yields an estimator.
In econometrics, the estimation procedures used most often are:
1. Least squares estimation procedure
2. Maximum likelihood estimation procedure
Least Squares Estimation Procedure
When you apply the least squares estimation procedure to the classical
linear regression model, you get the ordinary least squares (OLS) estimator.
The least squares estimation procedure tells you to choose as your estimates
of the unknown parameters those values that minimize the residual sum of
squares function for the sample of data. For the classical linear regression
model, the residual sum of squares function is
RSS(β̂1, β̂2, …, β̂k) = Σt (Yt - β̂1 - β̂2Xt2 - … - β̂kXtk)²
Or in matrix format,
RSS(β̂) = (y - Xβ̂)T(y - Xβ̂)
The first-order necessary conditions for a minimum are
XTXβ̂ = XTy
These are called the normal equations. If the inverse of the KxK matrix XTX
exists, then you can find the
solution vector β̂. The solution vector is given by
β̂ = (XTX)-1XTy
where β̂ is a Kx1 column vector of estimates for the K parameters of the
model. This formula is the OLS estimator. It is a rule that tells you how to
use the sample of data to obtain estimates of the population parameters.
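A minimal sketch of this formula in Python/NumPy, assuming a data matrix X and response vector y such as those produced in the simulation sketch above. It solves the normal equations directly rather than forming the inverse of XTX, which is equivalent when the inverse exists and is the numerically preferred route.

    import numpy as np

    def ols_estimator(X, y):
        # Solve the normal equations (X'X) beta_hat = X'y for beta_hat.
        # Equivalent to beta_hat = (X'X)^(-1) X'y when X'X is invertible.
        return np.linalg.solve(X.T @ X, X.T @ y)

    # Usage (X and y as in the simulation sketch above):
    # beta_hat = ols_estimator(X, y)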

Maximum Likelihood Estimation Procedure


When you apply the maximum likelihood estimation procedure to the
classical linear regression model, you get the maximum likelihood estimator.
The maximum likelihood estimation procedure tells you to choose as your
estimates of the unknown parameters those values that maximize the
likelihood function for the sample of data. For the classical linear regression
model, the maximum likelihood estimator is
β̂ = (XTX)-1XTy
Thus, for the classical linear regression model the maximum likelihood
estimator is the same as the OLS estimator.
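The equivalence can be checked numerically. The sketch below, which assumes SciPy is available and reuses the simulated X and y from the earlier sketch, minimizes the negative Gaussian log-likelihood and compares the resulting coefficient estimates with the OLS formula; the parameterization in terms of log σ is an implementation convenience, not part of the model.

    import numpy as np
    from scipy.optimize import minimize

    def neg_log_likelihood(params, X, y):
        # params = (beta_1, ..., beta_K, log_sigma); Gaussian likelihood of the CLRM
        beta, log_sigma = params[:-1], params[-1]
        sigma2 = np.exp(2 * log_sigma)
        resid = y - X @ beta
        T = len(y)
        return 0.5 * T * np.log(2 * np.pi * sigma2) + 0.5 * resid @ resid / sigma2

    def ml_estimator(X, y):
        K = X.shape[1]
        start = np.zeros(K + 1)              # arbitrary starting values
        result = minimize(neg_log_likelihood, start, args=(X, y), method="BFGS")
        return result.x[:K]                  # ML estimates of the regression coefficients

    # The ML estimates agree with OLS up to numerical tolerance:
    # np.allclose(ml_estimator(X, y), np.linalg.solve(X.T @ X, X.T @ y), atol=1e-4)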
Choosing an Estimator for σ²
To obtain an estimate of the error variance, the preferred estimator is

σ̂² = RSS / (T - K) = (y - Xβ̂)T(y - Xβ̂) / (T - K)
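A minimal sketch, assuming the X, y, and β̂ arrays from the earlier OLS sketch:

    import numpy as np

    def error_variance_estimate(X, y, beta_hat):
        # sigma2_hat = RSS / (T - K), where RSS = (y - X beta_hat)'(y - X beta_hat)
        resid = y - X @ beta_hat
        T, K = X.shape
        return (resid @ resid) / (T - K)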

Properties of the OLS Estimator


1. Linear Estimator
The OLS estimator is a linear estimator.
2. Sampling Distribution of the OLS Estimator
The OLS estimator has a multivariate normal sampling distribution.
3. Mean of the OLS Estimator
The mean vector of the OLS estimator gives the mean of the sampling
distribution of the estimator for each of the K parameters. To derive the
mean vector of the OLS estimator, you need to make two assumptions:
1. The error term has mean zero.
2. The error term is uncorrelated with each explanatory variable.
If these two assumptions are satisfied, then it can be shown that the mean
vector of the OLS estimator is
E(β̂) = β

That is, the mean vector of the OLS estimator is equal to the true values of
the population parameters being estimated. This tells us that for the
classical linear regression model the OLS estimator is an unbiased estimator.
4. Variance-Covariance Matrix of Estimates
The variance-covariance matrix of estimates gives the variances and
covariances of the sampling distributions of the estimators of the K
parameters. To derive the variance-covariance matrix of estimates, you
need to make four assumptions:
1. The error term has mean zero.
2. The error term is uncorrelated with each explanatory variable
3. The error term has constant variance.
4. The errors are uncorrelated.
If these four assumptions are satisfied, then it can be shown that the
variance-covariance matrix of estimates is
Cov(β̂) = σ²(XTX)-1
For the classical linear regression model, it can be shown that each variance
(diagonal element) in the variance-covariance matrix of OLS estimates is less
than or equal to the corresponding variance in the variance-covariance matrix
of any alternative linear unbiased estimator; therefore, for the classical
linear regression model the OLS estimator is an efficient estimator.
5. Sampling Distribution of the OLS Estimator Written Concisely
β̂ ~ N(β, σ²(XTX)-1)
The OLS estimator has a multivariate normal distribution with mean vector β
and variance-covariance matrix σ²(XTX)-1.
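These properties can be illustrated by simulation. The sketch below reuses the illustrative design matrix and parameter values assumed in the earlier simulation sketch, draws repeated samples from the model, and compares the average of the OLS estimates with β and their empirical covariance with σ²(XTX)-1.

    import numpy as np

    rng = np.random.default_rng(1)

    T = 100
    beta = np.array([2.0, 0.5, -1.0])     # illustrative true parameters
    sigma = 1.5                           # illustrative error standard deviation
    X = np.column_stack([np.ones(T), rng.normal(size=T), rng.normal(size=T)])  # fixed (nonrandom) design

    n_reps = 5000
    estimates = np.empty((n_reps, len(beta)))
    for r in range(n_reps):
        y = X @ beta + rng.normal(scale=sigma, size=T)
        estimates[r] = np.linalg.solve(X.T @ X, X.T @ y)

    print(estimates.mean(axis=0))              # close to beta (unbiasedness)
    print(np.cov(estimates, rowvar=False))     # close to sigma^2 (X'X)^(-1)
    print(sigma**2 * np.linalg.inv(X.T @ X))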
Summary of Small Sample Properties
Gauss-Markov Theorem - For the classical linear regression model, the OLS
estimator is the best linear unbiased estimator (BLUE) of the population
parameters.
Summary of Large Sample Properties
For the classical linear regression model, the OLS estimator is asymptotically
unbiased, consistent, and asymptotically efficient.
Estimating the Variance-Covariance Matrix of Estimates

The true variance-covariance matrix of estimates, σ²(XTX)-1, is unknown. This
is because the true error variance σ² is unknown. Therefore, the variance-covariance
matrix of estimates must be estimated using the sample of data.
To obtain an estimate of the variance-covariance matrix, you replace σ² with
its estimate σ̂² = RSS / (T - K). This yields the estimated variance-covariance
matrix of estimates
Côv(β̂) = σ̂²(XTX)-1
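A minimal sketch, assuming X, y, and the OLS β̂ from the earlier sketches; the square roots of the diagonal elements of this matrix are the usual standard errors of the coefficient estimates.

    import numpy as np

    def estimated_cov(X, y, beta_hat):
        # Cov_hat(beta_hat) = sigma2_hat * (X'X)^(-1), with sigma2_hat = RSS / (T - K)
        resid = y - X @ beta_hat
        T, K = X.shape
        sigma2_hat = (resid @ resid) / (T - K)
        return sigma2_hat * np.linalg.inv(X.T @ X)

    # Standard errors of the K coefficient estimates:
    # std_errors = np.sqrt(np.diag(estimated_cov(X, y, beta_hat)))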
HYPOTHESIS TESTING
The following statistical tests can be used to test hypotheses in the classical
linear regression model.
1. t-test
2. F-test
3. Likelihood ratio test
4. Wald test
5. Lagrange multiplier test
You must choose the appropriate test to test the hypothesis in which you are
interested.
GOODNESS-OF-FIT
If our objective is to use the explanatory variable(s) to predict the dependent
variable, then we should measure the goodness of fit of the model. Goodness-of-fit
refers to how well the model fits the sample data. The better the model fits the
data, the higher the predictive validity of the model, and therefore the better values
of X should predict values of Y. The statistical measure that is used most often to
measure the accuracy of a classical linear regression model is the R-squared (R2)
statistic.
R-Squared Statistic
The coefficient of determination measures the proportion of the variation in the
dependent variable that is explained by the variation in the explanatory variables.
It can take any value between 0 and 1. If the R2 statistic is equal to zero, then the
explanatory variables explain none of the variation in the dependent variable. If the
R2 is equal to one, then the explanatory variables explain all of the variation in the
dependent variable. The R2 statistic is a measure of goodness of fit. This is because
it measures how well the sample regression line fits the data. If the R2 is equal to
one, then all of the data points lie on the sample regression line. If the R2 is equal
to zero, then the data points are scattered around a regression line that is
horizontal. The higher the R2 statistic, the better the explanatory variables
explain the dependent variable, and using this criterion the better the model.
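One standard way to compute R2 for a model that includes a constant term is one minus the residual sum of squares over the total sum of squares. A minimal sketch, assuming the fitted values X @ beta_hat from the earlier OLS sketch:

    import numpy as np

    def r_squared(y, y_hat):
        # R^2 = 1 - RSS / TSS for a model that includes a constant term
        rss = np.sum((y - y_hat) ** 2)
        tss = np.sum((y - y.mean()) ** 2)
        return 1.0 - rss / tss

    # Usage: r2 = r_squared(y, X @ beta_hat)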

Important Points About the Coefficient of Determination


1. The R2 can take any value between zero and one.
2. The closer the data points to the sample regression line, the better the line fits
the data. The better the line fits the data, the lower the residual sum of squares
and the higher the R2 statistic.
3. If R2 =1, then all the data points lie on the sample regression line.
4. If R2 = 0, then the regression line is horizontal at the sample mean of Y. In this
case, the simple mean of Y predicts Y as well as the conditional mean of Y.
5. The R2 statistic can be computed by finding the correlation coefficient between
the actual values (Yt) and the corresponding estimated values (Ŷt), and
squaring this correlation coefficient. This is true regardless of the number of
explanatory variables in the model (see the sketch after this list).
6. The OLS estimator fits a line to the data that minimizes the residual sum of
squares. By doing this, the OLS estimator fits a line to the data that minimizes
the unexplained variation in Y, and therefore maximizes the explained variation
in Y. Thus, the OLS estimator fits a line to the data that maximizes the R2
statistic.
7. There is no rule-of-thumb to decide whether the R2 statistic is high or low. When
a model is estimated with time-series data, the R2 statistic is usually high. This
is because with time-series data, the variables tend to have underlying trends
that make them highly correlated. When a model is estimated with cross-section
data, the R2 statistic is usually lower. Therefore, an R2 statistic of 0.5 may be
considered relatively low for a model estimated with time-series data, and
relatively high for a model estimated with cross-section data.
8. Peter Kennedy says, "In general econometricians are interested in obtaining
good parameter estimates, where good is not defined in terms of R2.
Consequently, the measure of R2 is not of much importance in econometrics.
Unfortunately, many practitioners act as though it is important, for reasons that
are not entirely clear." Cramer states, "Measures of goodness of fit have a fatal
attraction. Although it is generally conceded among insiders that they do not
mean a thing, high values are still a source of pride and satisfaction to their
authors, however hard they may try to conceal these feelings."
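The squared-correlation property mentioned in point 5 can be checked directly. A sketch, assuming y and the fitted values y_hat = X @ beta_hat from the earlier sketches:

    import numpy as np

    def r_squared_two_ways(y, y_hat):
        # Two equivalent computations of R^2 for a model with a constant term
        r2_from_rss = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
        r2_from_corr = np.corrcoef(y, y_hat)[0, 1] ** 2
        return r2_from_rss, r2_from_corr

    # The two returned values agree (up to floating-point error)
    # when y_hat comes from an OLS fit that includes an intercept.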

Major Shortcoming of the R-Squared Statistic


The R2 statistic has a major deficiency. When you add additional independent
variables to the model, the R2 cannot decrease and will most likely increase. Thus, it
may be tempting to engage in a fishing expedition to increase the R2.
Adjusted R-Squared Statistic
To penalize the fishing expedition for variables that increase the R2, economists
most often use a measure called the adjusted R2. The adjusted R2 is the R2 statistic
adjusted for degrees of freedom. The following points should be noted about the
Adjusted R2 statistic.
1. Adding an additional variable to the model can either increase or decrease
the Adjusted R2.
2. The Adjusted R2 statistic can never be larger than the R2 statistic for the
same model.

3. The Adjusted R2 statistic can be negative. A negative Adjusted R2 statistic


indicates that the statistical model does not adequately describe the
economic data generation process.
4. If the t-statistic for a coefficient is one or greater, then dropping the
variable from the model will decrease the adjusted R2 statistic. If the
t-statistic is less than one, then dropping the variable from the model will
increase the adjusted R2.
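A minimal sketch of the degrees-of-freedom adjustment, using the standard formula adjusted R2 = 1 - (1 - R2)(T - 1)/(T - K) for a model with T observations and K estimated coefficients (including the constant):

    import numpy as np

    def adjusted_r_squared(y, y_hat, K):
        # K = number of estimated coefficients, including the constant term
        T = len(y)
        r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
        return 1 - (1 - r2) * (T - 1) / (T - K)

    # Adding a useless regressor raises K and can push the adjusted R^2 down
    # even when the ordinary R^2 creeps up.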
R2 Statistic When the Model Does Not Include a Constant
If the statistical model does not include a constant, then the R2 does not measure
the proportion of the variation in the dependent variable explained by the
independent variable, and therefore should not be used.
PREDICTION
Oftentimes, the objective of an empirical study is to make predictions about the
dependent variable. This is also called forecasting. In general, the better the
model fits the sample data, the better it will predict the dependent variable.
Said another way, the larger the amount of variation in the dependent variable that
the model explains, the better it will predict the dependent variable. Thus, if your
objective is prediction, then you would place more emphasis on the R2 statistic. This
is because the higher the R2 statistic, the greater the predictive ability of the model
over the sample observations.
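Once β̂ has been estimated, prediction amounts to applying the fitted equation to new values of the explanatory variables. A sketch, assuming β̂ from the earlier OLS sketch; X_new is a hypothetical matrix of new observations laid out like the original data matrix, including the leading column of 1s.

    import numpy as np

    def predict(X_new, beta_hat):
        # Predicted values: Y_hat = X_new @ beta_hat
        return X_new @ beta_hat

    # Example with a single hypothetical new observation (constant term plus two regressors):
    # y_hat_new = predict(np.array([[1.0, 0.3, -0.7]]), beta_hat)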
