
The Linear Regression Model in Stata

Pierre Hoonhout

Abstract
There are two linear models: the first is the classical linear regression model, which is discussed
in chapter 1 of Hayashi. The more general model assumes less, and is discussed in chapter 2 of
Hayashi. The GMM estimator is a further generalisation and is discussed in chapter 3 of Hayashi.
This note explains how to make Stata calculate estimates, restricted estimates, standard errors
and hypothesis tests in practice.

1 Introduction
Econometrics can be defined as a set of tools aimed at obtaining a credible answer to an interesting
question using data. Many interesting questions can be answered by the specification and estima-
tion of a linear model. This course discusses the theory of linear regression models in econometrics,
with applications using Stata.
It is tempting to see a statistical program like Stata as providing answers to questions. This
is not the case. All that Stata can do for you is calculations. The decision about what to calculate is
strictly yours, and requires you to have in-depth knowledge of the econometric arguments involved.
By in-depth knowledge, we mean the following:
1. Understand what you want your calculator (Stata) to compute: how is the model estimated,
and how are standard errors obtained?
2. Understand why you want Stata to calculate these objects. This requires an understanding
of the assumptions that underlie the estimator. The credibility of the assumptions will imply
the credibility of your answer.
3. Understand how to instruct Stata to calculate the objects of interest.
The econometric theory discussed in Hayashi facilitates your understanding of points 1 and 2.
This note focuses more on point 3.

2 The Classical Linear Regression Model


2.1 The OLS-estimates and the Estimated Variance Matrix
The OLS estimator is defined as

$$\hat{\beta}_{OLS} = \arg\min_{\beta} \sum_{i=1}^{n} (y_i - x_i'\beta)^2 .$$

The main results of chapter 1 in Hayashi are:

$$\hat{\beta} = (X'X)^{-1}X'y \qquad (1)$$
$$\widehat{\mathrm{Var}}(\hat{\beta}) = s^2 (X'X)^{-1} \qquad (2)$$

The Stata command regress performs these calculations:

sysuse auto, clear
regress price weight length foreign
matrix list e(V)

(1978 Automobile Data)


Source | SS df MS Number of obs = 74
-------------+------------------------------ F( 3, 70) = 28.39
Model | 348565467 3 116188489 Prob > F = 0.0000
Residual | 286499930 70 4092856.14 R-squared = 0.5489
-------------+------------------------------ Adj R-squared = 0.5295
Total | 635065396 73 8699525.97 Root MSE = 2023.1
------------------------------------------------------------------------------
price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
weight | 5.774712 .9594168 6.02 0.000 3.861215 7.688208
length | -91.37083 32.82833 -2.78 0.007 -156.8449 -25.89679
foreign | 3573.092 639.328 5.59 0.000 2297.992 4848.191
_cons | 4838.021 3742.01 1.29 0.200 -2625.183 12301.22
------------------------------------------------------------------------------
symmetric e(V)[4,4]
weight length foreign _cons
weight .92048064
length -28.944101 1077.6991
foreign 123.04755 753.79727 408740.3
_cons 2623.5996 -115363.17 -634717.43 14002638
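To see that regress is indeed just evaluating (1) and (2), the calculation can be reproduced directly in Mata. The lines below are only an illustrative sketch (they assume the auto data are still in memory); the last line should reproduce the coefficients and standard errors in the table above.

mata:
    y  = st_data(., "price")
    X  = (st_data(., ("weight", "length", "foreign")), J(rows(y), 1, 1))
    XX = X' * X
    b  = invsym(XX) * X' * y                 // equation (1)
    e  = y - X * b
    s2 = sum(e:^2) / (rows(y) - cols(X))     // s^2 with n - k degrees of freedom
    V  = s2 * invsym(XX)                     // equation (2)
    b, sqrt(diagonal(V))                     // coefficients and standard errors
end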

2.2 The Restricted Estimator


If we minimise $SSR(\beta)$ subject to a set of linear restrictions $R\beta - r = 0$, we obtain the restricted
OLS estimator:

$$\hat{\beta}_{ROLS} = \hat{\beta}_{OLS} - (X'X)^{-1}R'\{R(X'X)^{-1}R'\}^{-1}(R\hat{\beta}_{OLS} - r) \qquad (3)$$
$$\hat{\lambda} = \{R(X'X)^{-1}R'\}^{-1}(R\hat{\beta}_{OLS} - r) \qquad (4)$$

The restricted OLS estimator is implemented by the cnsreg command:1

quietly sysuse auto, clear


constraint 1 _b[weight] - _b[length] = 0
constraint 2 _b[_cons] = _b[foreign]
cnsreg price weight length foreign, constraints(1 2)

1 See makecns if you want to add constrained estimation as an option to your own estimation program. See intreg for
more general constrained estimation.

Constrained linear regression Number of obs = 74
F( 2, 72) = 260.85
Prob > F = 0.0000
Root MSE = 2409.8677
( 1) weight - length = 0
( 2) - foreign + _cons = 0
------------------------------------------------------------------------------
price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
weight | 1.564671 .1709444 9.15 0.000 1.2239 1.905443
length | 1.564671 .1709444 9.15 0.000 1.2239 1.905443
foreign | 998.5726 410.6189 2.43 0.018 180.0187 1817.126
_cons | 998.5726 410.6189 2.43 0.018 180.0187 1817.126
------------------------------------------------------------------------------

The same procedure works for most Stata commands that calculate estimators of nonlinear
models. Estimates of the Lagrange multipliers are not provided.
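Since Stata does not report the multipliers, equations (3) and (4) can be evaluated directly in Mata. The sketch below is illustrative only: it assumes the auto data are in memory, uses the parameter order weight, length, foreign, _cons, and takes (4) at face value for the scaling of the multipliers.

mata:
    y   = st_data(., "price")
    X   = (st_data(., ("weight", "length", "foreign")), J(rows(y), 1, 1))
    XXi = invsym(X' * X)
    b   = XXi * X' * y                       // unrestricted OLS
    R   = (1, -1, 0, 0 \ 0, 0, -1, 1)        // weight = length and _cons = foreign
    r   = (0 \ 0)
    M   = invsym(R * XXi * R')
    b_r = b - XXi * R' * M * (R * b - r)     // equation (3): restricted estimator
    lam = M * (R * b - r)                    // equation (4): Lagrange multipliers
    b_r'                                     // should equal the cnsreg coefficients
    lam'                                     // multiplier estimates (not reported by Stata)
end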

3 Linear Regression Model: Large Sample Theory
3.1 The OLS-estimates and the Estimated Variance Matrix
The main results of chapter 2 in Hayashi are:

$$\hat{\beta} = (X'X)^{-1}X'y \qquad (5)$$
$$\widehat{\mathrm{Var}}(\hat{\beta}) = (X'X)^{-1}(X'\hat{D}X)(X'X)^{-1} \qquad (6)$$

where $\hat{D}$ is the diagonal matrix with the squared OLS residuals on its diagonal.

The option vce(robust) ensures that the heteroscedasticity-robust variance matrix is calcu-
lated:

sysuse auto, clear


regress price weight length foreign, vce(robust)
matrix list e(V)

(1978 Automobile Data)


Linear regression Number of obs = 74
F( 3, 70) = 18.20
Prob > F = 0.0000
R-squared = 0.5489
Root MSE = 2023.1
------------------------------------------------------------------------------
| Robust
price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
weight | 5.774712 1.501652 3.85 0.000 2.779761 8.769663
length | -91.37083 48.1774 -1.90 0.062 -187.4576 4.715957
foreign | 3573.092 647.5668 5.52 0.000 2281.561 4864.623
_cons | 4838.021 5043.206 0.96 0.341 -5220.337 14896.38
------------------------------------------------------------------------------
symmetric e(V)[4,4]
weight length foreign _cons
weight 2.2549593
length -69.34847 2321.0615
foreign 146.98055 1904.9374 419342.74
_cons 6346.9345 -232787.87 -934251.52 25433930
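Equation (6) can also be reproduced by hand. The sketch below again assumes the auto data are in memory; note that regress, vce(robust) additionally applies the small-sample factor n/(n-k), which is included here so that the result lines up with the e(V) listed above.

mata:
    y   = st_data(., "price")
    X   = (st_data(., ("weight", "length", "foreign")), J(rows(y), 1, 1))
    n   = rows(y)
    k   = cols(X)
    XXi = invsym(X' * X)
    e   = y - X * (XXi * X' * y)
    XDX = X' * (X :* (e:^2))                 // X' D-hat X, with D-hat = diag(e_i^2)
    V   = (n/(n-k)) * XXi * XDX * XXi        // equation (6), with the n/(n-k) adjustment
    sqrt(diagonal(V))'                       // robust standard errors, as in the table above
end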

3.2 Estimates of $g(\beta)$: The Delta Method Using Stata


Sometimes we are more interested in (a set of) functions of the parameters that we have estimated.
If the function of interest is linear ($g(\beta) = A\beta$), we can use the Stata command lincom to obtain
estimates and standard errors of $A\hat{\beta}$. This command simply uses $\mathrm{Var}[A\hat{\beta}] = A\,\mathrm{Var}[\hat{\beta}]\,A'$.

quietly sysuse auto, clear


regress price weight length foreign
lincom _b[weight] - _b[length]
// or: lincom weight - length

Source | SS df MS Number of obs = 74
-------------+------------------------------ F( 3, 70) = 28.39
Model | 348565467 3 116188489 Prob > F = 0.0000
Residual | 286499930 70 4092856.14 R-squared = 0.5489
-------------+------------------------------ Adj R-squared = 0.5295
Total | 635065396 73 8699525.97 Root MSE = 2023.1
------------------------------------------------------------------------------
price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
weight | 5.774712 .9594168 6.02 0.000 3.861215 7.688208
length | -91.37083 32.82833 -2.78 0.007 -156.8449 -25.89679
foreign | 3573.092 639.328 5.59 0.000 2297.992 4848.191
_cons | 4838.021 3742.01 1.29 0.200 -2625.183 12301.22
------------------------------------------------------------------------------
( 1) weight - length = 0
------------------------------------------------------------------------------
price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | 97.14554 33.71213 2.88 0.005 29.90882 164.3823
------------------------------------------------------------------------------

The lincom command can only handle one linear combination at a time. If you want estimates for
a set of linear combinations, you can use the command nlcom, to be discussed below. The advantage
of estimating sets of linear combinations over estimating separate linear combinations is that you
gain access to the complete variance matrix of the repeated sampling distribution of the vector of
estimates. Most hypothesis tests require this matrix. Hypothesis testing is discussed in section
3.3.
The Delta method states:

$$\sqrt{n}\left(g(\hat{\beta}) - g(\beta_0)\right) \xrightarrow{d} N\!\left(0,\; J(\beta_0)\,\mathrm{AVar}(\hat{\beta})\,J(\beta_0)'\right)$$

where $J(\beta_0)$ denotes the Jacobian matrix of $g$ evaluated at $\beta_0$. This matrix can be estimated by
evaluating $J$ at $\hat{\beta}$.
An example of using nlcom:

sysuse auto, clear


regress price weight length foreign headroom
nlcom (ratio1: _b[weight]/_b[length]) (ratio2: _b[weight]/_b[headroom]), post

(1978 Automobile Data)
Source | SS df MS Number of obs = 74
-------------+------------------------------ F( 4, 69) = 22.21
Model | 357434897 4 89358724.3 Prob > F = 0.0000
Residual | 277630499 69 4023630.42 R-squared = 0.5628
-------------+------------------------------ Adj R-squared = 0.5375
Total | 635065396 73 8699525.97 Root MSE = 2005.9
------------------------------------------------------------------------------
price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
weight | 5.749148 .9514243 6.04 0.000 3.851108 7.647187
length | -81.11971 33.27376 -2.44 0.017 -147.499 -14.74036
foreign | 3570.379 633.9009 5.63 0.000 2305.781 4834.976
headroom | -481.1805 324.0927 -1.48 0.142 -1127.728 165.3667
_cons | 4429.788 3720.404 1.19 0.238 -2992.214 11851.79
------------------------------------------------------------------------------
ratio1: _b[weight]/_b[length]
ratio2: _b[weight]/_b[headroom]
------------------------------------------------------------------------------
price | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ratio1 | -.0708724 .0191617 -3.70 0.000 -.1084287 -.0333161
ratio2 | -.011948 .0083214 -1.44 0.151 -.0282577 .0043617
------------------------------------------------------------------------------
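The delta-method standard error reported for ratio1 can be reproduced by hand from the coefficient estimates and e(V). The sketch below is illustrative only; because the post option above replaced the regression results with the nlcom results, the regression is quietly re-run first.

quietly regress price weight length foreign headroom
scalar bw = _b[weight]
scalar bl = _b[length]
matrix V = e(V)
* Jacobian of g(b) = bw/bl with respect to (weight, length):
scalar J1 = 1/bl
scalar J2 = -bw/(bl^2)
scalar v_ratio1 = J1^2*V[1,1] + 2*J1*J2*V[1,2] + J2^2*V[2,2]
display "ratio1 = " bw/bl "   se = " sqrt(v_ratio1)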

3.3 Testing Hypotheses: The Wald Test


The Wald test uses a quadratic form in the unrestricted estimator to obtain a scalar test-statistic
that has a $\chi^2$ distribution if the null-hypothesis is true.
The idea that underlies the Wald test is: if $H_0: a(\beta) = 0$ is true, then we expect that $a(\hat{\beta}_u)$
is not too far from 0. This motivates us to find the repeated sampling distribution of $a(\hat{\beta}_u)$ if the
null-hypothesis is true. The Delta Method gives the solution:

$$\sqrt{n}\, a(\hat{\beta}_u) \xrightarrow{d} N\!\left(0,\; J(\beta_0)\,\mathrm{AVar}(\hat{\beta}_u)\,J(\beta_0)'\right)$$
To obtain a test-statistic, we need to resolve two issues. Firstly, only a scalar-valued test-statistic
allows for comparison with critical values. This can be achieved by a quadratic form. If
we take the middle matrix to be the inverse of the variance matrix, we know that the resulting
statistic will have a chi-squared distribution. Secondly, the unknown matrix $J(\beta_0)$ will have to be
estimated by $J(\hat{\beta}_u)$. The unknown matrix $\mathrm{AVar}(\hat{\beta}_u)$ is equal to $\Sigma_{xx}^{-1} S \Sigma_{xx}^{-1}$, and can be estimated
in the usual way. The resulting Wald test-statistic is equal to

$$W = a(\hat{\beta}_u)'\left\{\hat{J}\,\hat{\Sigma}_{xx}^{-1}\hat{S}\,\hat{\Sigma}_{xx}^{-1}\hat{J}'\right\}^{-1} a(\hat{\beta}_u) \qquad (7)$$
$$\phantom{W} = a(\hat{\beta}_u)'\left\{\hat{J}\,\hat{\Sigma}_{xx}^{-1}\hat{J}'\right\}^{-1} a(\hat{\beta}_u) \qquad \text{if } \hat{S} = \hat{\Sigma}_{xx} \qquad (8)$$

If the null-hypothesis is true, the repeated sampling distribution of W is the chi-squared distribution
with r degrees of freedom, where r denotes the number of restrictions. The testing procedure
therefore amounts to calculating the Wald statistic using the observed data and comparing its
value with the 95% quantile of the $\chi^2_r$ distribution. If the observed W is larger than the critical
value, we regard this observation as too unlikely to have been observed if $H_0$ is true. We
therefore consider this evidence against the validity of the null hypothesis.
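As a concrete illustration of (7), consider the linear hypothesis that the coefficients on weight and length are both zero, so that $a(\beta) = R\beta$ with R a selection matrix and the Jacobian simply equal to R. The sketch below uses the robust variance matrix and should reproduce the chi-squared statistic that testnl reports in the next subsection.

quietly sysuse auto, clear
quietly regress price weight length foreign, vce(robust)
matrix b = e(b)
matrix V = e(V)
matrix R = (1, 0, 0, 0 \ 0, 1, 0, 0)     // selects weight and length
matrix a = R*b'
matrix W = a'*invsym(R*V*R')*a
matrix list W
* W should equal chi2(2) = r*F as reported by test/testnl below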

3.4 The Wald Test in Stata
Fortunately, Stata can do all the Wald-test calculations for you. The
command test can be used to test (sets of) linear restrictions.2 It uses the F test-statistic with
$F_{r,n-k}$ critical values in linear regression models. In nonlinear models it uses the Wald test
with critical values from the $\chi^2_r$ distribution. The command testnl can be used to test sets of
(possibly nonlinear) restrictions. An example of testing linear restrictions (using the F-test):

sysuse auto, clear


regress price weight length foreign, vce(robust)
test weight length

(1978 Automobile Data)


Linear regression Number of obs = 74
F( 3, 70) = 18.20
Prob > F = 0.0000
R-squared = 0.5489
Root MSE = 2023.1
------------------------------------------------------------------------------
| Robust
price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
weight | 5.774712 1.501652 3.85 0.000 2.779761 8.769663
length | -91.37083 48.1774 -1.90 0.062 -187.4576 4.715957
foreign | 3573.092 647.5668 5.52 0.000 2281.561 4864.623
_cons | 4838.021 5043.206 0.96 0.341 -5220.337 14896.38
------------------------------------------------------------------------------
( 1) weight = 0
( 2) length = 0
F( 2, 70) = 27.13
Prob > F = 0.0000

You can achieve the same with testnl. Because the latter command is based on the more
general Wald-test theory of chapters 2 and 7 of Hayashi, the Wald test-statistic is computed:

sysuse auto, clear


regress price weight length foreign, vce(robust)
testnl (_b[weight]=0) (_b[length]=0)

2 Use contrast to test linear restrictions involving factor variables (categorical variables) and their interactions.

(1978 Automobile Data)
Linear regression Number of obs = 74
F( 3, 70) = 18.20
Prob > F = 0.0000
R-squared = 0.5489
Root MSE = 2023.1
------------------------------------------------------------------------------
| Robust
price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
weight | 5.774712 1.501652 3.85 0.000 2.779761 8.769663
length | -91.37083 48.1774 -1.90 0.062 -187.4576 4.715957
foreign | 3573.092 647.5668 5.52 0.000 2281.561 4864.623
_cons | 4838.021 5043.206 0.96 0.341 -5220.337 14896.38
------------------------------------------------------------------------------
(1) _b[weight] = 0
(2) _b[length] = 0
chi2(2) = 54.26
Prob > chi2 = 0.0000

Note that $W = r \cdot F$ (here $2 \times 27.13 = 54.26$). We now test a set of nonlinear restrictions:

sysuse auto, clear


regress price length mpg weight trunk
testnl (_b[weight]=(1/_b[length])) (_b[_cons]=(1/_b[trunk]))

(1978 Automobile Data)


Source | SS df MS Number of obs = 74
-------------+------------------------------ F( 4, 69) = 9.62
Model | 227368175 4 56842043.6 Prob > F = 0.0000
Residual | 407697222 69 5908655.39 R-squared = 0.3580
-------------+------------------------------ Adj R-squared = 0.3208
Total | 635065396 73 8699525.97 Root MSE = 2430.8
------------------------------------------------------------------------------
price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
length | -109.0618 43.03521 -2.53 0.014 -194.9147 -23.2089
mpg | -86.16235 84.54034 -1.02 0.312 -254.8157 82.49101
weight | 4.387537 1.178452 3.72 0.000 2.036589 6.738484
trunk | 25.59388 97.06998 0.26 0.793 -168.0554 219.2432
_cons | 14896.45 6080.278 2.45 0.017 2766.627 27026.27
------------------------------------------------------------------------------
(1) _b[weight] = (1/_b[length])
(2) _b[_cons] = (1/_b[trunk])
chi2(2) = 15.09
Prob > chi2 = 0.0005

It is important to note at this point that the Wald test is not invariant to nonlinear transfor-
mations (mathematically equivalent re-formulations) of the null-hypothesis. This could be a good
reason to prefer the LM and LR tests.

3.5 The LR-Test in Stata
The Stata command lrtest performs likelihood-ratio tests. The LR statistic only requires the
value of the objective function at the unrestricted and restricted estimates. The command uses
stored results from a previous estimation and combines those with the current estimation results.
Estimates can be stored using estimates store:

sysuse auto, clear


regress price weight length foreign
estimates store M_unrestr
regress price foreign
estimates store M_restr
lrtest M_unrestr M_restr
// or: lrtest M_unrestr .

(1978 Automobile Data)


Source | SS df MS Number of obs = 74
-------------+------------------------------ F( 3, 70) = 28.39
Model | 348565467 3 116188489 Prob > F = 0.0000
Residual | 286499930 70 4092856.14 R-squared = 0.5489
-------------+------------------------------ Adj R-squared = 0.5295
Total | 635065396 73 8699525.97 Root MSE = 2023.1
------------------------------------------------------------------------------
price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
weight | 5.774712 .9594168 6.02 0.000 3.861215 7.688208
length | -91.37083 32.82833 -2.78 0.007 -156.8449 -25.89679
foreign | 3573.092 639.328 5.59 0.000 2297.992 4848.191
_cons | 4838.021 3742.01 1.29 0.200 -2625.183 12301.22
------------------------------------------------------------------------------
Source | SS df MS Number of obs = 74
-------------+------------------------------ F( 1, 72) = 0.17
Model | 1507382.66 1 1507382.66 Prob > F = 0.6802
Residual | 633558013 72 8799416.85 R-squared = 0.0024
-------------+------------------------------ Adj R-squared = -0.0115
Total | 635065396 73 8699525.97 Root MSE = 2966.4
------------------------------------------------------------------------------
price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
foreign | 312.2587 754.4488 0.41 0.680 -1191.708 1816.225
_cons | 6072.423 411.363 14.76 0.000 5252.386 6892.46
------------------------------------------------------------------------------
Likelihood-ratio test LR chi2(2) = 58.73
(Assumption: M_restr nested in M_unrestr) Prob > chi2 = 0.0000

The LR test-statistic is similar in value to the Wald test-statistic: 58.73 versus 54.26. They are
not completely comparable, because the LR test does not allow for heteroscedasticity-consistent
variance matrices.
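The reported LR statistic can be reproduced from the stored results: under normality, $LR = 2(\ell_u - \ell_r)$, which for OLS equals $n\,\ln(SSR_r/SSR_u)$. A small sketch, assuming the estimates stored above (M_unrestr and M_restr) are still in memory:

estimates restore M_unrestr
scalar ll_u  = e(ll)
scalar rss_u = e(rss)
estimates restore M_restr
scalar ll_r  = e(ll)
scalar rss_r = e(rss)
display "LR = " 2*(ll_u - ll_r) "  = " e(N)*ln(rss_r/rss_u)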

3.6 Critical Values in Stata


Hypothesis testing commands in Stata provide p-values. Critical values can also be obtained if
required. The following example calculates the 95% quantile of the $\chi^2_2$ distribution, as well as
the p-value for an observed test-statistic with value 1.1:

// critical value:
display invchi2(2, 0.95)
// p-value of 1.1:
display %9.4f 1 - chi2(2, 1.1)

5.9914645
0.5769

For a two-sided test based on the normal distribution, use display invnormal(0.975) for the
critical value and display (1-normal(1.1)) + normal(-1.1) for the p-value.

4 Conclusions
We have discussed three related topics: the unrestricted estimator, the restricted estimator and
hypothesis testing using the Wald and LR tests. What follows are the conclusions that can be drawn with
respect to their implementation in Stata.

Classical Regression Model

1 The regress command performs the OLS calculations.


2 The cnsreg command calculates the restricted OLS estimator.

Large Sample Regression Model

3 The option vce(robust) ensures that heteroscedasticity-robust standard errors are calculated.a
4 The nlcom command implements the Delta method (see also lincom).
5 The testnl command calculates the Wald test-statistic.
6 The lrtest command calculates the LR test-statistic.
a See newey for time-series data.

Special Tests:

7 No heteroscedasticity: use estat hettest. This computes the Breusch-Pagan/Cook-Weisberg
test. The command estat imtest calculates White's test and more general versions (a brief
usage sketch follows below).
8 No autocorrelation: the actest command is currently the most general, as it implements the
Cumby-Huizinga test. This command can calculate the Breusch-Godfrey, Box-Pierce
and Ljung-Box statistics as special cases.
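A brief usage sketch of the tests in point 7 (actest in point 8 appears to be a user-written command, typically installed from SSC with ssc install actest, and requires time-series data, so it is not run here):

sysuse auto, clear
quietly regress price weight length foreign
estat hettest                 // Breusch-Pagan / Cook-Weisberg test
estat imtest, white           // White's test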

