Multiple Regression Analysis and Inference

Chapter 04
Ch.4 Multiple Regression: Inference

1.
2.
3.
4.
Sampling Distributions of OLSE

Testing Hypotheses: The t test
Confidence Intervals
Testing Hypotheses about a Single Linear
Combination of the Parameters
5. Testing Multiple Restrictions: F test
6. Reporting Regression Results
Multiple Regression Analysis

y = 0 + 1x1 + 2x2 + . . . + kxk + u
2. Inference
Econometrics
In Ch.3, we learned that OLSE is BLUE

under the Gauss-Markov assumption.
In order to do classical hypothesis testing,
we need to add another assumption (beyond
the Gauss-Markov assumptions).
Assume that u is independent of x1, x2,, xk
and u is normally distributed with zero
mean and variance 2:
u ~ N (0,2)
MLR.6
The homoskedastic normal distribution with

a single explanatory variable
CLT: The sum of independent random variables,

when standardized by its standard deviation, has a
distribution that tends to standard normal as the
sample size grows.
Econometrics
Normal Sampling Distributions
Under the CLM assumption s, conditiona l on

the sample values of the independen t variable s
~ N , Var . (4.1)
f(y|x)
We can summarize the population

assumptions of CLM as follows
y|x ~ N (0 + 1x1 ++ kxk , 2)
Large sample will let us drop normality
assumption, because of Central Limit
Theorem (CLT).
CLM Assumptions
4.1 Sampling Distributions of OLSE
Econometrics
Econometrics

~ N 0,1.
Therefore,
sd
j
E(y|x) = 0 + 1x
Normal
distributions
x1
j is distribute d normally because it is a

linear combinatio n of the errors.
x2
Econometrics
Multiple Rgression 2: Inference
Econometrics
Chapter 04
4.2 Testing Hypotheses: t Test
Cont.
Under the CLM assumptions,

j j
~ t nk 1. (4.3)
se j
Note this is a t distribution (not normal dist.)

because we have to estimate 2 by 2 .
Note the degrees of freedom : n k 1.
Econometrics
Cont.
se( j )
(4.5)
Econometrics
H1: j > 0 and H1: j < 0 are one-sided

H1: j 0 is a two-sided alternative
If we want to have only a 5% probability of

rejecting H0 if it is really true, then we say our
significance level is 5%.
Econometrics
We can reject the null hypothesis if the t

statistic is greater than the critical value.
If the t statistic is less than the critical value
then we fail to reject the null.
yi = 0 + 1xi1 + + kxik + ui
H0: j = 0, H1: j > 0
Fail to reject
reject
Econometrics
10
One-Sided Alternatives
Test: procedure
Having picked a significance level, , we

look up the (1 )th percentile in a t
distribution with n k 1 d.f. and call this c,
the critical value.
Besides our null, H0, we need an alternative

hypothesis, H1, and a significance level.
H1 may be one-sided, or two-sided.
Then, we will use our t statistic along with a

rejection rule to determine whether to accept
the null hypotheses, H0.
Cont. t
Econometrics
t Test: procedure
The statistic we use to test (4.4) is called

the t statistic of estimate j:
Knowing the sampling distribution for the

standardized estimator allows us to carry
out hypothesis tests.
Start with a null hypothesis, for example,
H0: j = 0
(4.4)
If accept null, then accept that xj has no
effect on y, controlling for other
independent variables.
The t Test
The t Test
11
Econometrics
c
12
Chapter 04
Summary for H0: j = 0
Two-Sided Alternatives
yi = 0 + 1xi1 + + kxik + ui
The alternative is usually assumed to be

two-sided.
If we reject the null, we typically say, xj is
statistically significant at the % level.
If we fail to reject the null, we typically say,
xj is statistically insignificant at the %
level.
H0: j = 0, H1: j 0
fail to reject
reject
reject
-c
0
Econometrics
13
Testing other hypotheses
aj
se
j
An alternative to the classical approach is to

ask, what is the smallest significance level at
which the null would be rejected?
So, compute the t statistic, and then look up
what percentile it is in the appropriate t
distribution this is the p-value.
Most computer packages will compute the pvalue, assuming a two-sided test (H0: j = 0).
(4.13),
where a j 0 for the standard test

Econometrics
Econometrics
16
4.4 Testing a Linear Combination
Another way to use classical statistical

testing is to construct a confidence interval
using the same critical value as was used for
a two-sided test.
A (1 - ) % confidence interval is defined
as
j c se j ,
(4.16)

where c is the 1 - percentile in a t n k 1 distribution
2
If you want a one-sided alternative, just divide

the two-sided p-value by 2.
15
4.3 Confidence Intervals
Econometrics
14
Computing p-values for t tests
A more general form of the t statistic

recognizes that we may want to test
something like
(4.12)
H0: j = aj .
In this case, the appropriate t statistic is
t
Econometrics
17
Suppose instead of testing whether 1 is

equal to a constant, you want to test if it is
equal to another parameter, that is
H0 : 1 = 2. (4.18)
Use same basic procedure for forming a t

statistic; t 1 2
(4.20)
se 1 2
Usually, its hard to calculate it by hand. So,

more generally, you can always restate the
problem to get the test you want.
Econometrics
18
Chapter 04
Example:
Cont. Example
Suppose you are interested in the difference

of return by academic record.
ln(w) = 0 + 1jc + 2univ + 3exper + u (4.17)
H 0 : 1 = 2 ,
(4.18) or
H0: 1 = 1 2 = 0 (4.24) 1 = 1 + 2,
so substitute in and rearrange

ln(w) = 0 + 1 jc + 2 (jc + univ) + 3exper + u
Econometrics
A typical example is testing exclusion

restrictions we want to know if a group of
parameters are all equal to zero.
Cont.
20
Now the null & alternative hypothesis

might be something like
H0: k-q+1 = 0, , k = 0.
H1: H0 is not true.
We cant just check each t statistic separately,

because we want to know if the q parameters
are jointly significant at a given level it is
possible for none to be individually significant
at that level.
Econometrics
Cont.
To do the test, we need to estimate the

restricted model without xk-q+1,, , xk, as
well as the unrestricted model with all xs.
(unrestricted model)
y = 0 + 1x1 +. . .+ k-qxk-q +. . .+kxk + u (4.34)
(restricted model)
y = 0 + 1x1 +. . .+ k-qxk-q + u (4.36)
Econometrics
21
Exclusion Restrictions
Econometrics
Other examples of hypotheses about a single

linear combination of parameters:
1 = 1 + 2, 1 = 52, 1 = -1/22 etc.
Testing Exclusion Restrictions
Everything weve done so far has involved

testing a single linear restriction, (e.g. 1 =
or 1 = 2 )
However, we may want to jointly test
multiple hypotheses about our parameters.
Econometrics
19
4.5 Multiple Linear Restrictions
This is the same model as originally, but

now you get a standard error for 1 2 = 1
directly from the basic regression.
Any linear combination of parameters
could be tested in a similar manner.
23
22
Exclusion Restrictions
Intuitively, we want to know if the change

in SSR is big enough to warrant inclusion of
xk-q+1,, , xk.
F
SSRr SSRur q
SSRur n k 1
(4.37),
where SSRr is the sum of squared residuals from

the restricted model and SSRur is the sum of
squared residuals from the unrestricted model.
Econometrics
24
Chapter 04
The F statistic
Cont. The
The F statistic is always positive, since the

SSRr cant be less than the SSRur.
Essentially the F statistic is measuring the
relative increase in SSR when moving from
the unrestricted to restricted model.
q = number of restrictions = dfr dfur

n k 1 = dfur
Econometrics
Cont. The
Reject area
26
Rr2 q
1 R n k 1
2
ur
2
ur
(4.41),
where again r is restricted and ur is unrestricted.

27
Overall Significance
Econometrics
28
F Statistic Summary
A special case of exclusion restrictions is to

test that none of the regressors has an effect
on y.
H0: 1 = 2 == k = 0
Because the SSRs may be large and unwieldy, an

alternative form of the formula is useful.
We use the fact that SSR = SST(1 R2) for any
regression, so we can substitute in for SSRu and
SSRur.
F
Econometrics
(4.44)
Since the R2 from a model with only an

intercept will be zero, the F statistic is
simply,
F
Econometrics
The R2 form of the F statistic

Reject H0 at
significance level
if F > c
Fail-to-reject area
To decide if the increase in SSR when we

move to a restricted model is big enough
to reject the exclusions, we need to know
about the sampling distribution of our F stat.
Not surprisingly, F ~ Fq,n-k-1, where q is
referred to as the numerator degrees of
freedom and n k 1 as the denominator
degrees of freedom.
25
F statistic
f(F)
F statistic
Just as with t statistics, p-values can be

calculated by looking up the percentile in
the appropriate F distribution.
If only one exclusion is being tested, then
F = t2, and the p-values will be the same.
R2 k
(4.46)
1 R n k 1
Econometrics
29
Econometrics
30
Chapter 04
4.6 Reporting Results
31

Multiple Regression Analysis and Inference

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Multiple Regression Analysis and Inference

Uploaded by

Copyright:

Available Formats

Chapter 04

Ch.4 Multiple Regression: Inference

Sampling Distributions of OLSE

Multiple Regression Analysis

In Ch.3, we learned that OLSE is BLUE

The homoskedastic normal distribution with

CLT: The sum of independent random variables,

Normal Sampling Distributions

Under the CLM assumption s, conditiona l on

We can summarize the population

4.1 Sampling Distributions of OLSE

j is distribute d normally because it is a

Multiple Rgression 2: Inference

4.2 Testing Hypotheses: t Test

Under the CLM assumptions,

Note this is a t distribution (not normal dist.)

H1: j > 0 and H1: j < 0 are one-sided

If we want to have only a 5% probability of

We can reject the null hypothesis if the t

Multiple Rgression 2: Inference

Having picked a significance level, , we

Besides our null, H0, we need an alternative

Then, we will use our t statistic along with a

The statistic we use to test (4.4) is called

Knowing the sampling distribution for the

Summary for H0: j = 0

The alternative is usually assumed to be

Testing other hypotheses

An alternative to the classical approach is to

where a j 0 for the standard test

4.4 Testing a Linear Combination

Another way to use classical statistical

Multiple Rgression 2: Inference

If you want a one-sided alternative, just divide

4.3 Confidence Intervals

Computing p-values for t tests

A more general form of the t statistic

Suppose instead of testing whether 1 is

Use same basic procedure for forming a t

Usually, its hard to calculate it by hand. So,

Suppose you are interested in the difference

so substitute in and rearrange

A typical example is testing exclusion

Now the null & alternative hypothesis

We cant just check each t statistic separately,

To do the test, we need to estimate the

Multiple Rgression 2: Inference

Other examples of hypotheses about a single

Testing Exclusion Restrictions

Everything weve done so far has involved

4.5 Multiple Linear Restrictions

This is the same model as originally, but

Intuitively, we want to know if the change

where SSRr is the sum of squared residuals from

The F statistic is always positive, since the

q = number of restrictions = dfr dfur

where again r is restricted and ur is unrestricted.

A special case of exclusion restrictions is to

Because the SSRs may be large and unwieldy, an

Since the R2 from a model with only an

The R2 form of the F statistic

To decide if the increase in SSR when we

Just as with t statistics, p-values can be