(EKN309)
CHAPTER 2
TWO VARIABLE REGRESSION ANALYSIS:
SOME BASIC IDEAS
Chapter 1: the concept of regression in broad terms.
Chapters 2 and 3: introduction to the theory underlying the simplest possible
regression analysis, the bivariate (two-variable) regression, in which
the dependent variable (the regressand) is related to a single
explanatory variable (the regressor).
Geometrically, a population regression curve is the locus of the conditional
means of the dependent variable for the fixed values of the explanatory variable(s).
More simply, it is the curve connecting the means of the subpopulations of Y
corresponding to the given values of the regressor X.
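The idea of "connecting the means of the subpopulations" can be made concrete with a small sketch. The data below are hypothetical (they are not the tables from the course text), and the function name `conditional_means` is introduced here only for illustration.

```python
from collections import defaultdict

# Hypothetical population of (X, Y) pairs: three subpopulations of Y,
# one for each fixed value of the regressor X. These numbers are made up.
POPULATION = [
    (80, 50), (80, 60), (80, 70),
    (100, 65), (100, 75), (100, 85),
    (120, 80), (120, 90), (120, 100),
]

def conditional_means(pairs):
    """Return {x: mean of Y over the subpopulation with X == x}."""
    groups = defaultdict(list)
    for x, y in pairs:
        groups[x].append(y)
    return {x: sum(ys) / len(ys) for x, ys in sorted(groups.items())}

# The population regression curve is the set of points (x, E(Y | X = x)).
regression_curve = conditional_means(POPULATION)
for x, mean_y in regression_curve.items():
    print(f"E(Y | X = {x}) = {mean_y:.1f}")
```

With these made-up numbers the conditional means are 60, 75 and 90, which happen to lie on a straight line; in general the curve through the conditional means need not be linear.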
From now on the term linear regression will always mean a regression that is
linear in the parameters, the $\beta$s (that is, the parameters are raised to the
first power only). It may or may not be linear in the explanatory variables, the Xs.
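A few illustrative cases may help separate linearity in the parameters from linearity in the variables. These examples are chosen for illustration and use the notation $\beta_1$, $\beta_2$, $X_i$, $u_i$ as assumed symbols; they are not necessarily the ones shown on the original slide.

```latex
% Linear in the parameters and in the variable X: a linear regression.
Y_i = \beta_1 + \beta_2 X_i + u_i

% Linear in the parameters, nonlinear in X: still a linear regression
% in the sense used here.
Y_i = \beta_1 + \beta_2 X_i^2 + u_i
Y_i = \beta_1 + \beta_2 \frac{1}{X_i} + u_i

% Nonlinear in the parameter \beta_2 (it is squared): not a linear
% regression in the sense used here, even though it is linear in X.
Y_i = \beta_1 + \beta_2^2 X_i + u_i
```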
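An individual $Y_i$ deviates from its conditional mean by the stochastic disturbance $u_i$, so the population regression function (PRF) can be written in stochastic form:
$Y_i = E(Y \mid X_i) + u_i$
$Y_i = \beta_1 + \beta_2 X_i + u_i$
Taking the expected value of both sides, conditional on $X_i$:
$E(Y_i \mid X_i) = E[E(Y \mid X_i)] + E(u_i \mid X_i)$
Since the expected value of a constant is that constant itself:
$E(Y_i \mid X_i) = E(Y \mid X_i) + E(u_i \mid X_i)$
Because $E(Y_i \mid X_i)$ is the same thing as $E(Y \mid X_i)$, it follows that:
$E(u_i \mid X_i) = 0$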
Thus, the assumption that the regression line passes through the conditional
means of Y implies that the conditional mean values of $u_i$ (conditional upon
the given Xs) are zero.
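The zero conditional mean of the disturbance can be checked numerically. The sketch below uses a hypothetical population (made-up numbers) and defines each disturbance as the deviation of Y from the mean of its own subpopulation; the names `SUBPOPULATIONS` and `disturbances` are assumptions for illustration.

```python
# Hypothetical population, grouped by the value of X (numbers are made up).
SUBPOPULATIONS = {
    80: [50, 60, 70],
    100: [65, 75, 85],
    120: [80, 90, 100],
}

def disturbances(ys):
    """Return u_i = Y_i - E(Y | X_i) for one subpopulation of Y values."""
    cond_mean = sum(ys) / len(ys)
    return [y - cond_mean for y in ys]

# Disturbances within each subpopulation, and their conditional means.
u_by_x = {x: disturbances(ys) for x, ys in SUBPOPULATIONS.items()}
mean_u_by_x = {x: sum(us) / len(us) for x, us in u_by_x.items()}

for x, mean_u in mean_u_by_x.items():
    print(f"E(u | X = {x}) = {mean_u:.1f}")
```

Because each disturbance is measured from the conditional mean of its own subpopulation, the disturbances average to zero within every subpopulation by construction.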
The disturbance term $u_i$ is needed for several reasons:
1. Vagueness of theory
We may be ignorant or unsure about the other variables affecting Y; $u_i$ serves
as a substitute for all the variables excluded or omitted from the model.
2. Unavailability of data
Even if we know what some of the excluded variables are, we may not have
quantitative information about them; their effect is captured in $u_i$.
3. Core variables versus peripheral variables
The joint influence of all or some of the omitted variables may be so small that
it is not worthwhile to introduce them into the model explicitly; their combined
effect is treated as a random variable, $u_i$.
4. Intrinsic randomness in human behavior
Even if we succeed in introducing all the relevant variables into the model,
some intrinsic randomness in individual Ys cannot be explained; the $u_i$s
reflect this randomness.
5. Poor proxy variables
The data may not be measured accurately, so the observed variables are only
proxies for the true ones; $u_i$ then also represents the errors of measurement.
6. Principle of parsimony
We want to keep the regression model as simple as possible; $u_i$ represents all
other variables.
7. Wrong functional form
We may have the correct variables explaining a phenomenon but be unsure about
the functional form. The scattergram is helpful when a two-variable model is concerned.
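The sample regression function (SRF), the sample counterpart of the population regression function (PRF):
$\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_i$
In stochastic form, with $\hat{u}_i$ denoting the sample residual:
$Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + \hat{u}_i$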
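TO SUM UP
The primary objective in regression analysis is to estimate the PRF
$Y_i = \beta_1 + \beta_2 X_i + u_i$
on the basis of the SRF
$Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + \hat{u}_i$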
WHY? Because more often than not we do not have data for the whole population.
For $X = X_i$, we have one (sample) observation $Y = Y_i$ (see Tables 2.4 and 2.5).
In terms of the SRF: $Y_i = \hat{Y}_i + \hat{u}_i$
In terms of the PRF: $Y_i = E(Y \mid X_i) + u_i$
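The gap between the PRF and the SRF can be illustrated with a small simulation. Everything here is an assumption made for illustration: the PRF parameters (17 and 0.6), the noise level, and the use of least squares via `numpy.polyfit` as the fitting rule (the method for constructing the SRF is the subject of Chapter 3).

```python
import numpy as np

# Hypothetical PRF: E(Y | X) = BETA_1 + BETA_2 * X (values chosen for illustration).
BETA_1, BETA_2 = 17.0, 0.6
X = np.arange(80, 280, 20, dtype=float)  # fixed X values 80, 100, ..., 260

def draw_sample(rng):
    """Draw one Y per X value: Y_i = beta_1 + beta_2 * X_i + u_i."""
    u = rng.normal(0.0, 5.0, size=X.size)  # disturbances with mean zero
    return BETA_1 + BETA_2 * X + u

def fit_srf(x, y):
    """Fit a straight line by least squares; returns (b1_hat, b2_hat)."""
    b2_hat, b1_hat = np.polyfit(x, y, deg=1)  # polyfit returns slope first
    return b1_hat, b2_hat

rng = np.random.default_rng(0)
fits = []
for label in ("sample 1", "sample 2"):
    y = draw_sample(rng)
    b1_hat, b2_hat = fit_srf(X, y)
    y_hat = b1_hat + b2_hat * X   # fitted values from the SRF
    u_hat = y - y_hat             # residuals, so that Y_i = Y_hat_i + u_hat_i
    fits.append((b1_hat, b2_hat, y, y_hat, u_hat))
    print(f"{label}: b1_hat = {b1_hat:.3f}, b2_hat = {b2_hat:.3f}")
```

Each sample yields a different SRF, and neither coincides exactly with the PRF; the residuals $\hat{u}_i$ are observable, whereas the disturbances $u_i$ are not.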
Granted that the SRF is but an approximation of the PRF, can we devise a
rule or method that will make this approximation as close as possible?
OR:
How should the SRF be constructed so that $\hat{\beta}_1$ is as close as possible to the
true $\beta_1$ and $\hat{\beta}_2$ is as close as possible to the true $\beta_2$, even though we will
never know the true $\beta_1$ and $\beta_2$?
CHAPTER 3: procedures that tell us how to construct the SRF to mirror the PRF as
faithfully as possible.