Introduction to
ECONOMETRICS
Christopher Dougherty

FIFTH EDITION

London School of Economics and Political Science

OXFORD
UNIVERSITY PRESS

Great Clarendon Street, Oxford, OX2 6DP,
United Kingdom
Oxford University Press is a department of the University of Oxford.
It furthers the University's objective of excellence in research, scholarship,
and education by publishing worldwide. Oxford is a registered trade mark of
Oxford University Press in the UK and in certain other countries
© Christopher Dougherty 2016
The moral rights of the author have been asserted
Second edition 2002
Third edition 2007
Fourth edition 2011
Impression: 1
All rights reserved. No part of this publication may be reproduced, stored in
a retrieval system, or transmitted, in any form or by any means, without the
prior permission in writing of Oxford University Press, or as expressly permitted
by law, by licence or under terms agreed with the appropriate reprographics
rights organization. Enquiries concerning reproduction outside the scope of the
above should be sent to the Rights Department, Oxford University Press, at the
address above
You must not circulate this work in any other form
and you must impose this same condition on any acquirer
Published in the United States of America by Oxford University Press
198 Madison Avenue, New York, NY 10016, United States of America
British Library Cataloguing in Publication Data
Data available
Library of Congress Control Number: 2015951527
ISBN 978-0-19-967682-8
Printed in Italy by L.E.G.O. S.p.A.
Links to third party websites are provided by Oxford in good faith and
for information only. Oxford disclaims any responsibility for the materials
contained in any third party website referenced in this work.
Preface

This is a textbook for a year-long undergraduate course in econometrics. It is
intended to fill a need that has been generated by the changing profile of the
typical econometrics student. Econometrics courses often used to be optional for
economics majors, but now they are becoming compulsory. Several factors are
responsible. Perhaps the most important is the recognition that an understanding
of empirical research techniques is not just a desirable but an essential part of the
basic training of an economist, and that courses limited to applied statistics are
inadequate for this purpose. No doubt this has been reinforced by the fact that
graduate-level courses in econometrics have become increasingly ambitious, with
the consequence that substantial exposure to econometrics at an undergraduate
level is now a requirement for admission to the leading graduate schools.
There are also supply-side factors. The wave that has lifted econometrics to
prominence in economics teaching comes on the heels of another that did the same
for mathematics and statistics. Without this prior improvement in quantitative
training, the shift of econometrics to the core of the economics curriculum would
not have been possible.
As a consequence of this development, students on econometrics courses are
more varied in their capabilities than ever before. No longer are they a self-
selected minority of mathematical high-fliers. The typical student now is a regular
economics major who has taken basic, but not advanced, courses in calculus
and statistics. The democratization of econometrics has created a need for a
broader range of textbooks than before, particularly for the wider audience. The
mathematical elite has for many years been served by a number of accomplished
texts. The wider audience has been less well served. This new edition continues to
be chiefly addressed to it.

Objectives of the text


The text is intended to provide a framework for a year's instruction with the
depth and breadth of coverage that would enable the student to continue with the
subject at graduate level. It is therefore ambitious in terms of theory and proofs,
given the constraints imposed by the nature of its target audience and not making
use of linear algebra.
A primary concern has been not to overwhelm the student with information.
This is not a reference work. It is hoped that the student will find the text readable
and that in the course of a year he or she would comfortably be able to traverse its
contents. For the same reason the mathematical demands on the student have been
kept to a minimum. For nearly everyone, there is a limit to the rate at which formal
mathematical analysis can be digested. If this limit is exceeded, the student spends
much mental energy grappling with the technicalities rather than the substance,
impeding the development of a unified understanding of the subject.
Although its emphasis is on theory, the text is intended to provide substantial
hands-on practical experience in the form of regression exercises using a
computer. In particular, the Educational Attainment and Wage Equation data
provide opportunities for 60 cross-sectional exercises spread through the first
10 chapters of the text. Students start with a simple model and gradually develop
it into a more sophisticated one as their knowledge of econometric theory grows.
It is hoped that seeing how the specification of their models improves will motivate
students and help sustain their interest. The Demand Functions data set, with its
15 time series exercises, is intended to provide a similar experience in the remaining
chapters. Further data sets have been provided for specialist applications.

Changes to this edition


The main changes for the new edition are as follows.

1. The most obvious change in this edition is in the notation. The estimator of the
parameter $\beta_2$ is now written $\hat{\beta}_2$, instead of $b_2$. In introductory texts, the use of
the Latin letter as an estimator of the corresponding Greek one used to be common.
It had the advantage of making the mathematical analysis slightly cleaner
and simpler. Perhaps it was also less intimidating, an important consideration
for those students who find college algebra challenging. But the more elegant
caret-mark notation has clearly become dominant, even at this level, and so
this text has switched to it.
2. There are 40 new exercises, mostly analytical.
3. The main cross-sectional data set used for examples and computer exercises
has been updated from the National Longitudinal Survey of Youth 1979 to
its successor, the NLSY 1997.
In addition, there has been an effort to improve the clarity of the exposition. A
few topics have been added and a few have been deleted, but there has been no
change to the objective that each chapter should be readable in its entirety as a
straight narrative.
The intention is to provide enough conceptual material to support a two-semester
sequence, with the Review and the first seven chapters used as an introduction
to the classical linear regression model and the remainder as a second semester
for students who are ready to tackle more sophisticated econometric issues. The
aim remains that of providing a solid intuitive understanding of the material with
enough technical underpinning to prepare the student for further formal study of
econometrics or for the self-study of applications.
As before, the simulations have been undertaken using Matlab. I will be happy
to email copies of the batch files to anyone interested.

Additional resources
The Online Resource Centre
www.oxfordtextbooks.co.uk/orc/dougherty5e/
offers the following resources for instructors and students:
PowerPoint slideshows that offer a graphical treatment of most of the topics
in the text. Narrative boxes provide an explanation of the slides.
Links to data sets and manuals.
Instructor's manuals for the text and data sets, detailing the exercises and
their solutions.
A student area that provides answers to the starred exercises in the text and
offers additional exercises.
It is hoped that the provision of these materials will not only be helpful for the
study of econometrics but also make it satisfying and pleasurable.

Christopher Dougherty

Contents

INTRODUCTION

Why study econometrics?
Aim of this text
Mathematics and statistics prerequisites for studying econometrics
Additional resources
Econometrics software

REVIEW: RANDOM VARIABLES, SAMPLING, ESTIMATION, AND INFERENCE

R.1 The need for a solid understanding of statistical theory
R.2 Discrete random variables and expectations
    Discrete random variables
    Expected values of discrete random variables
    Expected values of functions of discrete random variables
    Expected value rules
    Population variance of a discrete random variable
    Fixed and random components of a random variable
R.3 Continuous random variables
    Probability density
R.4 Population covariance, covariance and variance rules, and correlation
    Covariance
    Independence of random variables
    Covariance rules
    Variance rules
    Correlation
R.5 Samples, the double structure of a sampled random variable, and estimators
    Sampling
    Estimators
R.6 Unbiasedness and efficiency
    Unbiasedness
    Efficiency
    Conflicts between unbiasedness and minimum variance
R.7 Estimators of variance, covariance, and correlation
R.8 The normal distribution
R.9 Hypothesis testing
    Formulation of a null hypothesis and development of its implications
    Compatibility, freakiness, and the significance level
R.10 Type II error and the power of a test
R.11 t tests
    The reject/fail-to-reject terminology
R.12 Confidence intervals
R.13 One-sided tests
    $H_0: \mu = \mu_0$, $H_1: \mu = \mu_1$
    Generalizing from $H_0: \mu = \mu_0$, $H_1: \mu = \mu_1$ to $H_0: \mu = \mu_0$, $H_1: \mu > \mu_0$
    $H_0: \mu = \mu_0$, $H_1: \mu < \mu_0$
    One-sided t tests
    Important special case: $H_0: \mu = 0$
    Anomalous results
    Justification of the use of a one-sided test
R.14 Probability limits and consistency
    Probability limits
    Consistency
    Why is consistency of interest?
    Simulations
R.15 Convergence in distribution and central limit theorems
    Limiting distributions
Key terms
Appendix R.1 Unbiased estimators of the population covariance and variance
Appendix R.2 Density functions of transformed random variables

1 SIMPLE REGRESSION ANALYSIS

1.1 The simple linear model
1.2 Least squares regression with one explanatory variable
1.3 Derivation of the regression coefficients
    Least squares regression with one explanatory variable: the general case
    Two decompositions of the dependent variable
    Regression model without an intercept
1.4 Interpretation of a regression equation
    Changes in the units of measurement
1.5 Two important results relating to OLS regressions
    The mean value of the residuals is zero
    The sample correlation between the observations on X and the residuals is zero
1.6 Goodness of fit: $R^2$
    Example of how $R^2$ is calculated
    Alternative interpretation of $R^2$
Key terms

2 PROPERTIES OF THE REGRESSION COEFFICIENTS AND HYPOTHESIS TESTING

2.1 Types of data and regression model
2.2 Assumptions for regression models with nonstochastic regressors
2.3 The random components and unbiasedness of the OLS regression coefficients
    The random components of the OLS regression coefficients
    The unbiasedness of the OLS regression coefficients
    Normal distribution of the regression coefficients
2.4 A Monte Carlo experiment
2.5 Precision of the regression coefficients
    Variances of the regression coefficients
    Standard errors of the regression coefficients
    The Gauss-Markov theorem
2.6 Testing hypotheses relating to the regression coefficients
    0.1 percent tests
    p values
    One-sided tests
    Confidence intervals
2.7 The F test of goodness of fit
    Relationship between the F test of goodness of fit and the t test on the slope coefficient in simple regression analysis
Key terms
Appendix 2.1 The Gauss-Markov theorem

3 MULTIPLE REGRESSION ANALYSIS

3.1 Illustration: a model with two explanatory variables
3.2 Derivation of the multiple regression coefficients
    The general model
    Interpretation of the multiple regression coefficients
3.3 Properties of the multiple regression coefficients
    Unbiasedness
    Efficiency
    Precision of the multiple regression coefficients
    t tests and confidence intervals
3.4 Multicollinearity
    Multicollinearity in models with more than two explanatory variables
    Examples of multicollinearity
    What can you do about multicollinearity?
3.5 Goodness of fit: $R^2$
    F tests
    Further analysis of variance
    Relationship between F statistic and t statistic
3.6 Prediction
    Properties of least squares predictors
Key terms

4 NONLINEAR MODELS AND TRANSFORMATIONS OF VARIABLES

4.1 Linearity and nonlinearity
4.2 Logarithmic transformations
    Logarithmic models
    Semilogarithmic models
    The disturbance term
    Comparing linear and logarithmic specifications
4.3 Models with quadratic and interactive variables
    Quadratic variables
    Higher-order polynomials
    Interactive explanatory variables
    Ramsey's RESET test of functional misspecification
4.4 Nonlinear regression
Key terms

5 DUMMY VARIABLES

5.1 Illustration of the use of a dummy variable
    Standard errors and hypothesis testing
5.2 Extension to more than two categories and to multiple sets of dummy variables
    Joint explanatory power of a group of dummy variables
    Change of reference category
    The dummy variable trap
    Multiple sets of dummy variables
5.3 Slope dummy variables
    Joint explanatory power of the intercept and slope dummy variables
5.4 The Chow test
    Relationship between the Chow test and the F test of the explanatory power of a set of dummy variables
Key terms

6 SPECIFICATION OF REGRESSION VARIABLES

6.1 Model specification
6.2 The effect of omitting a variable that ought to be included
    The problem of bias
    Invalidation of the statistical tests
    $R^2$ in the presence of omitted variable bias
6.3 The effect of including a variable that ought not to be included
6.4 Proxy variables
    Unintentional proxies
6.5 Testing a linear restriction
    F test of a linear restriction
    The reparameterization of a regression model
    t test of a linear restriction
    Multiple restrictions
    Zero restrictions
Key terms

7 HETEROSKEDASTICITY

7.1 Heteroskedasticity and its implications
    Possible causes of heteroskedasticity
7.2 Detection of heteroskedasticity
    The Goldfeld-Quandt test
    The White test
7.3 Remedies for heteroskedasticity
    Weighted least squares
    Mathematical misspecification
    Robust standard errors
    How serious are the consequences of heteroskedasticity?
Key terms

8 STOCHASTIC REGRESSORS AND MEASUREMENT ERRORS

8.1 Assumptions for models with stochastic regressors
8.2 Finite sample properties of the OLS regression estimators
    Unbiasedness of the OLS regression estimators
    Precision and efficiency
8.3 Asymptotic properties of the OLS regression estimators
    Consistency
    Asymptotic normality of the OLS regression estimators
8.4 The consequences of measurement errors
    Measurement errors in the explanatory variable(s)
    Measurement errors in the dependent variable
    Imperfect proxy variables
    Example: Friedman's permanent income hypothesis
8.5 Instrumental variables
    Asymptotic distribution of the IV estimator
    Multiple instruments
    The Durbin-Wu-Hausman specification test
Key terms

9 SIMULTANEOUS EQUATIONS ESTIMATION

9.1 Simultaneous equations models: structural and reduced form equations
9.2 Simultaneous equations bias
    A Monte Carlo experiment
9.3 Instrumental variables estimation
    Underidentification
    Exact identification
    Overidentification
    Two-stage least squares
    The order condition for identification
    Unobserved heterogeneity
    Durbin-Wu-Hausman test
Key terms

10 BINARY CHOICE AND LIMITED DEPENDENT VARIABLE MODELS, AND MAXIMUM LIKELIHOOD ESTIMATION

10.1 The linear probability model
10.2 Logit analysis
    Generalization to more than one explanatory variable
    Goodness of fit and statistical tests
10.3 Probit analysis
10.4 Censored regressions: tobit analysis
10.5 Sample selection bias
10.6 An introduction to maximum likelihood estimation
    Generalization to a sample of n observations
    Generalization to the case where $\sigma$ is unknown
    Application to the simple regression model
    Goodness of fit and statistical tests
Key terms
Appendix 10.1 Comparing linear and logarithmic specifications

11 MODELS USING TIME SERIES DATA

11.1 Assumptions for regressions with time series data
11.2 Static models
11.3 Models with lagged explanatory variables
    Estimating long-run effects
11.4 Models with a lagged dependent variable
    The partial adjustment model
    The error correction model
    The adaptive expectations model
    More general autoregressive models
11.5 Assumption C.7 and the properties of estimators in autoregressive models
    Consistency
    Limiting distributions
    t tests in an autoregressive model
11.6 Simultaneous equations models
11.7 Alternative dynamic representations of time series processes
    Time series analysis
    Vector autoregressions
Key terms

12 AUTOCORRELATION

12.1 Definition and consequences of autocorrelation
    Consequences of autocorrelation
    Autocorrelation with a lagged dependent variable
12.2 Detection of autocorrelation
    The Breusch-Godfrey test
    The Durbin-Watson test
12.3 Fitting a model subject to AR(1) autocorrelation
    Issues
    Inference
    The common factor test
12.4 Apparent autocorrelation
12.5 Model specification: specific-to-general versus general-to-specific
    Comparison of alternative models
    The general-to-specific approach to model specification
Key terms
Appendix 12.1 Demonstration that the Durbin-Watson d statistic approximates $2 - 2\rho$ in large samples

13 INTRODUCTION TO NONSTATIONARY TIME SERIES

13.1 Stationarity and nonstationarity
    Stationary time series
    Nonstationary time series
    Deterministic trend
    Difference-stationarity and trend-stationarity
13.2 Spurious regressions
    Spurious regressions with variables possessing deterministic trends
    Spurious regressions with variables that are random walks
13.3 Graphical techniques for detecting nonstationarity
13.4 Tests of nonstationarity: the augmented Dickey-Fuller t test
    Untrended process
    Trended process
13.5 Tests of nonstationarity: other tests
    The Dickey-Fuller test using the scaled estimator of the slope coefficient
    The Dickey-Fuller F test
    Power of the tests
    Further tests
    Tests of deterministic trends
    Further complications
13.6 Cointegration
13.7 Fitting models with nonstationary time series
    Detrending
    Differencing
    Error correction models
Key terms

14 INTRODUCTION TO PANEL DATA MODELS

14.1 Reasons for interest in panel data sets
14.2 Fixed effects regressions
    Within-groups fixed effects
    First differences fixed effects
    Least squares dummy variable fixed effects
14.3 Random effects regressions
    Assessing the appropriateness of fixed effects and random effects estimation
    Random effects or OLS?
    A note on the random effects and fixed effects terminology
14.4 Differences in differences
Key terms

APPENDIX A: Statistical tables
APPENDIX B: Data sets
Bibliography
Author index
Subject index

Introduction

Why study econometrics?


Econometrics is the term used to describe the application of statistical methods to
the quantification and critical assessment of hypothetical relationships using data.
The term 'econometrics' suggests that the methods relate only to economic analysis.
In fact, applications will be found far more broadly, in virtually all the social
sciences and elsewhere. It is true that economics has been responsible for much of
the development of econometrics, but other disciplines have also made substantial
contributions. Indeed, regression analysis, the core technique, appears initially to
have been developed in applications to astronomy by Legendre and Gauss in the
first few years of the nineteenth century.
It is with the aid of econometrics that we discriminate between competing theories
and put numerical clothing onto the successful ones. For economists, econometric
analysis may be motivated by a simple desire to improve our understanding of how
the economy works, at either the microeconomic or the macroeconomic level, but
more often it is undertaken with a specific objective in mind. In the private sector,
the financial benefits that accrue from a sophisticated understanding of relevant
markets and an ability to predict change may be the driving factor. In the public
sector, the impetus may come from an awareness that evidence-based policy
initiatives are likely to be those that have the greatest impact.
It is now generally recognized that nearly all professional economists, not
just those actually working with data, should have a basic understanding of
econometrics. There used to be a view that microeconomics and macroeconomics
comprised the core training of an economist, and that econometrics was an
optional extra to be pursued by those with a flair for numbers and an inclination
to get their hands dirty with data. In particular, much of early macroeconomic
theory was in reality no more than conjecture propounded by (over-)confident
theorists who thought that the job of quantifying their theories could safely be
left to others with lesser vision and a greater willingness to apply themselves to
empirical detail.
That view is long gone. Microeconomic and macroeconomic theories are generally
considered to be of little interest if they are not supported by econometric
analysis. As a consequence of the recognition of its importance, an introductory
course in econometrics has now become an integral component of any serious
undergraduate degree in economics, and it is a prerequisite for admission to
postgraduate study in economics or finance.
Even for those who are not actively involved with econometrics, there are
two major benefits from its study. One is that it facilitates communication and
engagement between econometricians and the users of their work. This is especially
important in the workshops that are the typical meeting ground for applied
econometricians and the policy-makers who may be influenced by their work.
Would-be policy-makers who do not speak the language are not equipped to
participate in the discussion.
The other benefit is the development of the ability to obtain a perspective on
econometric work and to undertake critical evaluation of it. Econometric work
is more robust in some contexts than in others. Experience with the practice of
econometrics and a knowledge of the potential problems that can arise are essential
for developing an instinct for judging how much confidence should be placed
on the findings of a particular study.

Aim of this text


With this in mind, the text has three specific objectives.

1. One is to provide you with the practical skills needed to fit models, given
suitable data, in a relatively straightforward context. This is fairly easy. Generally,
such applications will be models fitted with cross-sectional data.
2. The second is to promote the development of an understanding of the statistical
properties of these techniques and hence an understanding of why the
techniques work satisfactorily in certain contexts and not in others. This is
much more demanding.
3. The third, building on the second, is to encourage you to develop a strong
intuitive understanding of the material and with it the capacity and confidence
to extend it further, either sideways, in applications in a particular field, or
vertically, moving on to more advanced study.

Mathematics and statistics prerequisites for studying econometrics
The prerequisite for studying this subject is a solid background in basic college-
level mathematics and statistics.

Mathematics: The mathematics requirement is two semesters of college-level
calculus, with an emphasis on the differential rather than the integral calculus. This
is the official requirement. The real practical requirement is that you should be
able to work through a proof involving simple college-level algebra, in comfort
and understanding everything as you go. In particular, equations involving Σ
notation should not present any difficulty for you. Students who have taken two
semesters of calculus in college with reasonable grades should belong to this
category.
Linear algebra (matrix algebra) is not used in this text. This is not a serious
impediment to acquiring a sound knowledge of econometrics at this level.
Although it means that, for the purpose of theoretical analysis, we have to restrict
the analysis to models with no more than two explanatory variables, this is not
a major constraint. We can still investigate nearly everything that we wish, and
greater complexity would add very little. If you continue with a higher-level course,
you will need to learn how to use linear algebra, but once you have done that, you
will find it easy to interpret within it what we have done here. Appendix A, Matrix
Algebra, of Greene's Econometric Analysis is an excellent resource, giving you just
what you need to know for econometrics.
Statistics: You must have a clear understanding of what is meant by the sampling
distribution of an estimator and of the principles of statistical inference and
hypothesis testing. This is absolutely essential. In my experience, most problems
that students have with an introductory econometrics course are not econometric
problems at all, but problems with statistics, or rather, a lack of understanding
of statistics. There are no short cuts. If you do not have this background,
you should put your study of econometrics on hold and study statistics first.
Otherwise there will be core parts of the econometrics syllabus that you do not
begin to understand.
In addition, it would be helpful if you have some knowledge of economics.
However, although the examples and exercises in this text relate to economics,
most of them are so straightforward that a previous study of economics is not a
requirement.

Additional resources
There are two additional major resources that you should check out as soon as
you begin to use this text: the slideshows and the study guide. Both are available,
at no cost and with no restrictions, in the Online Resource Centre at
www.oxfordtextbooks.co.uk/orc/dougherty5e/.
Slideshows: The PowerPoint slideshows systematically cover all of the topics
treated in the text, typically with greater graphical detail. They are not intended
as a substitute for the text, but they should provide substantial support.
Study guide: This provides answers to the starred exercises in the text and
additional exercises, also with solutions. It was commissioned by the University of
London International Programmes as an additional resource for distance-learning
students, and the organizers of the External Degree have kindly allowed it to be
available to anyone who is interested in using it.
The Online Resource Centre also gives unrestricted access to all of the data sets
used in the examples and exercises in the text.
Econometrics software
There are at least ten major commercial software packages for econometrics in
use around the world and it does not matter which one you use. With little
variation, they all have the features and facilities used in econometrics at this level.
Many of the tables in this text reproduce output from Stata or EViews, mainly
because the format is compact and tidy. Output from other applications looks
very similar.
If you do not have access to one of these commercial applications, then download
gretl and use that instead. gretl is a powerful, sophisticated econometrics
application, which is easy to use and free. Go to the Online Resource Centre, find
the link, and follow the instructions. There you will also find a downloadable
manual that tells you how to use gretl to do the exercises in this text.
You should not try to use an inferior substitute. In particular, you should not try
to use the regression engine built into a spreadsheet application such as Microsoft
Excel. Excel and other spreadsheets are invaluable applications, but they are not
intended or designed for serious econometrics use. You need a dedicated
application, and gretl is an excellent one.
The aims of this text have been stated above. There is one further aim, or at
least, hope. That is that you will find the study of econometrics intellectually
satisfying. By the time that you approach the end of this text, you will find that,
although the material in each chapter is new, the same themes and concerns keep
reappearing, especially those related to the properties of estimators. When you
begin to recognize this, you will be well on your way to becoming a proper
econometrician, and not just someone mechanically handling data and performing tests.
And, of course, when the time comes for you to fit your own models with your
own data, it is hoped that you will find the practice of econometrics enjoyable too.

Bibliography

Amemiya, Takeshi (1981). Qualitative response models: a survey. Journal of Economic
Literature 19(4): 1483-1536.
Amemiya, Takeshi (1984). Tobit models: a survey. Journal of Econometrics 24(1): 3-61.
Baltagi, Badi H. (2013). Econometric Analysis of Panel Data (5th edn). Chichester:
Wiley.
Box, George E.P., and David R. Cox (1964). An analysis of transformations. Journal of the
Royal Statistical Society Series B 26(2): 211-43.
Box, George E.P., and Norman R. Draper (1987). Empirical Model-Building and Response
Surfaces. New York: Wiley.
Box, George E.P., and Gwilym M. Jenkins (1970). Time Series Analysis: Forecasting and
Control. San Francisco: Holden Day.
Box, George E.P., Gwilym M. Jenkins, and Gregory C. Reinsel (1994). Time Series Analysis:
Forecasting and Control (3rd edn). Englewood Cliffs, NJ: Prentice-Hall.
Breusch, Trevor S. (1978). Testing for autocorrelation in dynamic linear models. Australian
Economic Papers 17(31): 334-55.
Brown, T.M. (1952). Habit persistence and lags in consumer behaviour. Econometrica
20(3): 355-71.
Card, David (1995). Using geographic variation in college proximity to estimate the return
to schooling. In Louis N. Christofides, E. Kenneth Grant, and Robert Swidinsky (eds),
Aspects of Labour Market Behaviour: Essays in Honour of John Vanderkamp. Toronto:
University of Toronto Press.
Chow, Gregory C. (1960). Tests of equality between sets of coefficients in two linear
regressions. Econometrica 28(3): 591-605.
Cobb, Charles W., and Paul H. Douglas (1928). A theory of production. American
Economic Review 18(1, Suppl.): 139-65.
Cooper, Ronald L. (1972). The predictive performance of quarterly econometric models of
the United States. In Bert G. Hickman (ed.), Econometric Models of Cyclical Behavior,
Vol. II. New York: Columbia University Press.
Court, Andrew T. (1939). Hedonic price indexes with automotive examples. In The
Dynamics of Automobile Demand, Papers presented at a joint meeting of the American
Statistical Association and the Econometric Society in Detroit, December 1938. New
York: General Motors Corporation.
Davidson, James E.H. (2000). Econometric Theory. Oxford: Blackwell.
Davidson, Russell, and James G. MacKinnon (1993). Estimation and Inference in
Econometrics. New York: Oxford University Press.
578 Bibliography

Dickey, David A., and Wayne A. Fuller (1979). Distribution of the estimators for autore-
gressive time series with a unit root. Journal of the American Statistical Association
74(366): 427-31.
Dickey, David A., and Wayne A. Fuller (1981). Likelihood ratio statistics for autoregressive
time series with a unit root. Econometrica 49(4): 1057-72.
Diebold, Francis X. (1998). The past, present, and future of macroeconomic forecasting.
Journal of Economic Perspectives 12(2): 175-92.
Diebold, Francis X. (2001). Elements of Forecasting (2nd edn). Cincinnati, OH:
South-Western.
Durbin, James (1954). Errors in variables. Review of the International Statistical Institute
22(1): 23-32.
Durbin, James (1970). Testing for serial correlation in least-squares regression when some
of the regressors are lagged dependent variables. Econometrica 38(3): 410-21.
Durbin, James, and G.S. Watson (1950). Testing for serial correlation in least-squares
regression I. Biometrika 37(3-4): 409-28.
Durlauf, Steven N., and Peter C.B. Phillips (1988). Trends versus random walks in time
series analysis. Econometrica 56(6): 1333-54.
Elliott, Graham, Thomas J. Rothenberg, and James H. Stock (1996). Efficient tests for an
autoregressive unit root. Econometrica 64(4): 813-36.
Engle, Robert F., and Clive W.J. Granger (1987). Co-integration and error correction:
representation, estimation, and testing. Econometrica 55(2): 251-76.
Engle, Robert F., and Clive W.J. Granger (1991). Long-Run Economic Relationships:
Readings in Cointegration (editors). Oxford: Oxford University Press.
Fowler, Floyd J. (2009). Survey Research Methods (4th edn). Thousand Oaks, CA: Sage.
Friedman, Milton (1957). A Theory of the Consumption Function. Princeton, NJ: Princeton
University Press.
Frisch, Ragnar, and Frederick V. Waugh (1933). Partial time regressions as compared with
individual trends. Econometrica 1(4): 387-401.
Godfrey, Leslie G. (1978). Testing against general autoregressive and moving average error
models when the regressors include lagged dependent variables. Econometrica 46(6):
1293-1301.
Goldfeld, Stephen M., and Richard E. Quandt (1965). Some tests for homoscedasticity.
Journal of the American Statistical Association 60(310): 539-47.
Granger, Clive W.J., and Paul Newbold (1974). Spurious regressions in econometrics.
Journal of Econometrics 2(2): 111-20.
Greene, William (2011). Econometric Analysis (7th edn). Upper Saddle River, NJ: Prentice
Hall.
Gronau, Reuben (1974). Wage comparisons-a selectivity bias. Journal of Political
Economy 82(6): 1119-55.
Hamilton, James D. (1994). Time Series Analysis. Princeton, NJ: Princeton University Press.
Hausman, Jerry A. (1978). Specification tests in econometrics. Econometrica 46(6):
1251-71.
Heckman, James (1976). The common structure of statistical models of truncation, sam-
ple selection, and limited dependent variables and a simple estimator for such models.
Annals of Economic and Social Measurement 5(4): 475-92.
Hendry, David F. (1979). Predictive failure and econometric modelling in macroeconomics:
the transactions demand for money. In Paul Ormerod (ed.), Modelling the Economy.
London: Heinemann.
Hendry, David F., and Grayham E. Mizon (1978). Serial correlation as a convenient
simplification, not a nuisance. Economic Journal 88(351): 549-63.
Holden, Darryl, and Roger Perman (2007). Unit roots and cointegration for the econo-
mist. In B. Bhaskara Rao (ed.), Cointegration for the Applied Economist (2nd edn).
Basingstoke: Palgrave Macmillan.
Hsiao, Cheng (2015). Analysis of Panel Data (3rd edn). Cambridge: Cambridge University
Press.
Kalecki, Michał (1935). A macrodynamic theory of business cycles. Econometrica 3(3):
327-44.
Koyck, Leendert M. (1954). Distributed Lags and Investment Analysis. Amsterdam:
North-Holland.
Lintner, John (1956). Distribution of incomes of corporations among dividends, retained
earnings, and taxes. American Economic Review 46(2): 97-113.
Liviatan, Nissan (1963). Tests of the Permanent-Income Hypothesis based on a reinterview
savings survey. In Carl Christ (ed.), Measurement in Economics. Stanford, CA: Stanford
University Press.
Lovell, Michael C. (1963). Seasonal adjustment of economic time series. Journal of the
American Statistical Association 58: 993-1010.
MacKinnon, James G., and Halbert White (1985). Some heteroskedasticity-consistent
covariance matrix estimators with improved finite sample properties. Journal of
Econometrics 29(3): 305-25.
Nelson, Charles R., and Charles I. Plosser (1982). Trends and random walks in macro-
economic time series: some evidence and implications. Journal of Monetary Economics
10(2): 139-62.
Nerlove, Marc (1963). Returns to scale in electricity supply. In Carl Christ (ed.),
Measurement in Economics. Stanford, CA: Stanford University Press.
Park, Rolla E., and Bridget M. Mitchell (1980). Estimating the autocorrelated error model
with trended data. Journal of Econometrics 13(2): 185-201.
Peach, James T., and James L. Webb (1983). Randomly specified macroeconomic models:
some implications for model selection. Journal of Economic Issues 17(3): 697-720.
Phillips, Peter C.B. (1986). Understanding spurious regressions in econometrics. Journal of
Econometrics 33(3): 311-40.
Rubin, Herman (1950). Consistency of maximum-likelihood estimates in the explosive
case. In T.C. Koopmans (ed.), Statistical Inference in Dynamic Economic Models. New
York: John Wiley.
Salkever, David S. (1976). The use of dummy variables to compute predictions, prediction
errors and confidence intervals. Journal of Econometrics 4(4): 393-7.
Sims, Christopher A. (1980). Macroeconomics and reality. Econometrica
48(1): 1-48.
Stock, James H. (1987). Asymptotic properties of least squares estimators of cointegrating
vectors. Econometrica 55(5): 1035-56.
Tinbergen, Jan (1939). Statistical Testing of Business Cycle Theories. No. 2, Business
Cycles in the United States of America 1919-1932. Geneva: League of Nations.
Tobin, James (1958). Estimation of relationships for limited dependent variables.
Econometrica 26(1): 24-36.
Waugh, Frederick V. (1929). Quality as a Determinant of Vegetable Prices. New York:
Columbia University Press.
White, Halbert (1980). A heteroskedasticity-consistent covariance matrix estimator and a
direct test for heteroskedasticity. Econometrica 48(4): 817-38.
Wichern, Dean W. (1973). The behaviour of the sample autocorrelation function for an
integrated moving average process. Biometrika 60(2): 235-9.
Wooldridge, Jeffrey M. (2010). Econometric Analysis of Cross Section and Panel Data
(2nd edn). Cambridge, MA: MIT Press.
Wu, De-Min (1973). Alternative tests of independence between stochastic regressors and
disturbances. Econometrica 41(4): 733-50.

Author Index

Amemiya, Takeshi 370, 376, 386

Baltagi, Badi H. 529
Box, George E.P. 211, 440, 489, 503
Breusch, Trevor S. 451
Brown, T.M. 420

Card, David 360
Chow, Gregory C. 255
Cobb, Charles W. 412
Cooper, Ronald L. 439
Court, Andrew T. 190
Cox, David R. 211

Davidson, Russell 522
Dickey, David A. 516
Diebold, Francis X. 440, 443
Douglas, Paul H. 412
Draper, Norman R. 503
Durbin, James 338, 451
Durlauf, Steven N. 525

Elliott, Graham 518
Engle, Robert F. 522, 527

Fowler, Floyd J. 178
Friedman, Milton 323
Frisch, Ragnar 491
Fuller, Wayne A. 516

Godfrey, Leslie G. 451
Goldfeld, Stephen M. 296
Granger, Clive W.J. 491, 522, 525, 527
Greene, William 3
Gronau, Reuben 386

Hamilton, James D. 507
Hausman, Jerry A. 338
Heckman, James 386, 390
Hendry, David F. 460
Holden, Darryl 507
Hsiao, Cheng 529

Jenkins, Gwilym M. 440, 489, 503

Kalecki, Michał 420

Lintner, John 425
Liviatan, Nissan 335
Lovell, Michael C. 491

MacKinnon, James G. 306, 522
Mitchell, Bridget M. 457
Mizon, Grayham E. 460

Nelson, Charles R. 489, 499, 519, 524
Nerlove, Marc 288
Newbold, Paul 491, 525

Park, Rolla E. 457
Peach, James T. 472
Perman, Roger 507
Phillips, Peter C.B. 494, 525
Plosser, Charles I. 489, 499, 519, 524

Quandt, Richard E. 296

Reinsel, Gregory C. 489, 503
Rothenberg, Thomas J. 518
Rubin, Herman 514

Sims, Christopher A. 441
Stock, James H. 518, 522

Tinbergen, Jan 438
Tobin, James 386

Watson, G.S. 452
Waugh, Frederick V. 195, 491
Webb, James L. 472
White, Halbert 297, 305, 306
Wichern, Dean W. 502
Wooldridge, Jeffrey M. 529
Wu, De-Min 338
Acceptance region 40-2 tests for 449-53


definition 41 Breusch-Godfrey test 450-1
Adaptive expectations 421-2, 463 Durbin-Watson d test 451-3, 477
ADF test see Nonstationarity, detection Autocorrelation function 501
Adjusted R² 188 Autoregressive distributed lag (ADL)
ADL see Autoregressive distributed lag models 416-25
models ADL(1,0) model 417-22
Akaike Information Criterion (AIC) 510 adaptive expectations 421-2
AR see Autoregressive process dynamics 417-19
ARIMA (autoregressive integrated moving partial adjustment 419-20
average) process 489 ADL(1,1) model 421
ARMA (autoregressive moving average) definition of ADL model 416-17
process 440-1, 489 error correction model 421
Augmented Dickey-Fuller test 506-12; see also properties of regression coefficient
Nonstationarity, detection estimators 427-34
Autocorrelation (autocorrelated disturbance asymptotic normality of regression
term) coefficients 431
apparent, attributable to model consistency 429-30
misspecification finite-sample bias 428
functional misspecification 470 inference 432-434
omission of important variable 467-70 limiting distributions 431-4
autoregressive (AR) autocorrelation t tests 432-4
first order AR(1) 446 Autoregressive integrated moving average
higher order 447 (ARIMA) process 489
causes of 445 see also Time series processes
consequences for OLS estimators 447-9 Autoregressive moving average (ARMA)
common factor test 456, 460-1 process 440-1, 489
definition 117, 445 see also Time series processes
fitting a model subject to AR(1) Autoregressive (AR) process
autocorrelation 455-7 correlogram 501-6
Cochrane-Orcutt iterative procedure 459, disturbance term subject to AR see
460 Autocorrelation
Prais-Winsten correction 457 stationarity conditions 481
innovation 446-7
lagged dependent variable and Balanced panel see Panel data
autocorrelation 449 Bayes Information Criterion (BIC) 510
moving average autocorrelation 447 Bias
negative autocorrelation 446 definition of 27
Newey-West standard errors 457 possible trade-off with variance 31-3
positive autocorrelation 445-6 loss function 31
length of observation interval 446 mean square error criterion 32
robust standard errors 457 BIC (Bayes Information Criterion) 510
Subject Index 583

Binary choice models see Linear probability Data


model; Logit analysis; Probit analysis; Cross-sectional 113
Sample selection model; Tobit model Panel 113, 529
BLUE 117 Time series 113, 406
Breusch-Pagan Lagrange multiplier test 541 Data generation process (DGP) 405
Brown's habit persistence model 420-1 realization 406
Data sets for exercises see Appendix B
Censored regression model see Tobit model Demand Functions data set 408, 569-70
Central limit theorem 76-80 Demeaning of regressors 101, 219-22
Lindeberg-Levy 76, 78, 118, 129 Dependent variable in regression model 85
Lindeberg-Feller 118, 129 two decompositions of 95-6
Chi-squared distribution, critical values see Deterministic trend 487-8
Appendix A Table A.4 Detrending 524-5
Chow test 255-9 DGP see Data generation process
Classical linear regression model 113 Difference-stationarity 488-9
Cochrane-Orcutt iterative procedure 459, 460 Dickey-Fuller test see Nonstationarity,
Coefficient of determination see R² detection
Cointegrated time series, Cointegration see Differences in differences 544-6
Nonstationary time series processes Differences-in-differences (DIFFDIFF) data
Common factor test 460-2 set 576
Confidence interval 53-7 Discrete random variables see Random
regression coefficients 147-8, 169 variables
predictions 193 Distributed lags see Autoregressive distributed
Consistency 70-1 lag models
definition 70 Disturbance term 85
hyperconsistency 518 autocorrelated see Autocorrelation
of IV estimators 328-9 estimation of variance 135, 167
of OLS estimators 316 innovation 440
superconsistency 514, 522 noise 87
Consumer Expenditure Survey (CES2013) data origin of 86-7
set 570-2 standard error of regression equation 168
Consumption function white noise 440
Brown's habit-persistence model 420-1 see also Regression model assumptions
Friedman's Permanent Income Hypothesis Double structure of a random variable see
cointegration of consumption and Random variable, double structure
income 519-22 Dummy variables
critique of OLS estimation 322-3 benefits from use of 234
fitted using adaptive expectations Chow test 255-9
model 422-4 relationship with F test for full set of
permanent income and consumption, dummy variables 258-9
definitions 322-3 definition of 232
transitory income and consumption, dummy variable trap 242-4
definitions 323 F test of the joint explanatory power of a
Continuous random variables see Random set of dummy variables 240,
variables 252-3
Convergence in distribution 76-80 intercept dummy variable 232
Corrected R² 188 interpretation of coefficient, logarithmic
Correlation coefficient dependent variable 235
population 22 multiple categories of 237-40
sample 34 change of reference category 240-2
Correlogram 501-6 choice of reference (omitted) category 237
Cov see Covariance omitted category 237
Covariance reference category 237
definition 19 multiple sets of 244-6
estimator 34, 81-3 slope dummy variable 250-2
rules 20-1 definition 250
Cross-sectional data 113 t tests of dummy variable coefficients 234
Durbin-Watson d test 451-3, 477 F tests


table of dU and dL see Appendix A Table A.5 of goodness of fit of regression equation
Durbin-Wu-Hausman (DWH) test 338-40 multiple regression 182-4, 285-6
in context of simultaneous equations simple regression 150-1
estimation 361-2 of homoskedasticity (Goldfeld-Quandt
in context of fixed and random effects 540 test) 296
of joint explanatory power of group of
Educational attainment and wage equations explanatory variables 184-6
(EAWE) data sets 565-9 of set of dummy variables 240, 252-3
Educational expenditure data set (EDUC) 575 of validity of combining two samples to fit
Efficiency 28-31 regression (Chow test) 255-9
comparative concept 30 of validity of linear restriction 281-2
definition 29 First differences regression see Panel data
mean square error criterion 31-3 Fitted model 87; see also Regression model
Elasticity Fitted value 87-8
definition 201-2 Fixed effects regression 531-537; see also Panel
income, price elasticities 410 data
interpretation of elasticity 202 Friedman's Permanent Income Hypothesis see
Endogenous variable see Simultaneous Consumption function
equations estimation Frisch-Waugh-Lovell theorem 161, 524
Engel curve 204 graphing relationship between two variables in
Ensemble distribution 480-1, 484-6 multiple regression model 161-2
Error correction models 421, 526-7 Functional misspecification see Model
Errors in variables see Measurement errors misspecification
ESS see Explained sum of squares
Estimator Gauss-Markov theorem 137-8, 154-5, 166,
consistency 70-1 314
definition of 24-7 Goldfeld-Quandt test 296
difference between estimate and estimator 24 Goodness of fit, F test of 150-1, 182-4; see also
efficiency 28-33 R²
of population mean 24 Granger causality 443
of regression coefficients see Regression Granger-Newbold spurious
coefficients regressions 491-500
unbiasedness 27-8 gretl regression software 4
see also Indirect least squares; Instrumental
variables; Maximum likelihood; Habit persistence model see Consumption
Ordinary least squares; Two-stage least function
squares Hausman test see Durbin-Wu-Hausman test
Exact identification see Simultaneous equations Heckman two-step procedure see Sample
estimation selection model
Exogenous variable see Simultaneous equations Hedonic pricing 189-91
estimation Heteroskedasticity 290-310
Expectation see Expected value apparent heteroskedasticity caused by
Expected value functional misspecification 303-5
of continuous random variable 18 causes of 293-5
of discrete random variable 8-9 consequences for OLS estimators 292-3,
of function of continuous random variable 18 306-8
of function of discrete random variable 9-10 definition of 117, 292
rules 10-11 heteroskedasticity-consistent (robust) standard
Explained sum of squares (ESS) 108 errors 305-6
Explanatory variable 85; see also Regressor measures to mitigate 299-301
Extraneous information used to mitigate tests for
multicollinearity 178-9 Goldfeld-Quandt 296
White 297
F distribution, critical values see Appendix A weighted least squares (WLS)
Table A.3 regression 299-301
F statistic see F tests Homoskedasticity, definition 117, 290-2
Hyperconsistency 518 LFP2011 see Labor force participation data set


Hypothesis Likelihood function 393; see also Maximum
alternative, definition 37 likelihood estimation
null, definition 37 Likelihood ratio statistic, test 400
testing 37-42 Lindeberg-Levy central limit theorem 76, 78,
118, 129
Ideal proxy see Proxy variables Lindeberg-Feller central limit theorem 118, 129
Identification see Simultaneous equations Linear probability model 367-70
estimation problems with 368-70
ILS see Indirect least squares Linear restriction see Restriction
Imperfect proxy variable see Proxy variables Linearity of regression model
Inconsistency, definition 71, 493 in parameters 197
Independence of two random variables 19 in variables 197
Independent variable 85; see also Regressor Linearization of nonlinear regression model
Information criteria 510 disturbance term assumptions 208-9
Innovation 440; see also Disturbance term logarithmic model 202-3
Instrument see Instrumental variables semilogarithmic model 206
Instrumental variables (IV) estimators variable redefinition 197
asymptotic normality 330-2 Log-likelihood function 394; see also Maximum
comparison with OLS likelihood estimation.
Durbin-Wu-Hausman test 338-40, 361-2 Logarithmic model 201-5
simulation 332-5, 353-4 comparison with linear model 209-12, 402-4
consistency of IV estimator 329 disturbance term 208-9
definition of 328 Logarithmic transformations 201-12
multiple instruments 337 rules for 203
population variance of IV estimator Logit analysis, logit model 372-6
asymptotic 330-2 goodness of fit 375-6
finite sample, simulation 332-5 marginal effects 373
requirements for use 330 Loglinear model 202
use to fit Permanent Income Hypothesis Longitudinal data set see Panel data
model 335-6 Loss function 31
use in simultaneous equations LSDV regression see Least squares dummy
estimation 351-9 variable regression
Integrated time series 489
Interactive regressors 218-22; see also Dummy MA see Moving average process
variable, slope dummy variable Maximum likelihood estimation
Inverse of Mills's ratio 388; see also Sample (MLE) 391-400
selection model asymptotic efficiency 391
Irrelevant variables see Model misspecification goodness of fit 400
IV see Instrumental variables likelihood function 393
likelihood ratio test 400
Jacobian term 84 log-likelihood function 394
maximum likelihood principle 393
Labor force participation 2011 (LFP2011) data simple regression model 398-9
set 371, 387, 574 Mean of a random variable
Lag distribution 419 population mean 9
parsimonious 417 maximum likelihood estimator of 394, 395,
Lagged dependent variable see Autoregressive 398
distributed lag models generalized unbiased estimator 28
Lagged variable, definition of 413 sample mean 28
Least squares criterion estimator of population mean 28
linear regression 88, 91, 94 efficient 30
nonlinear regression 225 unbiased 28
see also Regression analysis, ordinary least variance of 25-7
squares Mean square error 32-3
Least squares dummy variable (LSDV) Measurement errors
regression 535-6 in dependent variable 320-2
Measurement errors (cont.) explanatory variables 178


in explanatory variable 318-20 reduction in correlation of explanatory
imperfect proxy variables 322 variables 178
proof of inconsistency of OLS use of extraneous information 178-9
estimators 319-20 use of theoretical restriction 179
see also Durbin-Wu-Hausman test; Multiple regression analysis 156; see also
Friedman's Permanent Income Regression analysis, two explanatory
Hypothesis variables
Mills's ratio 388; see also Sample selection
model National Longitudinal Survey of Youth 2000
MLE see Maximum likelihood estimation (NLSY2000) panel data set
Model A, B, C see Regression model see also Educational Attainment and Wage
Model misspecification Equations data set
functional form Nested models see Model specification
potential cause of apparent Newey-West standard error 457
autocorrelation 467-70 NLSY97 see National Longitudinal Survey of
potential cause of apparent Youth 1997 panel data set
heteroskedasticity 303-4 Noise see Disturbance term
irrelevant variables 272-5 Nonlinear regression 225-8
consequences of 273 fitted using grid search 424
omitted variables 261-9 fitted using iterative procedure 225-6
consequences of 262-6 fitted using nonlinear specification 227-8
derivation of bias 264 linearized by logarithmic transformation 202-3,
direction of bias 264 206
effect on R² 267-9 linearized by redefining variables 197
invalidation of statistical tests 265 use of
potential cause of apparent higher-order polynomials 217-18
autocorrelation 467-70 interactive terms 218-22
Model specification 261, 472-6 quadratic variables 215-17
comparison of alternative models 473-5 Nonlinear restriction see Restriction
general-to-specific and specific-to-general Nonsense regressions see Spurious regressions
approaches 472, 475-6 Nonstationarity, detection
nested and non-nested models 473-4 graphical techniques 501-6
see also Model misspecification autocorrelation function 501
Monte Carlo experiment 126-30 correlogram 501-3
see also Simulation experiment tests of deterministic trends 518
Moving average (MA) process 447 unit root tests
correlogram 502 ADF-GLS test 518
disturbance term subject to MA see Augmented Dickey-Fuller (ADF)
Autocorrelation tests 506-12
Multicollinearity Critical values Appendix A Tables A.6, A.7,
caused by correlated explanatory A.8
variables 174 Dickey-Fuller t test 508
caused by approximate linear relationship Dickey-Fuller scaled coefficient test 513-15
among explanatory variables 174 Dickey-Fuller F test 516
consequences 171 for trended processes 507-10
definition of 171 for untrended processes 510-12
different impact on F tests and t tests 183-4 power of unit root tests 516-17, 518
effect on prediction error 194-5 Nonstationary time series processes 484-9
exact multicollinearity 173 cointegration 519-22
dummy variable trap 242-4 definition 520-1
measures to mitigate simultaneous equations bias, asymptotic
combination of explanatory variables 178 attenuation 522
exclusion of explanatory variables 178 superconsistency of OLS 522
inclusion of additional explanatory tests for 522
variables 177 deterministic trend 487-8
increase in sample size 176-7 difference-stationarity 488-9
increase in mean square deviation of ensemble distribution 484-6
fitting models with nonstationary Parameter of regression model 85


processes 524-7 linearity in parameters 197
detrending 524-5 Partial adjustment model 419-20
differencing 525 Brown's habit persistence model 420-1
error correction models 526-7 Permanent income hypothesis see Consumption
integrated processes 489 function
random walk 485 Plim see Probability limit
random walk with drift 487 Polynomial regression specification 217-18
tests for nonstationarity see Nonstationarity, Pooled OLS regression see Panel data
detection Population covariance see Covariance
trend-stationarity 488-9 Population mean see Mean of a random
see also Nonstationarity, detection; Spurious variable
regressions Population variance of a random variable see
Nonstochastic regressors see Regressor Variance of a random variable
Normal distribution 35-6 Population variance of sample mean see Mean
asymptotic normality of a random variable
of IV estimators 330-2 Power of a test 43-7
of OLS estimators 128-9 definition 43
normal distribution table see Appendix A see also Tests, one-sided; Nonstationarity,
Table A.1 detection
standard normal distribution 36 Predetermined variables, use as
Normal equations instruments 436-7
multiple regression model 159-61 Prediction 191-5
simple regression model 93 confidence interval 193
error 192
OECD employment and GDP growth rates impact of multicollinearity 194-5
(OECD2000) data set 572-3 population variance 193
OLS see Regression analysis, ordinary least zero expected prediction error 192
squares Probability density function of random
Omitted category see Dummy variables variable 16
Omitted variables see Model misspecification Probability limit 68-9
One-sided test see t tests; Tests, one-sided definition of 68
Online Resource Centre 3 rules 71-2
Order condition for identification 358-9; see Probit analysis, probit model 378-9
also Simultaneous equations estimation marginal effects 378-9
Ordinary least squares (OLS) see Regression Proxy variables 276-9
analysis, ordinary least squares consequences of use of 276-7
Outliers 269 ideal proxy 276
Overidentification 356-8; see also Simultaneous imperfect 277, 322
equations estimation unintentional 278-9

p values 144-5; see also t tests Quadratic regression specification


Panel data 113, 529 215-17
appropriateness of OLS, fixed effects, random Qualitative response models see Linear
effects regressions 539-41 probability model; Logit analysis;
Breusch-Pagan Lagrange multiplier test 541 Probit analysis; Sample selection model;
Durbin-Wu-Hausman test 540 Tobit model
balanced panel 530 Qualitative explanatory variables see Dummy
definition 529 variables
fixed effects regressions 531-7
first differences 534 R² 107-11, 180-1
least squares dummy variable adjusted (corrected) 188
(LSDV) 535-6 alternative interpretation 110-11
within-groups 533-4 coefficient of determination 108
pooled OLS regression 541 definition 108, 180-1
random effects regression 537-9 effect of omitted variable on 267-9
unbalanced panel 530 F test of goodness of fit 150-1, 182
unobserved effect 532 Ramsey's RESET test 222
Random effects regression 537-9; see also Panel unbiasedness 122, 313-14
data two explanatory variables
Random variables analytical decomposition 165
continuous 7, 14-19 derivation of expressions 158-60
discrete 7-13 population variance 166-7
double structure 24 standard errors 167-9
expected value 8-9, 18-19 unbiasedness 165-6
fixed and random components 12-13 Regression model
independence of two random variables 19 assumptions 114-18, 164-5, 311-13, 405-8
standard deviation 11 fitted 87
variance 11-12, 18-19 Model A 113
Random walk 485 Model B 114, 311
with drift 487 Model C 114,405
Granger-Newbold spurious regressions Regressor (explanatory variable, independent
Rank condition for identification 359 variable)
Realization 23, 479; see also Data generation nonstochastic 118, 122
process stochastic 311
Reduced form equation 344; see also Reparameterization of model specification
Simultaneous equations estimation estimation of long-run effects in dynamic
Redundant variable see Model misspecification model 415-16, 418
Reference category see Dummy variables standard error of linear combination of
Regression analysis, ordinary least squares parameters 282-3
(OLS) t test of linear restriction 282-5
simple regression analysis 85-89 RESET test 222
least squares criterion 88, 91, 94 Residual
multiple regression analysis 156-62 definition of 88
normal equations 93, 159-61 OLS regressions with intercept
see also Disturbance term; Nonlinear zero correlation with explanatory
regression analysis; R²; Regression variables 107-8
model assumptions; Residual zero sample mean 106
Regression coefficients, IV see Instrumental use of outliers in improving model
variables specification 269
Regression coefficients, OLS Residual sum of squares (RSS) 88
as random variables 122-4 Restriction
asymptotic properties, Model B 315-17 benefits from exploitation 179, 280
asymptotic normality 317 definition
consistency 316 linear restriction 179, 280
reason for interest 315 nonlinear restriction 456
confidence intervals 147-8, 169 tests
effects of changes in units of variables 100-1 common factor test of nonlinear
hypothesis testing 139-44, 169; see also t tests restriction 460-2
inconsistency caused by likelihood ratio test 400
measurement error in explanatory F test of linear restriction 281-2
variable 318-20 F test of multiple linear restrictions 285
simultaneous equations bias 345-7 t test of linear restriction 284-5
interpretation use in mitigation of problem of
logarithmic model 202 multicollinearity 179, 280
multiple linear regression model 161-3 zero restrictions 285-6
semilogarithmic model 206 RSS see Residual sum of squares
simple linear regression model 98-100
one explanatory variable Sample selection model 386-90
analytical decomposition 118-22 Heckman two-step estimation procedure 387,
consistency 316 388
derivation of expressions 92-4 Sample mean see Mean of a random variable
Monte Carlo experiment 126-30 Sample selection bias see Sample selection model
population variance 130-3 School costs (SC) data set 573-4
standard errors 133-6 Schwarz Information Criterion (SIC) 510
Semilogarithmic model 205-8 Superconsistency 514, 522


Serial correlation see Autocorrelation
SIC (Schwarz Information Criterion) 510 t distribution
Significance level (size) of test, definition 41 table, critical values see Appendix A Table A.2
Simple regression analysis 85-9; see also t statistic 49
Regression analysis, one explanatory t tests 49-52
variable degrees of freedom 50
Simulation experiment 73-4; see also Monte estimation of sample mean 50
Carlo experiment multiple regression analysis 169
Simultaneous equations bias see Simultaneous simple regression analysis 140-1
equations estimation equivalence of t test of slope coefficient and F
Simultaneous equations estimation test, simple regression 152
Durbin-Wu-Hausman test 361-2 interpreted as marginal F test, multiple
endogenous variables 344 regression 186-8
exogenous variables 344 p values 144-5
identification regression coefficients 139-47
exact identification 355-6 reporting results 143-4
order condition 358-9 of linear restriction 284-5
overidentification 356-8 equivalence to F test 284-5
rank condition 359 see also Regression coefficients; Tests; Tests,
underidentification 354-5 one-sided
instrumental variables estimation 351-4 Tests
reduced form equation 344 power 43
simulation comparison of IV and OLS 348-50, significance level (size) 41
353-4 trade-off between size, power 43-47
simultaneous equations bias 345-8 Type I error 39
structural equation 344 Type II error 39
time series models 435-7 Tests, one-sided 58-68
two-stage least squares 357-8 anomalous results 66
unobserved heterogeneity 360-1 benefits from 61, 63-4
Size of a test 41 comparison of power with that of a two-sided
Slideshows 3 test 63-4
Slope dummy variables see Dummy variables justification 66-7
Specification error see Model misspecification logic underlying 58-60
Spurious regressions 490-500 power, compared with two-sided test 63-4
caused by deterministic trends 491 regression coefficients 145-7
Granger-Newbold random walks 492-500 sample mean 65-6
Standard deviation of a random variable 11 see also Tests; t tests
Standard error 49 Time series analysis 439-41
regression coefficient see Regression autoregressive moving average (ARMA)
coefficients, OLS models 440-1
regression equation 168-9 Box-Jenkins method 440-1
sample mean 49 forcing process 440
Static time-series models 408-10 see also Nonstationarity, detection; Time series
Stationarity see Stationary time series process processes
Stationary time series process 478-84 Time series data 113
conditions for stationarity 481 Time series processes
definition of stationarity 479-81 alternative dynamic representations 438-43
difference-stationarity 488-9 vector autoregressive (VAR) 441-3
ensemble distribution 484-6 Granger causality 443
strong stationarity 481 vector autoregressive moving average
trend-stationarity 488-9 (VARMA) 443
weak (covariance) stationarity 481 vector error correction models
Stochastic regressor see Regressor (VECM) 443
Structural equation 344; see also Simultaneous autocorrelation function 501
equations estimation autoregressive integrated moving average
Subject guide (Study guide) 3 (ARIMA) 489

Time series processes (cont.) possible trade-off with variance 31-3


autoregressive moving average see also Bias
(ARMA) 440-1 Underidentification 354-5; see also
correlogram 501-6 Simultaneous equations estimation
ensemble distribution 480-1, 484-6 Unit root tests see Nonstationarity, detection
univariate 439 Unobserved effect see Panel data
see also Nonstationary time series processes; Unobserved heterogeneity 360-1
Stationary time series processes; Time
series analysis Var see Variance of a random variable
Time series regression models VAR see Vector autoregression
dynamic 413-25 Variable misspecification see Model
estimation of long-run effects 415, 418 misspecification
simultaneous equations models 435-7 Variance of a random variable
predetermined variable as instrument 436-7 continuous random variable 18-19
static 408-10 discrete random variable 11-12, 18-19
see also Autoregressive distributed lag models; estimator 34, 81-3
Time series analysis; Time series maximum likelihood estimator 398
processes rules 21-2
Tobit analysis, tobit model 381-6 Vector autoregression (VAR) 441-3
Total sum of squares (TSS) 107 Granger causality 443
Trend-stationarity 488-9 Vector autoregressive moving average (VARMA)
TSLS see Two-stage least squares process 443
TSS see Total sum of squares Vector error correction model (VECM) 443
Two-stage least squares (TSLS) 357-8; see also
Simultaneous equations estimation Weighted least squares (WLS)
Type I, Type II errors 39, 43-7 regression 288-301
White heteroskedasticity-consistent standard
Unbalanced panel see Panel data errors 305-6
Unbiased estimator, unbiasedness White noise 440; see also Disturbance term
definition 27-8 White test for heteroskedasticity 297
of regression coefficients 122, 165-6, 313-14 Within-groups fixed effects regression see Panel
of sample mean 28 data
"Excellent textbook ... the explanations are very clear, and yet it is very concise
and does not overwhelm students."
Thomas Chadefaux, Trinity College Dublin

"This is the best introductory text for undergraduates on the market."


Bruce Morley, University of Bath

The most accessible econometrics text focusing on only the essential maths.

Keeping maths to a minimum, this book provides a non-technical introduction to econometrics, making
it the perfect companion for anyone new to the subject.

A revision chapter at the beginning of the book gives you the opportunity to brush up on statistics,
whilst diagrams have been included wherever possible to ensure clarity of explanation. Packed
with plenty of examples and regression exercises, including 50 on the same data set, Introduction to
Econometrics gives you lots of hands-on experience and ensures that you hone the skills needed to
successfully fit models given suitable data.

Whatever your level of experience, this book will develop your confidence in econometrics, providing
a launch pad for further study and equipping you with the tools to answer economic questions.

New to this edition:


Additional exercises included at the end of each chapter
Opening outlines have been added to the start of each chapter to further enhance clarity
and accessibility
Any non-essential equations have been stripped out to ensure that the text is accessible
to those with a limited background in mathematics
In the later chapters, short sections have been included which introduce the meaning
and application of more advanced topics
Further information sources have been included to help you to develop your learning
independently

Christopher Dougherty is at the London School of Economics and Political Science



online resource centre


www.oxfordtextbooks.co.uk/orc/dougherty5e/
This book is accompanied by an Online Resource Centre, which includes:
For students:
PowerPoint slides covering all the topics in the text
Datasets from the text available in Stata, Excel, EViews, and ASCII formats
Study guide with further exercises

For adopting lecturers (password protected):
Instructor's manual containing answers to the exercises in the text
Instructor PowerPoint slides

ISBN 978-0-19-967682-8
OXFORD
UNIVERSITY PRESS

www.oup.com
