ETH Zürich
cordas@ethz.ch
Economists use two main types of statistical models to forecast and provide
policy analysis.
1 Single-equation models study a variable of interest with a single
(linear or non-linear) function of a number of explanatory variables.
2 In multiple or simultaneous equation models, the variable of interest
is a function of several explanatory variables which are related to each
other with a set of equations.
Specific estimation techniques may be needed depending on the data type:
1 A time series is a time-ordered (daily, weekly, ...) sequence of data
(price, income, ...) which often requires special statistical treatment.
2 A cross section refers to data collected by observing many subjects
(individuals, firms or countries) at the same point in time. Its analysis
usually consists of comparing the differences among the subjects.
Here we provide some background on demand estimation and regression
analysis in the context of a single-equation approach.
From Example 1, we notice that, because the slope of the demand function is
negative, GZ’s games are an ordinary good (quantity demanded falls as the price rises).
We can also compute the price elasticity resulting directly from the price
change (arc elasticity):
\[ \frac{(Q_1 - Q_0)/Q_0}{(P_1 - P_0)/P_0} = \frac{(7000 - 6000)/6000}{(15 - 20)/20} = -\frac{2}{3} \]
Note that in the context of a linear function, the arc elasticity is equal to the
point elasticity:
\[ \frac{\partial Q}{\partial P} \cdot \frac{P_0}{Q_0} = -200 \cdot \frac{20}{6000} = -\frac{2}{3} \]
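The two elasticity calculations above can be checked numerically. The following is a minimal sketch, assuming the linear demand Q = 10000 − 200P, which is inferred from the slope (−200) and the point (P0, Q0) = (20, 6000) used above and is not stated explicitly in this section. The function names are illustrative. Note that the arc elasticity here uses the initial point as the base, as in the formula above, not the midpoint convention.

```python
from fractions import Fraction


def demand(price):
    """Assumed linear demand, inferred from the numbers above: Q = 10000 - 200 P."""
    return 10000 - 200 * price


def arc_elasticity(p0, p1):
    """Percentage change in Q over percentage change in P, both relative to the initial point."""
    q0, q1 = demand(p0), demand(p1)
    return Fraction(q1 - q0, q0) / Fraction(p1 - p0, p0)


def point_elasticity(p0, slope=-200):
    """Slope dQ/dP multiplied by P0/Q0."""
    return Fraction(slope * p0, demand(p0))


print(arc_elasticity(20, 15))  # -2/3
print(point_elasticity(20))    # -2/3
```

Exact fractions are used so that the equality of the two measures is not blurred by floating-point rounding.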
Economists usually plot the inverse demand function, i.e., with the price on the
y-axis and the quantity on the x-axis. The inverse demand function is useful
in several contexts.
Revenue-maximizing output level
Demand Aggregation
Price   Domestic demand (QD)   Foreign demand (QF)   Total demand (Q)
100                        0                     0                  0
 80                    20000                     0              20000
 60                    40000                  5000              45000
 40                    60000                 10000              70000
...                      ...                   ...                ...
The above figures come from the following linear demand functions:
Domestic: P = 100 − 0.001 QD        Foreign: P = 80 − 0.004 QF
Inverting the above functions (QD = 100000 − 1000P and QF = 20000 − 250P) and
aggregating the quantities, we get the market demand (domestic + foreign) for
prices at which both markets buy (P ≤ 80),
QD + QF = Q = 120000 − 1250P ⇔ P = 96 − 0.0008Q (2)
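The aggregation can be reproduced with a short sketch. The function names are illustrative; the coefficients are those obtained by inverting the two demand functions given above, and quantities are truncated at zero above each market's choke price.

```python
def domestic_demand(price):
    """QD = 100000 - 1000 P, from P = 100 - 0.001 QD; zero above P = 100."""
    return max(0, 100000 - 1000 * price)


def foreign_demand(price):
    """QF = 20000 - 250 P, from P = 80 - 0.004 QF; zero above P = 80."""
    return max(0, 20000 - 250 * price)


def market_demand(price):
    """Horizontal sum: add quantities at a given price."""
    return domestic_demand(price) + foreign_demand(price)


for price in (100, 80, 60, 40):
    print(price, domestic_demand(price), foreign_demand(price), market_demand(price))
# 100 0 0 0
# 80 20000 0 20000
# 60 40000 5000 45000
# 40 60000 10000 70000
```

For prices above 80 only the domestic market buys, so the market demand curve has a kink at P = 80 and the aggregate formula Q = 120000 − 1250P applies only below that price.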
The parameters of model (4) could be estimated with the linear model (5).
The most popular technique to estimate the coefficients of functional forms
which are linear in the parameters is linear regression.
C. Ordás Criado (CEPE-ETH) Managerial Economics (Fall 2010) Demand Estimation 11 / 40
Linear Regression
Linear regression consists in finding the best-fitting line, i.e., the line that
minimizes the sum of squared deviations between the regression line and the
set of original data points. This technique is also known as the Ordinary Least
Squares (OLS) method.
Problem (7) has a closed form and unique solution when the explanatory
variables are linearly independent, i.e., no exact linear relationships exist
between two or more explanatory variables.
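As an illustration of the closed-form solution, here is a minimal sketch that solves the normal equations (X'X)b = X'y with NumPy. The data are hypothetical and constructed to be exactly linear, so the known coefficients are recovered; the function name `ols` is an illustrative choice, not a routine from the course material.

```python
import numpy as np


def ols(X, y):
    """OLS coefficients for y = X b + e. X must already contain a column of ones."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # The solution is unique only if the columns of X are linearly independent.
    if np.linalg.matrix_rank(X) < X.shape[1]:
        raise ValueError("explanatory variables are linearly dependent")
    # Normal equations: (X'X) b = X'y
    return np.linalg.solve(X.T @ X, X.T @ y)


# Hypothetical, exactly linear data: Q = 10000 - 200 P
price = np.array([20.0, 18.0, 15.0, 12.0, 10.0])
quantity = 10000 - 200 * price

X = np.column_stack([np.ones_like(price), price])  # intercept + price
beta = ols(X, quantity)
print(beta)  # approximately [10000, -200]
```

If a regressor is an exact linear combination of the others (for example, the price entered twice in different units), the rank check fails and no unique solution exists.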
Most statistical software packages provide built-in routines/functions to
perform regression analysis (Excel, Matlab, R, SPSS, S-Plus, Stata, ...).
where ȳ is the mean of the yi's and $\bar{\hat{y}}$ is the mean of the fitted values (the ŷi's).
Note that R ∈ [0, 1]. The closer R is to 1, the better the fit.
Other important goodness-of-fit measures (the R² and the F-statistic) rely
on a decomposition of the variation of the dependent variable y into ‘total’,
‘explained’ and ‘unexplained’ variation:
\[ SST = \sum_{i=1}^{n} (y_i - \bar{y})^2 \quad \text{sum of squared deviations in } y \equiv \text{total variation} \tag{9} \]
\[ SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 \quad \text{sum of squares of regression} \equiv \text{explained variation} \tag{10} \]
\[ SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \quad \text{sum of squared errors} \equiv \text{unexplained variation} \tag{11} \]
The R² in (12) is equal to the square of R in (8) only when regression (6)
includes an intercept. The closer the R² is to 1, the larger the share of
variation explained by the model.
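The decomposition can be verified numerically. The sketch below uses hypothetical price and quantity observations (not the course data) and assumes that the R² is the usual ratio SSR/SST; it also checks that, with an intercept, the R² equals the squared correlation between y and the fitted values.

```python
import numpy as np

# Hypothetical observations (illustrative only)
price = np.array([20.0, 18.0, 15.0, 12.0, 10.0, 8.0])
quantity = np.array([5900.0, 6500.0, 7000.0, 7500.0, 8100.0, 8300.0])

X = np.column_stack([np.ones_like(price), price])  # intercept + price
beta, *_ = np.linalg.lstsq(X, quantity, rcond=None)
fitted = X @ beta

y_bar = quantity.mean()
sst = np.sum((quantity - y_bar) ** 2)   # total variation
ssr = np.sum((fitted - y_bar) ** 2)     # explained variation
sse = np.sum((quantity - fitted) ** 2)  # unexplained variation

r_squared = ssr / sst                               # assumed definition of the R²
multiple_r = np.corrcoef(quantity, fitted)[0, 1]    # correlation between y and fitted y

print(f"SST = {sst:.1f}, SSR + SSE = {ssr + sse:.1f}")
print(f"R^2 = {r_squared:.4f}, R squared = {multiple_r ** 2:.4f}")
```

Dropping the column of ones from X generally breaks the identity SST = SSR + SSE, which is why the equality between the R² and the square of R requires an intercept.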
Note that adding explanatory variables to the regression never lowers the R².
The R² can be compared across models as long as the y variable shares the
same units of measurement.
Note that the adjusted R² (R²adj) is not the share of total variance explained
by the regression model (it can be negative even in the presence of an intercept).
Preference should be given to the R²adj when comparing regression models
with different numbers of predictors.
The closer the R²adj is to 1, the better the model.
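The difference between the two measures can be illustrated with a small sketch. It assumes the standard textbook definition R²adj = 1 − [SSE/(n − K)] / [SST/(n − 1)], where K counts all estimated coefficients including the intercept; this formula and the hypothetical data, including the deliberately arbitrary extra regressor, are assumptions made for illustration.

```python
import numpy as np


def r2_and_adjusted(X, y):
    """Return (R^2, adjusted R^2) for an OLS fit; X must include a column of ones."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sse = resid @ resid
    sst = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - sse / sst
    adj = 1.0 - (sse / (n - k)) / (sst / (n - 1))
    return r2, adj


# Hypothetical observations (illustrative only)
price = np.array([20.0, 18.0, 15.0, 12.0, 10.0, 8.0])
quantity = np.array([5900.0, 6500.0, 7000.0, 7500.0, 8100.0, 8300.0])
unrelated = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0])  # arbitrary extra regressor

ones = np.ones_like(price)
r2_small, adj_small = r2_and_adjusted(np.column_stack([ones, price]), quantity)
r2_big, adj_big = r2_and_adjusted(np.column_stack([ones, price, unrelated]), quantity)

print(f"price only:        R^2 = {r2_small:.4f}, adjusted = {adj_small:.4f}")
print(f"price + unrelated: R^2 = {r2_big:.4f}, adjusted = {adj_big:.4f}")
```

The R² of the larger model is never below that of the smaller one, whereas the adjusted measure charges for the lost degree of freedom and may fall when the added variable contributes little.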
[Figure: densities of the F distribution for (df1, df2) = (1, 10), (2, 10), (5, 10) and (5, 100), with 5% critical regions; x-axis from 0 to 10, y-axis: density.]
Rejecting the null hypothesis of the F-test indicates that the regression’s
predictors, taken as a whole, explain a statistically significant portion
of the variation in the dependent variable y. We can then proceed to
analyze the relationship between each explanatory variable and y.
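A sketch of the overall F-test follows. It assumes the standard form F = [SSR/(K − 1)] / [SSE/(n − K)], with K the number of estimated coefficients including the intercept and the null hypothesis that all slope coefficients are zero. The data are hypothetical, and SciPy's `scipy.stats.f` supplies the p-value and the critical value.

```python
import numpy as np
from scipy import stats


def f_test(X, y):
    """Overall F-statistic and p-value for an OLS fit; X must include a column of ones."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    ssr = np.sum((fitted - y.mean()) ** 2)  # explained variation
    sse = np.sum((y - fitted) ** 2)         # unexplained variation
    f_stat = (ssr / (k - 1)) / (sse / (n - k))
    p_value = stats.f.sf(f_stat, k - 1, n - k)  # upper-tail probability
    return f_stat, p_value


# Hypothetical observations (illustrative only)
price = np.array([20.0, 18.0, 15.0, 12.0, 10.0, 8.0])
quantity = np.array([5900.0, 6500.0, 7000.0, 7500.0, 8100.0, 8300.0])
X = np.column_stack([np.ones_like(price), price])

alpha = 0.05
f_stat, p_value = f_test(X, quantity)
critical = stats.f.ppf(1 - alpha, X.shape[1] - 1, X.shape[0] - X.shape[1])

print(f"F = {f_stat:.2f}, critical value = {critical:.2f}, p-value = {p_value:.5f}")
print("reject H0" if f_stat > critical else "do not reject H0")
```

Comparing the statistic with the critical value and comparing the p-value with α are equivalent decision rules.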
Before interpreting their sign and magnitude, the precision and
reliability of each individual coefficient can be assessed with the help of:
The detailed calculation of seβ̂k is not shown here (it is part of any standard
regression output).
When the size of a coefficient (or its deviation from some hypothesized value)
is large compared to its standard error, the relationship between xk and y is
expected to be strong.
[Figure: densities of the Student t distribution (df = 1 and df = 3) and of the standard normal distribution, with 5% critical regions; x-axis from −10 to 10, y-axis: density.]
\[ \hat{\beta}_k \pm t^{*}_{(n-K,\,\alpha/2)} \, se_{\hat{\beta}_k} \tag{16} \]
If that interval does not include some arbitrary (and possibly zero) value β*,
the regression coefficient is significantly different from β* at the α
significance level.
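The individual t-tests and the confidence interval (16) can be sketched as follows. The standard errors are computed with the usual OLS formula, the square root of the diagonal of s²(X'X)⁻¹ with s² = SSE/(n − K); since the detailed calculation is not shown in these slides, that formula, the function name and the hypothetical data are assumptions made for illustration.

```python
import numpy as np
from scipy import stats


def coefficient_inference(X, y, alpha=0.05):
    """Estimates, standard errors, t-statistics (H0: beta_k = 0), p-values and CIs."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    s2 = resid @ resid / (n - k)              # estimated error variance
    se = np.sqrt(s2 * np.diag(xtx_inv))       # standard errors of the coefficients
    t_stat = beta / se
    p_val = 2 * stats.t.sf(np.abs(t_stat), n - k)   # two-sided p-values
    t_crit = stats.t.ppf(1 - alpha / 2, n - k)      # critical value t(n-K, alpha/2)
    lower, upper = beta - t_crit * se, beta + t_crit * se
    return beta, se, t_stat, p_val, lower, upper


# Hypothetical observations (illustrative only)
price = np.array([20.0, 18.0, 15.0, 12.0, 10.0, 8.0])
quantity = np.array([5900.0, 6500.0, 7000.0, 7500.0, 8100.0, 8300.0])
X = np.column_stack([np.ones_like(price), price])

beta, se, t_stat, p_val, lower, upper = coefficient_inference(X, quantity)
for name, b, s, t, p, lo, hi in zip(["intercept", "price"], beta, se, t_stat, p_val, lower, upper):
    print(f"{name}: estimate = {b:.2f}, se = {s:.2f}, t = {t:.2f}, "
          f"p = {p:.4f}, 95% CI = [{lo:.2f}, {hi:.2f}]")
```

A coefficient whose interval excludes zero has a two-sided p-value below α, so both views lead to the same conclusion.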
Once you have carried out the appropriate individual t-tests on the β̂k's, you
can proceed to interpret the coefficients.
Note that when you have more than one explanatory variable in a regression,
the regression coefficients are partial regression coefficients, i.e.,
β̂1 ≠ cor(Q̃, P̃) in equation (17).
A Regression Example with Excel 2007/2010
To replicate this example, use the file regression.xlsx from the course
website. These data are from Hirschey (2009, p. 190).
We estimate the following single equation demand model:
UNIT SOLD = β0 + β1 PRICE + β2 ADVERT + β3 PERS SELL + ε (19)
To perform regression analysis with Excel 2007/2010, you first need to
enable Excel’s Data Analysis Toolbox:
1 Go to the File tab (or click the Office button) and then click on Options.
2 Click on Add-Ins, select ‘Analysis Toolpak’ in the ‘Inactive Application
Add-ins’ list and click on the ‘Go...’ button.
3 The ‘Add-Ins’ window will pop up. Select ‘Analysis Toolpak’ and click
OK.
You can check that the Data Analysis Toolbox has been properly enabled by
selecting the Data tab in Excel and checking that the ‘Data Analysis’ option
is available under the ‘Analysis’ buttons.
Then open regression.xlsx in Excel and use the data in the ‘data’ sheet.
A Regression Example with Excel - Steps (1) and (2)
Select ‘Analysis Toolpak’ and click on the ‘Go. . . ’ button
A Regression Example with Excel - Step (3)
To replicate the results, use the same options in the Regression Window.
A Regression Example with Excel - Regression Output