Lecture 4 Day 3 Stochastic Frontier Analysis

Efficiency and Productivity Measurement:
Stochastic Frontier Analysis
D.S. Prasada Rao

School of Economics
The University of Queensland, Australia
Stochastic Frontier Analysis

It is a parametric technique that uses standard
production function methodology.
The approach explicitly recognises that the production
function represents the technically maximum feasible
output level for a given level of inputs.
The Stochastic Frontier Analysis (SFA) technique
may be used in modelling functional relationships
where you have theoretical bounds:
Estimation of cost functions and the study of cost efficiency
Estimation of revenue functions and revenue efficiency
This technique is also used in the estimation of multi-output
and multi-input distance functions
Potential for applications in other disciplines
Some history.
Much of the work on stochastic frontiers began in
the 1970s, with major contributions from Aigner, Schmidt,
Lovell, Battese, Coelli and Kumbhakar.
Ordinary least squares (OLS) regression
production functions:
fit a function through the centre of the data
assumes all firms are efficient
Deterministic production frontiers:

fit a frontier function over the data
assumes there is no data noise
SFA production frontiers are a mix of these two.

Production Function
It is a relationship between output and a set of
input quantities.
We use this when we have a single output
In case of multiple outputs:
people often use revenue (adjusted for price differences) as
an output measure
It is possible to use multi-output distance functions to study
production technology.
The functional relationship is usually written in
the form:
$q = f(x_1, x_2, \ldots, x_N) + v$
Production Function Specification

A number of different functional forms are used in the
literature to model production functions:
Cobb-Douglas (linear logs of outputs and inputs)

Quadratic (in inputs)
Normalised quadratic
Translog function
$\ln q = \beta_0 + \sum_{n=1}^{N} \beta_n \ln x_n + \frac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} \beta_{nm} \ln x_n \ln x_m + u$
The translog function is very commonly used; it is a generalisation of the
Cobb-Douglas function.
It is a flexible functional form providing a second-order approximation.
Cobb-Douglas and translog functions are linear in parameters and can
be estimated using least squares methods.
It is possible to impose restrictions on the parameters (homogeneity
conditions).
Cobb-Douglas Functional form

Linear in logs
Advantages:
easy to estimate and interpret
requires estimation of few parameters: K+3
Disadvantages:
simplistic - assumes all firms have the same
production elasticities and that substitution
elasticities equal 1
Translog Functional form

Quadratic in logs
Advantages:
flexible functional form - fewer restrictions on
production elasticities and substitution
elasticities
Disadvantages:
more difficult to interpret
requires estimation of many parameters:
K+3+K(K+1)/2
can suffer from curvature violations
Functional forms
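Cobb-Douglas:
$\ln q_i = \beta_0 + \beta_1 \ln x_{1i} + \beta_2 \ln x_{2i} + v_i - u_i$

Translog:
$\ln q_i = \beta_0 + \beta_1 \ln x_{1i} + \beta_2 \ln x_{2i} + 0.5\beta_{11} (\ln x_{1i})^2 + 0.5\beta_{22} (\ln x_{2i})^2 + \beta_{12} \ln x_{1i} \ln x_{2i} + v_i - u_i$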
Interpretation of estimated parameters

Cobb-Douglas:
Production elasticity for j-th input is: $E_j = \beta_j$
Scale elasticity is: $\varepsilon = E_1 + E_2$
Translog:
Production elasticity for i-th firm and j-th input is:
$E_{ji} = \beta_j + \beta_{j1} \ln x_{1i} + \beta_{j2} \ln x_{2i}$
Scale elasticity for i-th firm is: $\varepsilon_i = E_{1i} + E_{2i}$
Note: If we use transformed data where inputs are
measured relative to their means, then the translog
elasticities at the means are simply $\beta_j$, because
the log terms are zero there ($\ln 1 = 0$).
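The elasticity formulas above can be sketched in a few lines of Python. This is only an illustration for the two-input translog: the coefficient values and input levels are hypothetical, and the function name `translog_elasticities` is not part of any package mentioned in the lecture.

```python
import math

def translog_elasticities(beta, beta2, x):
    """Production elasticities E_j = beta_j + sum_k beta_jk * ln x_k and their sum.

    beta  : first-order coefficients [beta_1, beta_2]
    beta2 : symmetric matrix of second-order coefficients [[b11, b12], [b12, b22]]
    x     : input levels for one firm (already divided by sample means if desired)
    """
    logs = [math.log(v) for v in x]
    e = [beta[j] + sum(beta2[j][k] * logs[k] for k in range(len(x)))
         for j in range(len(x))]
    return e, sum(e)  # (elasticities, scale elasticity)

# Hypothetical coefficients, for illustration only
beta = [0.6, 0.3]
beta2 = [[0.10, -0.05], [-0.05, 0.08]]

# At the sample means of mean-scaled data every input equals 1, so ln x = 0
e_mean, scale_mean = translog_elasticities(beta, beta2, [1.0, 1.0])

# Away from the means the elasticities differ from firm to firm
e_other, scale_other = translog_elasticities(beta, beta2, [2.0, 0.5])
```

At the means the function returns the first-order coefficients themselves, which is the point made in the note on mean-scaled data.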
Test for Cobb-Douglas versus Translog

Using the sample data file which comes with the
FRONTIER program
H0: $\beta_{11} = \beta_{22} = \beta_{12} = 0$, H1: H0 false
Compute $LR = -2[LLF_0 - LLF_1]$, which is distributed as
chi-square with r degrees of freedom under H0 (r = number of restrictions).
For example, if:
LLF1 = -14.43, LLF0 = -17.03
LR = -2[-17.03 - (-14.43)] = 5.20
Since the $\chi^2_3$ 5% table value = 7.81 => do not reject H0
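The arithmetic of this test can be checked with a short Python sketch. The log-likelihood values and the 5% critical value of 7.81 are taken from the slide; the function name `lr_statistic` is only illustrative.

```python
def lr_statistic(llf_restricted, llf_unrestricted):
    """Likelihood ratio statistic LR = -2 * (LLF0 - LLF1)."""
    return -2.0 * (llf_restricted - llf_unrestricted)

# Values quoted on the slide: Cobb-Douglas (restricted) vs translog (unrestricted)
LLF0 = -17.03
LLF1 = -14.43
CHI2_3_CRIT_5PCT = 7.81  # chi-square table value, 3 restrictions, 5% level

lr = lr_statistic(LLF0, LLF1)
reject_h0 = lr > CHI2_3_CRIT_5PCT

print(round(lr, 2), reject_h0)  # 5.2 False
```

Because 5.20 is below 7.81, the Cobb-Douglas restrictions are not rejected in this example.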
Deterministic Frontier models

In this model all the errors are assumed to be
due to technical inefficiency; no account is
taken of noise.
Consider the following simple specification:
$\ln y_i = x_i \beta - u_i$ for firms $i = 1, 2, \ldots, N$
The $x_i$ are in logs and include a constant.
$u_i$ is a non-negative random variable.
Therefore $\ln y_i \le x_i \beta$.
Given the frontier nature of the model, we can
measure technical efficiency using:
$TE_i = \frac{y_i}{\exp(x_i \beta)} = \frac{\exp(x_i \beta - u_i)}{\exp(x_i \beta)} = \exp(-u_i)$
The frontier is deterministic since it is given by
$\exp(x_i \beta)$, which is non-random.
Estimation of Parameters:
Linear programming approach
$\min \sum_i u_i = \sum_i |\ln y_i - x_i \beta|$
subject to $u_i \ge 0$.

Aigner and Chu suggested a quadratic
programming approach
$\min \sum_i u_i^2 = \sum_i (\ln y_i - x_i \beta)^2$
subject to $u_i \ge 0$.
Afriat (1972) suggested the use of a Gamma
distribution for $u_i$ and the use of maximum
likelihood estimation.
Corrected ordinary least squares [COLS]
If we apply OLS, the intercept estimate is biased
downwards; all other parameters are unbiased.
COLS therefore corrects the OLS estimate of the
intercept.
If we do not wish to make use of any probability
distribution for $u_i$ then
$\hat\beta_0(\mathrm{COLS}) = \hat\beta_0(\mathrm{OLS}) + \max_i \{\hat u_i : i = 1, 2, \ldots, N\}$
where $\hat u_i$ is the OLS residual for the i-th firm.
If we assume that $u_i$ is distributed as Gamma
then
$\hat\beta_0(\mathrm{COLS}) = \hat\beta_0(\mathrm{OLS}) + \hat\sigma^2[\mathrm{OLS}]$
It is a bit more complicated if $u_i$ follows a half-normal distribution.
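The distribution-free COLS correction can be sketched as follows. This is a minimal illustration with one regressor and made-up data in logs, not the routine used by any particular package: it fits OLS, shifts the intercept up by the largest residual, and reads technical efficiency off the shifted frontier.

```python
import math

def ols_line(x, y):
    """OLS intercept and slope for a single regressor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

# Hypothetical data in logs (ln x_i, ln y_i), for illustration only
ln_x = [0.0, 1.0, 2.0, 3.0]
ln_y = [0.9, 1.4, 2.2, 2.5]

b0_ols, b1 = ols_line(ln_x, ln_y)
resid = [yy - (b0_ols + b1 * xx) for xx, yy in zip(ln_x, ln_y)]

# COLS: shift the intercept up so that the frontier bounds all observations
b0_cols = b0_ols + max(resid)

# Inefficiency u_i = frontier - observed (in logs), and TE_i = exp(-u_i)
u_hat = [b0_cols + b1 * xx - yy for xx, yy in zip(ln_x, ln_y)]
te = [math.exp(-u) for u in u_hat]
```

By construction the firm with the largest OLS residual lies on the frontier and receives an efficiency score of 1, which is one reason the deterministic approach is sensitive to noise and outliers.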
Production functions/frontiers
[Figure: OLS, deterministic and SFA production functions/frontiers]
Production functions/frontiers
OLS:
$q_i = \beta_0 + \beta_1 x_i + v_i$
Deterministic:
$q_i = \beta_0 + \beta_1 x_i - u_i$
SFA:
$q_i = \beta_0 + \beta_1 x_i + v_i - u_i$
where
$v_i$ = noise error term - symmetric
(e.g. normal distribution)
$u_i$ = inefficiency error term - non-negative
(e.g. half-normal distribution)
Stochastic Frontier: Model Specification

We start with the general production function as before and add a
new term that represents technical inefficiency.
This means that actual output is less than what is postulated
by the production function specified before.
We achieve this by subtracting u from the production
function.
Then we have
$q = f(x_1, x_2, \ldots, x_N) + v - u$
In the Cobb-Douglas production function with one input we
can write the stochastic frontier function for the i-th firm as:
$\ln q_i = \beta_0 + \beta_1 \ln x_i + v_i - u_i$
$q_i = \exp(\beta_0 + \beta_1 \ln x_i + v_i - u_i)$
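Stochastic frontiers
[Figure: output $y_i$ plotted against input $x_i$ for two firms A and B. It shows the deterministic frontier $q_i = \exp(\beta_0 + \beta_1 \ln x_i)$; the frontier outputs $q^*_A = \exp(\beta_0 + \beta_1 \ln x_A + v_A)$ and $q^*_B = \exp(\beta_0 + \beta_1 \ln x_B + v_B)$, which differ from the deterministic frontier by the noise effect; and the observed outputs $q_A = \exp(\beta_0 + \beta_1 \ln x_A + v_A - u_A)$ and $q_B = \exp(\beta_0 + \beta_1 \ln x_B + v_B - u_B)$, which fall below the frontier outputs by the inefficiency effect.]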
Stochastic Frontier: Model Specification

$q_i = \exp(\beta_0 + \beta_1 \ln x_i) \times \exp(v_i) \times \exp(-u_i)$
(deterministic component × noise × inefficiency)
In general, we write the stochastic frontier model with several inputs and
a general functional form (which is linear in parameters) as
$\ln q_i = x_i \beta + v_i - u_i$
We stipulate that $u_i$ is a non-negative random variable.
By construction the inefficiency term $\exp(-u_i)$ is always between 0 and 1.
This means that if a firm is inefficient, then it produces less than what is
expected from the inputs used by the firm at the given technology.
We can define technical efficiency as the ratio of observed or realised
output to the stochastic frontier output:
$TE_i = \frac{q_i}{\exp(x_i \beta + v_i)} = \frac{\exp(x_i \beta + v_i - u_i)}{\exp(x_i \beta + v_i)} = \exp(-u_i)$
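The three-way decomposition of output can be traced numerically. The sketch below uses hypothetical parameter values and a single hypothetical firm; it only illustrates that the ratio of observed output to stochastic frontier output collapses to exp(-u).

```python
import math

# Hypothetical values, for illustration only
beta0, beta1 = 1.0, 0.5   # frontier parameters
x = 4.0                   # input level
v = 0.10                  # noise draw (can be positive or negative)
u = 0.25                  # inefficiency draw (non-negative)

deterministic = math.exp(beta0 + beta1 * math.log(x))  # exp(b0 + b1 ln x)
frontier_output = deterministic * math.exp(v)          # stochastic frontier output
observed = frontier_output * math.exp(-u)              # realised output q_i

te = observed / frontier_output                        # technical efficiency
```

Note that with a positive noise draw the observed output can lie above the deterministic frontier while still lying below the firm's own stochastic frontier output.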
Stochastic Specifications
The SF model is specified as:
$\ln q_i = x_i \beta + v_i - u_i$
The following assumptions are made on the distributions of v
and u.
Standard assumptions of zero mean, homoskedasticity and
independence are made for the elements of $v_i$.
We assume that the $u_i$ are identically and independently
distributed non-negative random variables.
Further we assume that $v_i$ and $u_i$ are independently distributed.
The distributional assumptions are crucial to the estimation of
the parameters. Standard distributions used are:
Half-normal (truncated at zero): $u_i \sim iid\ N^{+}(0, \sigma_u^2)$
Exponential
Gamma distribution
Truncated normal distribution for u
[Figure: half-normal densities of u for var = 1, var = 4 and var = 9]
Distribution of u: $u_i \sim iid\ N^{+}(0, \sigma_u^2)$.
We note that: as u is truncated from a normal distribution
with mean equal to 0, most of its probability mass lies near zero,
so E(u) is pulled towards zero and therefore
technical efficiency tends to be high just by model
construction.
Truncated normal with non-zero means

[Figure: truncated normal densities f(x) for mu = -2, -1, 0, 1 and 2]
A more general specification: $u \sim N^{+}(\mu, \sigma_u^2)$
This forms the basis for the inefficiency effects model where
$u_i \sim N^{+}(\mu_i, \sigma_u^2)$
$\mu_i = \delta_0 + \sum_k \delta_k Z_{ki}$
Estimation of SF Models
Parameters to be estimated in a standard SF model
are: $\beta$, $\sigma_v^2$ and $\sigma_u^2$.
Likelihood methods are used in estimating the
unknown parameters. Coelli's (1995) Monte Carlo
study shows that in large samples MLE is better than
COLS.
Usually the variance parameters are reparametrized in
the following forms:
Aigner, Lovell and Schmidt: $\sigma^2 = \sigma_v^2 + \sigma_u^2$ and $\lambda^2 = \sigma_u^2 / \sigma_v^2 \ge 0$.
Battese and Corra: $\sigma^2 = \sigma_v^2 + \sigma_u^2$ and $\gamma = \sigma_u^2 / \sigma^2$.
Testing for the presence of technical inefficiency
depends upon the parametrization used.
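Moving between the two parametrizations is simple algebra, sketched below in Python. The inputs are the sigma-squared and gamma estimates reported in the FRONTIER output in this lecture; the function name `from_battese_corra` is hypothetical.

```python
import math

def from_battese_corra(sigma_sq, gamma):
    """Recover sigma_u^2, sigma_v^2 and lambda from (sigma^2, gamma).

    sigma^2 = sigma_v^2 + sigma_u^2 and gamma = sigma_u^2 / sigma^2,
    so sigma_u^2 = gamma * sigma^2 and sigma_v^2 = (1 - gamma) * sigma^2.
    """
    sigma_u_sq = gamma * sigma_sq
    sigma_v_sq = (1.0 - gamma) * sigma_sq
    lam = math.sqrt(sigma_u_sq / sigma_v_sq)  # Aigner-Lovell-Schmidt lambda
    return sigma_u_sq, sigma_v_sq, lam

# Estimates from the FRONTIER output shown in this lecture
sigma_u_sq, sigma_v_sq, lam = from_battese_corra(0.22170997, 0.88355629)
```

A gamma close to 1 means that most of the composed error variance is attributed to inefficiency, while a gamma of 0 corresponds to no inefficiency term at all.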
Estimation of SFA Models

In the case of the translog model, it is a good idea to
transform the data: divide each observation by its
mean.
Then the coefficients of ln Xi can be interpreted
as elasticities at the sample means.
Most standard packages, such as SHAZAM and
LIMDEP, can be used to estimate SF models.
FRONTIER by Coelli is a specialised program for
estimating SF models.
Available for free download from the CEPA
website: www.uq.edu.au/economics/cepa
FRONTIER Instruction File

Table: The FRONTIER Instruction File
1              1=ERROR COMPONENTS MODEL, 2=TE EFFECTS MODEL
chap9.txt      DATA FILE NAME
chap9_2.out    OUTPUT FILE NAME
1              1=PRODUCTION FUNCTION, 2=COST FUNCTION
y              LOGGED DEPENDENT VARIABLE (Y/N)
344            NUMBER OF CROSS-SECTIONS
1              NUMBER OF TIME PERIODS
344            NUMBER OF OBSERVATIONS IN TOTAL
10             NUMBER OF REGRESSOR VARIABLES (Xs)
n              MU (Y/N) [OR DELTA0 (Y/N) IF USING TE EFFECTS MODEL]
n              ETA (Y/N) [OR NUMBER OF TE EFFECTS REGRESSORS (Zs)]
n              STARTING VALUES (Y/N)

Here MU refers to inefficiency effects models and ETA refers to
time-varying inefficiency effects (we will come to this shortly).
The program uses the ratio of variances as the transformation.
It allows for the use of single cross-sections as well as panel data
sets.
FRONTIER output
the final mle estimates are :

                 coefficient        standard-error     t-ratio
beta 0           0.27436347E+00     0.39600416E-01     0.69282978E+01
beta 1           0.15110945E-01     0.67544802E-02     0.22371736E+01
beta 2           0.53138167E+00     0.79213877E-01     0.67081892E+01
beta 3           0.23089543E+00     0.74764329E-01     0.30883101E+01
beta 4           0.20327381E+00     0.44785423E-01     0.45388387E+01
beta 5          -0.47586195E+00     0.20221150E+00    -0.23532883E+01
beta 6           0.60884085E+00     0.16599693E+00     0.36677839E+01
beta 7           0.61740289E-01     0.13839069E+00     0.44613038E+00
beta 8          -0.56447322E+00     0.26523510E+00    -0.21281996E+01
beta 9          -0.13705357E+00     0.14081595E+00    -0.97328160E+00
beta10          -0.72189747E-02     0.92425705E-01    -0.78105703E-01
sigma-squared    0.22170997E+00     0.24943636E-01     0.88884383E+01
gamma            0.88355629E+00     0.36275231E-01     0.24357013E+02

mu is restricted to be zero
eta is restricted to be zero
log likelihood function = -0.74409920E+02
SF Models - continued
Predicting Firm Level Efficiencies:
Once the SF model is estimated using the MLE method, we
compute the following:
$u_i^* = -(\ln q_i - x_i \beta)\, \sigma_u^2 / \sigma^2$
and
$\sigma_*^2 = \sigma_v^2 \sigma_u^2 / \sigma^2$.
We use estimates of the unknown parameters in these equations
and compute the best predictor of technical efficiency for
each firm i:
$TE_i = E[\exp(-u_i) \mid q_i] = \left[ \frac{\Phi(u_i^*/\sigma_* - \sigma_*)}{\Phi(u_i^*/\sigma_*)} \right] \exp\left( \frac{\sigma_*^2}{2} - u_i^* \right)$.
We use the standard normal density and distribution functions to
evaluate technical efficiency.
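The firm-level predictor can be sketched in Python for the half-normal case. This is an illustrative implementation of the conditional-expectation formula above using only the standard library; the function name `te_half_normal` and the residual values fed to it are hypothetical, while the variance values are derived from the sigma-squared and gamma estimates in the FRONTIER output.

```python
import math
from statistics import NormalDist

def te_half_normal(resid, sigma_u_sq, sigma_v_sq):
    """Predict TE_i = E[exp(-u_i) | q_i] under the half-normal model.

    resid is the composed residual ln q_i - x_i * beta (an estimate of v_i - u_i).
    """
    sigma_sq = sigma_u_sq + sigma_v_sq
    u_star = -resid * sigma_u_sq / sigma_sq
    s_star = math.sqrt(sigma_v_sq * sigma_u_sq / sigma_sq)
    cdf = NormalDist().cdf  # standard normal distribution function
    ratio = cdf(u_star / s_star - s_star) / cdf(u_star / s_star)
    return ratio * math.exp(s_star ** 2 / 2.0 - u_star)

# Variances implied by sigma-squared = 0.2217 and gamma = 0.8836
sigma_u_sq = 0.88355629 * 0.22170997
sigma_v_sq = (1.0 - 0.88355629) * 0.22170997

# Hypothetical residuals: well below, on, and above the fitted frontier
te_low = te_half_normal(-0.5, sigma_u_sq, sigma_v_sq)
te_mid = te_half_normal(0.0, sigma_u_sq, sigma_v_sq)
te_high = te_half_normal(0.3, sigma_u_sq, sigma_v_sq)
```

Even a firm with a positive residual receives a predicted efficiency below 1, because part of that residual is attributed to favourable noise rather than to the absence of inefficiency.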
SF Models - continued
Industry efficiency:
Industry efficiency can be computed as the average of the
technical efficiencies of the firms in the sample.
Industry efficiency can be seen as the expected technical efficiency of a
randomly selected firm from the industry. Then we have
$TE = E[\exp(-u_i)] = 2\, \Phi(-\sigma_u) \exp(\sigma_u^2 / 2)$.
Confidence intervals for technical efficiency scores (for the
firms and the industry as a whole) can also be computed.
Unlike in DEA, there are no firms with a TE score of exactly 1.
No concept of peers exists in the case of SFA.
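The industry-level formula for the half-normal case can be evaluated directly. The sketch below plugs in the variance implied by the sigma-squared and gamma estimates in the FRONTIER output; the function name `industry_te_half_normal` is hypothetical.

```python
import math
from statistics import NormalDist

def industry_te_half_normal(sigma_u_sq):
    """E[exp(-u)] = 2 * Phi(-sigma_u) * exp(sigma_u^2 / 2) for half-normal u."""
    sigma_u = math.sqrt(sigma_u_sq)
    return 2.0 * NormalDist().cdf(-sigma_u) * math.exp(sigma_u_sq / 2.0)

# sigma_u^2 = gamma * sigma^2, using the estimates in the FRONTIER output
sigma_u_sq = 0.88355629 * 0.22170997
te_industry = industry_te_half_normal(sigma_u_sq)
```

The result is roughly 0.73, which is of the same order as the mean of the firm-level efficiency predictions reported by FRONTIER; the two need not coincide exactly because one is a population expectation and the other a sample average of conditional predictions.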
FRONTIER output
technical efficiency estimates :

firm    eff.-est.
1       0.77532384
2       0.72892751
3       0.77332991
...
341     0.76900626
342     0.92610064
343     0.81931012
344     0.89042718

mean efficiency = 0.72941885

Mean efficiency can be interpreted as the industry
efficiency.
Tests of hypotheses
e.g., Is there significant technical inefficiency?
H0: $\gamma = 0$ versus H1: $\gamma > 0$
Test options:
t-test
t-ratio = (parameter estimate) / (standard error)
Likelihood ratio (LR) test
[note that the above hypothesis is one-sided, therefore Kodde and Palm critical
values (not chi-square) must be used for the LR test]
The LR test is safer.
Likelihood ratio (LR) tests

Steps:
1) Estimate unrestricted model (LLF1)
2) Estimate restricted model (LLF0)
(e.g. set $\gamma = 0$)
3) Calculate LR = -2(LLF0 - LLF1)
4) Reject H0 if LR > $\chi^2_R$ table value,
where R = number of restrictions
(Note: Kodde and Palm tables must be
used if test is one-sided)
Example - estimate a translog production
function using the sample data file which
comes with the FRONTIER program (344 firms)
t-ratio for $\gamma$ = 24.36, and N(0,1) critical
value at 5% = 1.645 => reject H0
Or the LR statistic = 28.874, and Kodde
and Palm critical value at 5% = 2.71 =>
reject H0
The LR statistic has a mixed chi-square
distribution
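For a single restriction on the boundary of the parameter space, such as gamma = 0, the mixed chi-square distribution is commonly taken to be an equal mixture of a point mass at zero and a chi-square with one degree of freedom. Under that assumption, which is stated here rather than taken from the slide, the p-value can be sketched with the standard library alone. The function name `mixed_chi2_pvalue` is hypothetical.

```python
import math

def mixed_chi2_pvalue(lr):
    """p-value under a 0.5*chi2(0) + 0.5*chi2(1) mixture (one boundary restriction).

    Uses P(chi2_1 > x) = erfc(sqrt(x / 2)) for x > 0.
    """
    if lr <= 0.0:
        return 1.0
    return 0.5 * math.erfc(math.sqrt(lr / 2.0))

# The 5% critical value quoted on the slide
p_at_critical = mixed_chi2_pvalue(2.71)

# The LR statistic from the translog example
p_example = mixed_chi2_pvalue(28.874)
reject_h0 = p_example < 0.05
```

The p-value at 2.71 is approximately 0.05, which is consistent with the Kodde and Palm critical value for one restriction; the example statistic of 28.874 is far beyond it.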
Distributional assumptions
the truncated normal distribution
$N(\mu, \sigma^2)$ truncated at zero
More general patterns
Can test the hypothesis that $\mu = 0$ using a t-test
or LR test
The restriction $\mu = 0$ produces the half-normal distribution: $|N(0, \sigma^2)|$
Scale efficiency
For a Translog Production Function (Ray,
1998)
An output-orientated scale efficiency measure
is:
$SE_i = \exp\left[ \frac{(1 - \varepsilon_i)^2}{2\beta} \right]$
where $\varepsilon_i$ is the scale elasticity of the i-th firm
and
$\beta = \sum_{j=1}^{K} \sum_{k=1}^{K} \beta_{jk}$
If the frontier is concave in inputs then $\beta < 0$.
Then SE is in the range 0 to 1.
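The scale efficiency measure can be sketched as a small Python function. The second-order coefficients and scale elasticities below are hypothetical and serve only to show how the measure behaves; the function name `scale_efficiency` is not from any package.

```python
import math

def scale_efficiency(scale_elasticity, beta2):
    """SE_i = exp[(1 - eps_i)^2 / (2 * beta)], beta = sum of all second-order terms.

    beta2 is the full K x K matrix of translog second-order coefficients.
    """
    beta_sum = sum(sum(row) for row in beta2)
    if beta_sum >= 0.0:
        raise ValueError("measure requires beta < 0 (frontier concave in inputs)")
    return math.exp((1.0 - scale_elasticity) ** 2 / (2.0 * beta_sum))

# Hypothetical second-order coefficients whose sum is negative
beta2 = [[-0.10, 0.02], [0.02, -0.08]]

se_optimal = scale_efficiency(1.0, beta2)  # scale elasticity of exactly 1
se_small = scale_efficiency(1.2, beta2)    # increasing returns to scale
se_large = scale_efficiency(0.8, beta2)    # decreasing returns to scale
```

A firm with a scale elasticity of 1 is fully scale efficient, and the measure falls symmetrically as the elasticity moves away from 1 in either direction.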
Stochastic Frontier Models: Some

Comments
We note the following points with respect to SFA
models:
It is important to check the regularity conditions
associated with the estimated functions (local and
global properties).
This may require the use of a Bayesian approach to impose
the inequality restrictions required for convexity and
concavity conditions.
We need to estimate distance functions directly in the
case of multi-output and multi-input production
functions.
It is possible to estimate scale efficiency in the case of
translog and Cobb-Douglas specifications.
Panel data models
Data on N firms over T time periods
Investigate technical efficiency change (TEC)
Investigate technical change (TC)
More data = better quality estimates
Less chance of a one-off event (e.g. climatic) influencing
results
Can use standard panel data models
no need to make distributional assumption
but must assume TE fixed over time
The model: i = 1, 2, ..., N (cross-section of firms); t = 1, 2, ..., T
(time points)
$\ln y_{it} = x_{it} \beta + v_{it} - u_{it}$; $v_{it} \sim N(0, \sigma_v^2)$; $u_{it} \sim N^{+}(0, \sigma_u^2)$
Panel data models

Some special cases:
1. Firm-specific effects are time invariant: $u_{it} = u_i$.
2. Time-varying effects: Kumbhakar (1990)
$u_{it} = [1 + \exp(bt + ct^2)]^{-1} u_i$
3. Time-varying effects with convergence: Battese and Coelli
(1992)
$u_{it} = \exp[-\eta (t - T)]\, u_i$
The sign of $\eta$ is important. As t goes to T, $u_{it}$ goes to $u_i$.
In the FRONTIER program, this is under the Error Components
Model.
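The two time-varying specifications can be compared by computing the factor that multiplies the firm effect u_i in each period. The sketch below is illustrative only: the function names are hypothetical, and the parameter values and number of periods are chosen for illustration.

```python
import math

def k90_factor(t, b, c):
    """Kumbhakar (1990): u_it = [1 + exp(b*t + c*t^2)]^(-1) * u_i."""
    return 1.0 / (1.0 + math.exp(b * t + c * t * t))

def bc92_factor(t, T, eta):
    """Battese and Coelli (1992): u_it = exp(-eta * (t - T)) * u_i."""
    return math.exp(-eta * (t - T))

T = 15                      # number of time periods (illustrative)
periods = range(1, T + 1)

# eta > 0: inefficiency declines over time towards u_i in the final period
bc92_positive = [bc92_factor(t, T, 0.1) for t in periods]

# eta < 0: inefficiency grows over time towards u_i in the final period
bc92_negative = [bc92_factor(t, T, -0.01) for t in periods]

# Kumbhakar's form keeps the factor between 0 and 1 in every period
k90_profile = [k90_factor(t, 0.5, -0.04) for t in periods]
```

With a positive eta the BC92 factor starts above 1 and falls to exactly 1 at t = T, so firms become more efficient over time; a negative eta gives the opposite pattern.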
Time profiles of efficiencies
[Figure: time profiles of efficiencies over time periods under the Kumbhakar (1990) specification (K90) and the Battese and Coelli (1992) specification (BC92), each drawn for two sets of parameter values]
Note: These are all smooth functions of trends of technical efficiency
over time. These trends are also independent of any other data on the
firms. There is scope for further work in this area.
Accounting for Production Environment

Technical efficiency is influenced by exogenous factors that
characterise the environment in which production takes place:
government regulation, ownership, education level of the farmer, etc.
Non-stochastic Environmental Variables
In this case the predicted firm-level technical efficiency levels will vary
with traditional inputs and environmental variables.
$\ln q_i = x_i \beta + z_i \gamma + v_i - u_i$
Inefficiency effects model (Battese and Coelli, 1995)
$\ln y_{it} = x_{it} \beta + v_{it} - u_{it}$; $u_{it} \sim N^{+}(z_{it} \delta, \sigma_u^2)$
where $\delta$ is a vector of parameters to be estimated. In the FRONTIER
program, this is the TE EFFECTS model.
Current research
There is scope to conduct a lot of research in this area.
Some areas where work is being done, and remains to be done,
are:
Modelling movements in inefficiency over time,
incorporating exogenous factors driving inefficiency
effects
Possibility of covariance between the random disturbance
and the distribution of the inefficiency term
Endogeneity due to possible effects of inefficiency and
technical change on the choice of input variables
Modelling risk into efficiency estimation and
interpretation
Current research
We have seen how technical efficiency can be computed,
but it is difficult to compute standard errors.
Peter Schmidt and his colleagues have been working on a
number of related topics here.
Bootstrap estimators and confidence intervals for
efficiency levels in SF models with panel data
Testing whether technical inefficiency depends on firm
characteristics
On the distribution of inefficiency effects under
different assumptions
Bayesian estimation of stochastic frontier models
Posterior distribution of technical efficiencies
Estimation of distance functions
Application to Residential Aged Care

Facilities
Residential aged care is a multi-billion dollar industry in
Australia.
Considered even more important in view of an ageing
population.
Aged care facilities are funded by the Commonwealth
government and are run by local government,
community/religious and private organisations.
Efficiency of residential aged care facilities is considered
quite important in view of reducing costs.
CEPA conducted a study for the Commonwealth
Department of Health and Aged Care.
Data were collected by the Department from the
residential aged care facilities.

Application to Residential Aged Care Facilities
Data:
912 Aged Care Facilities
30% response rate: possible response bias
Data validation
Actual observations used: 787
Methods:
DEA and SFA Methods
Peeled DEA

Application to Residential Aged Care Facilities
Preferred model:
Output variables:
High care weighted bed days
Low care weighted bed days
Input variables:
Floor area in square metres
Labour costs
Other costs
General findings
Average technical efficiency was calculated to be 0.83, so 17%
cost savings could be achieved if all ACFs operated on the
frontier.
Results were consistent between SFA and DEA models.
Found variation across ACFs in different states (lowest 0.79
in Victoria; 0.87 in NSW/ACT).
Privately run ACFs have a higher mean TE of 0.89.
Average scale efficiency of 0.93 was found.
An average size of 30 to 60 beds was found to be near optimal
scale.
A second-stage Tobit model was run to see the factors driving
inefficiency.
Potential cost savings were calculated to be $316 m for the
firms in the sample, implying greater savings for the
industry as a whole.
Economies of scope were examined using a newly developed
methodology, but no significant economies were found.