You are on page 1of 19

Journal of Economic Dynamics and Control 18 (1994) 231-249.

North-Holland

Risk, uncertainty, and complexity


Alfred L. Norman
The University of Texas at Austin, Austin, TX 78712, USA

David W. Shimer
California Institute of Technology, Pasadena, CA 91125, USA

Received January 1992, final version received August 1992

Complexity theory provides formal procedures for analyzing problem difficulty. Frank H. Knight in
Risk, Uncertainty and Profit assumed intelligence is finite and stressed the difficulty of solving
problems involving uncertainty. In this paper, a risk decision is a stochastic optimization problem
where the parameters and the functional forms required to determine the optimal decision are
known. An uncertain decision is a stochastic optimization problem where at least one parameter or
functional form must be estimated. Using complexity theory, a valid distinction can be made
between risk and uncertainty which is consistent with Bayesian statistics. From the perspective of
bounded rationality Knight's concepts of consolidation and specilization can be reconciled with the
Bayesians.

1. Introduction
C o m p l e x i t y t h e o r y p r o v i d e s a f o r m a l m e t h o d o l o g y for a n a l y z i n g the
difficulty of solving p r o b l e m s . The p u r p o s e of using c o m p l e x i t y t h e o r y to
i n t e r p r e t F r a n k H. K n i g h t ' s Risk, Uncertainty and Profit (1921/1971) is to o b t a i n
a valid d i s t i n c t i o n between risk a n d u n c e r t a i n t y which is consistent with
B a y e s i a n statistics. Thus, e c o n o m i c theories can be f o r m u l a t e d using c o m p l e x i t y
theory.
K n i g h t c o n s i d e r e d three types of p r o b a b i l i t y : risk, statistical, a n d uncertainty.
T h e risk case, which K n i g h t defined as 'objective' probabilities, dealt with
p r o b a b i l i t i e s with k n o w n p a r a m e t e r s a n d functional forms. F o r all decisions
involving risk, the b u s i n e s s m a n can eliminate his risk t h r o u g h insurance. In
contrast, the u n c e r t a i n t y case, which K n i g h t defined as 'subjective' p r o b a b i l i t y ,
is c h a r a c t e r i z e d by u n k n o w n p a r a m e t e r s o r functional forms which the business-
m a n m u s t e s t i m a t e w i t h o u t a n y p a s t reference u p o n which to base such an

Correspondence to: Alfred L. Norman, Department of Economics, The University of Texas at Austin,
Austin, TX 78712-1173, USA.

0165-1889/94/$06.00 1994--Elsevier Science Publishers B.V. All rights reserved


232 A.L. Norman and D.W. Shimer, Risk, uncertainty, and complexity

estimate. Statistical probability covered the intermediate case where the busi-
nessman had some prior observations upon which to form an estimate.
The development of statistics based on Savage's (1954) axioms has raised
serious questions concerning Knight's three categories. For example, Bayesian
statistics has made Knight's separation of risk and uncertainty into disjoint
decision problems appear to be of dubious value, as there is no distinction
between 'objective' and 'subjective' probabilities in Bayesian statistics and the
completely unknown case can be handled by the construction of a diffuse prior.
Bayesians insist there is no valid distinction between risk and uncertainty [for
example, see Cyert and DeGroot (1987)]. In reply to such criticisms, Bewley
(1986) provided an interpretation of the difference between risk and uncertainty
based on modifying Savage's axioms.
In this paper the authors propose an alternative interpretation of Knight.
Uncertainty for Knight creates the greatest of logical difficulties both in forming
the estimates and in making decisions based on such estimates. This paper
focuses on the second of the two difficulties. A risk decision is defined as
a stochastic optimization problem where the parameters and the functional
forms required to determine the optimal decision are known. And an uncertain
decision is defined as a stochastic optimization problem where at least one
parameter or functional form must be estimated. The difficulties in making risk
and uncertain decisions will be defined in terms of computational complexity. In
this alternative interpretation, the relevant question becomes: 'Is the computa-
tional complexity of an uncertain decision greater than or equal to the computa-
tional complexity of the corresponding risk decision?'
To address this question, a simple monopoly model is developed in section 2.
The use of a monopoly model permits the separation of the computational
complexity issues of production decisions from the computational complexity
issues of business strategy in the game theory models of oligopoly. In the model
the parameter of the production function is known in the case of risk and
unknown in the case of uncertainty. For the case of known and unknown
parameters, Knight's three types of probability will be defined in a manner
consistent with Bayesian statistics. This model will be formulated to demon-
strate that uncertain decisions are difficult to solve even in the simplest case of
uncertainty, a single unknown parameter which can be estimated using a conju-
gate Bayesian distribution.
To consider the computational complexity of the two optimization problems
requires a model of computational complexity. The Traub, Wasilkowski, and
Wo~niakowski (1988) real number computational model based on the concept
of information-based complexity is presented in section 3. This model provides
a framework to study both exact and approximate solutions of problems. The
difficulty in solving the risk and uncertainty monopoly problems is considered
in section 4. The monopoly problem with risk can be solved exactly in three
operations for a time horizon of arbitrary length whereas the same problem with
A.L. Norman and D.W. Shimer, Risk, uncertainty, and complexity 233

uncertainty cannot be solved exactly in a finite number of operations for any


time horizon greater than one. The result is immediately generalizable to linear
quadratic control problems.
A nonlinear case is considered in section 5. With few exceptions a meaningful
comparison of risk and uncertainty in a nonlinear model must be made between
approximate solutions with positive errors. The concluding remarks, section 6,
provide a reconciliation of the views of Knight with those of the Bayesians based
on the concept of bounded rationality.

2. The model
In this section we shall construct a model to demonstrate that uncertain
decisions are more difficult to solve than risk decisions. As was previously
defined an uncertain decision is a stochastic optimization problem where at least
one parameter or functional form must be estimated. We wish to show that even
in the simplest case of uncertainty, a single unknown parameter, difficulties arise
in solving the uncertain decision. This is true even in the simplest case of
Bayesian estimation, a conjugate distribution.
In order to construct such a model a Bayesian interpretation of Knight's three
probability types is required. A good starting point for this task is a review of
Knight's original definitions:

(1) A priori probability. 'Absolutely homogeneous classification of instances


completely identical except for really indeterminate factors. This judgment
of probability is on the same logical plane as the propositions of mathemat-
ics (which also may be viewed, and are viewed by the writer, as 'ultimately'
inductions from experience).'

(2) Statistical probability. 'Empirical evaluation of the frequency of associ-


ation between predicates, not analyzable into varying combinations of
equally probable alternatives. It must be emphasized that any high degree of
confidence that the proportions found in the past will hold in the future is
still based on an a priori judgment of indeterminateness. Two complications
are to be kept separate: first, the impossibility of eliminating all factors not
really indeterminate, and, second, the impossibility of enumerating the
equally probable alternatives involved and determining their mode of com-
bination so as to evaluate the probability by an a priori calculation. The
main distinguishing characteristic of this type is that it rests on an empirical
classification of instances.'

(3) Estimates. 'The distinction here is that there is no valid basis of any kind for
classifying instances. This form of probability is involved in the greatest
234 A.L. Norman and D. W. Shimer, Risk, uncertainty, and complexity

logical difficulties of all, and no very satisfactory discussion of it can be given,


but its distinction from the other types must be emphasized and some of its
complicated relations indicated.'

Knight's three definitions are broadly defined. To construct a simple model


for complexity analysis a Bayesian interpretation of a linear equation with an
unknown parameter will be considered. The more general case of an unknown
functional form is not required to interpret Knight. One source of uncertainty in
business emphasized by Knight is knowledge of the production process. For
simplicity, consider an agent with the following linear production process:

qt=flx~+et, t= 1,2,...,T, (1)

where qt is the tth observation of net output, xt is the tth level of the production
process, fl is the unknown scalar parameter, and et is the tth unobserved
disturbance term which represents such factors as rejects or exogenous influen-
ces such as the weather. The et are i.i.d, normal with mean zero and known
variance one. The use of a normal disturbance, which implies instances of
negative net output, is needed in order to obtain a conjugate distribution. We
wish to show that our results hold even in the simplest case of Bayesian statistics
that of a conjugate distribution. Instances of negative net output can occur with
a crop failure in agriculture or plant failure in other processes where the output
in one period is a feedstock in the next period's production.
Given a normal prior on fl at time t = l, the prior information on fl at time t is
a normal distribution N(mt, ht), where mt is the mean and ht is the precision. The
mean and precision have the following difference relationships:

ht=ht_l +x2-1, (2)

mt = ( m t - l h t - 1 + q t - l x t - 1 ) / h t . (3)

The Bayesian interpretation of the three cases is:

(1) A priori probability. The agent knows fl precisely.

(2) Statistical probability. The agent's prior information on fl is N(ml,hl),


where m~ and hi are defined by eqs. (2) and (3) assuming the agent has
K observations on eq. (1) and initially had a subjective locally uniform prior
as defined in case 3 below.

(3) Estimates. The agent's prior information on fl is a locally uniform prior


represented by N(ml, hi), where hi has a very small positive value.
A.L. Norman and D. I4/. Shimer, Risk, uncertainty, and complexity 235

In case 1 the agent has either been given precise knowledge of fl or has
observed eq. (1) a countable number of times such that his prior on fl has
asymptotically converged, using eqs. (2) and (3) above, to N(fl, ~ ). This inter-
pretation is consistent with Knight as he gives rolling a perfect die as an example
of a priori probability in Chapter VI and mentions 'or from statistics of past
experience' in Chapter VII. In Chapter VIII, Knight calls case 1 'objective
probability' or 'risk'. The case 2 definition is a simplification of Knight's
definition of case 2.
The use of a subjective prior for case 3 is consistent with Knight's alternative
definition of case 3 as 'subjective probability' or 'uncertainty' given in Chapter
VIII. While the subjective prior could have been modeled as complete ignorance
using an improper uniform prior, with prior belief p(fl) oc constant, the authors
choose to define this case as an agent who 'knows little' in order to obtain
a well-defined decision problem. For a discussion of alternative representations
of ignorance see Zellner (1971).
In defining case 3 Knight uses the enigmatic phrase 'there is no valid basis o f
any kind f o r classifying instances'. Bewley (1986) has proposed an interpretation
of Knight based on modifying Savage's (1954) axioms. In this paper the authors
prefer to interpret Knight in a manner consistent with standard Bayesian
methodology. The Bayesian interpretation of the three types represents a con-
tinuum from little information to asymptotically converged estimates and pro-
vides a,a integrated framework for objective and subjective probability.
In dealing with these three probability situations Knight asserts two types of
difficulties created by uncertainty. The first difficulty is 'the formation of the
estimate'. In this paper the authors have deliberately chosen a conjugate distri-
bution to make the difficulty of forming an estimate as simple as possible. In the
case of uncertainty, the decision maker will have the difficulty of generating
a locally uniform prior in order to formulate a well-defined decision problem.
The second difficulty is 'the estimation of its value'. This difficulty can be
interpreted as being the difficulty of making decisions based on estimates of the
parameters. In his book, Knight gives as an example of the second type of
difficulty a businessman's decision whether or not to increase the capacity of his
firm.
In order to analyze the second difficulty, a decision model must be created.
Assume the decision maker is a monopolist who faces an inverse demand
function

p, = a - dq,, (4)

where a and d are known. Another source of uncertainty emphasized by Knight


and studied in the bandit problem literature is unknown parameters in the
demand function. But as a single unknown parameter is sufficient to provide
236 A.L. Norman and D. W. Shimer, Risk, uncertainty, and complexity

a valid interpretation of Knight, only an unknown fl in the production relation-


ship (1) will be considered. The monopolist's profit function is

it, = p t q , - c(q,) . (5)

As the complexity results are invariant as to whether the cost function is


defined as a zero, linear, or quadratic function, the cost function will be defined
as c ( q O = 0 to simplify the notation. The monopolist is interested in maximizing
his expected discounted profit over a finite time horizon:

J r = sup H r ( x r ) ,
xT
where
IIr(x r) = E v'-lp,(x~)q,(x,)lqt-l,x '-1 , (6)
t

and where z is the discount factor, q t - 1 is ( q l , q 2 . . . . ,qt-1), and x t - 1 is


(xl,x:, ... , x , _ ~ ) . q t - ~ and x t-~ represent the fact that the decision maker
anticipates complete information which is observed exactly without delay. The
choice of a finite-time horizon was chosen for two reasons:

(i) A finite-time horizon represents the length of a product cycle.


(ii) In the case of uncertainty, the problem will be shown to become 'hard'
within a time horizon of only two periods. Thus an infinite-time horizon,
which is useful to study whether the optimal strategy leads to convergence to
the true parameter values, is not needed to characterize the computational
complexity difference between risk and uncertainty.

Now consider (6) from the perspective of the Bayesian interpretation of


Knight's three probability classes. For case 1 the optimization problem for case
1, OPT1, is (6) subject to (4) and (1), and this problem is well-defined because fl is
a known parameter. The optimization problem for cases 2 and 3, OPT2-3, is (6)
subject to the augmented states variable equations, (4), (1), (2), and (3). For the
later optimization problem, fl in (1) is replaced with the normal prior on fl at
time t. The serious measure-theoretic problems raised in the later optimization
problem are discussed in the appendix.

3. Information-based complexity
If we assume the standard definition of rational agents, namely that the agent
can exactly solve an arbitrary optimization problem instantaneously without
cost, then there is n6 difference between risk and uncertainty. While Knight
A.L. Normanand D.W. Shimer, Risk, uncertainty, and complexity 237

assumed that the behavior of economic agents is rational in the sense of being
purposeful, he also assumed that human intelligence is finite and emphasized the
difficulties of solving uncertainty problems. In this paper we shall define finite
intelligence in terms of computational complexity. The first task in analyzing the
computational complexity of risk and uncertainty is to define the appropriate
model of computation.
Two major branches of computational complexity are information-based
complexity and combinatorial complexity. The former deals with the difficulty of
approximate solutions of problems where information is partial, noisy, and
costly. The latter deals with problems that can be solved exactly in a finite
number of computations and where information is complete, exact, and costless.
For an example of combinatorial complexity analysis in economic theory see
Norman (1987). As will become apparent later, uncertainty problems are gener-
ally best analyzed using information-based complexity, whereas many risk
problems can be analyzed using combinatorial complexity. In this paper we
need a computational model which can be used to study both information-based
and combinatorial complexity. For this purpose the appropriate model is the
information-based computational model of Traub, Wasilkowski, and
Wo~niakowski (1988). In this model of computation all arithmetic and combina-
torial operations are assumed to be done with infinite precision.
Let the problem sets for OPT1 and OPT2-3 be designated Fl,r and F23,r,
respectively. These problem sets are

FI,T = {f: f = 7~r('), (1), (4) for (a,d,/3)},

F23,T = {f: f = nT('), (1)--(4) for (a, d, fl, ml,hl)}, (7)

where a, d, fl, hx ~ ~+ and mx e ~. The solution operation St: F ~ ~ and with


solution elements S(f), is defined as

S t ( f ) = sup I I r ( x r ) . (8)
xT

Let U ( f ) be the computed approximation to S ( f ) with absolute error meas-


ured by I S ( f ) - U(f)l. We shall say that U ( f ) is an 6-approximation iff
I S ( f ) - U ( f ) l < 6.
To compute these 6-approximations we may need information about f. We
gather knowledge about f through the use of computations L: F --. H. Each
information operation L is called an oracle. For each problem element f ~ F we
compute a number of information operations, which will give us all the informa-
tion we will have to implement our algorithm. In considering 6-approximations
for OPT2-3, adaptive information in the following form will be considered:

N ( f ) = [Hr(x lr), Hr(x r), ..., IIr(xr~)], (9)


238 A.L. Norman and D. W. Shimer, Risk, uncertainty, and complexity

where the information is computed sequentially as determined by optimization


algorithm. We shall consider the implications of approximate information. The
computation of 6-approximations for OPT1 does not require information
operations. In numerical analysis an example of an information operation
would be the number of operations necessary to compute the value of function
at a point in an integration procedure based on function evaluations. In
economics information operations could be used to represent the cost of acquir-
ing data in the marketplace.

The information model of computation is defined through two basic points as


follows.

(1) Each information operation is performed with a constant cost c, where


c>0.

(2) Arithmetic operations and comparison of real numbers are performed


exactly with unit cost.

For e a c h f ~ F, we desire to compute a 6-approximation, U ( f ) , of the true


solution S ( f ) , where 6 = 0 corresponds to an exact solution. Through knowing
N ( f ) , the approximation U ( f ) is computed by a mapping, qS, which corre-
sponds to an algorithm, where U ( f ) = c~(N(f)), with

c~: N ( f ) - ~ ~ , (10)

and the goal is to compute ck(N(f)) with minimal cost. If no information is


required, O(N(f)) reduces to ~b(f). This very generalized conception of an
algorithm is called an idealized algorithm. Much complexity analysis is per-
formed restricting idealized algorithms to realizable algorithms which are based
on a particular computer model such as a Turing machine or based on computa-
tional considerations such as the class of algorithms that are linear functions of
the input.
The cost of computing tk(N(f)) will be denoted by qh (~b,N ( f ) ) , the cost of
computing the information by q~2(N, f ) . The total cost of computing the
approximation is

q)(U,f) = ~oz(N,f) + qh(ck, N ( f ) ) , (11)

where U stands for a pair consisting of information N and algorithm ~b.


For this paper we shall concern ourselves with only the worst-case setting of
complexity. For this particular problem the worst case and the average case are
equivalent because, as we shall see, the number of operations does not depend
A.L. Norman and D.W. Shimer, Risk, uncertainty, and complexity 239

on which problem element we choose. In the worst-case setting the error and
cost of approximation are defined over all problem elements as follows:

e(U) = sup IS(f) - U(f)[, (12)


f~F

q~(U) = sup ~o(U,f). (13)


f~F

An important concept developed by Traub, Wasilkowski, and Wo~niakowski


(1983, ch. 3) is the radius of approximate information, r(Np), which is roughly
the smallest ( for which there exists an element belonging to S ( f ) for all
f ~ F23, T that have the same approximate information a s f Since e(U) >_r(Np), it
is impossible to find an algorithm such that e(U) < r(No). See their Corollary
3.1.
We wish to characterize the two optimization problems on the basis of how
costs grow with increasing problem size, which in this case is the parameter T.
For this purpose we will consider the asymptotic behavior. Consider two
nonnegative functions Z -- Z(T) and C = C(T).

Definition 1. Z is of upper-order [lower-order] C, written O(C) [o(C)], if there


exist k, m > 0, such that Z(T) <_ [ >_] mC(T) for all T > k.

Definition 1 requires a slight modification to handle the rate of growth


measured in terms of 1/6 ~ ~ . This definition can now be employed to
characterize the computational complexity of the two optimization problems by
applying the definition of upper and lower order to the cost functions of ~, the
class of all algorithms which use information operator N.

Definition 2.OPTI(2-3) has 6-computational complexity C if there exists an


6-approximate algorithm U ( f ) ~ such that tp(T) is O(C), and for all 6-
approximate algorithms U ~ ~, tp(T) is o(C).

Like Definition 1, Definition 2 requires a slight modification to handle the


rate of growth measured in terms of 1/6. To say that OPT1 has 0-computational
complexity T O means that OPT1 can be computed exactly (6 = 0) in a fixed
number of computations independent of the length of the time horizon T.
Definition 1 divides algorithms into equivalence classes. For example, an algo-
rithm which can compute OPT1 in six operations is equivalent to another
algorithm which can compute OPT1 in eight operations. For algorithms whose
cost functions are polynomial in T, the equivalence classes are defined by the
highest power of T.
240 A.L. Norman and D. IV. Shimer, R&k, uncertainty, and complexity

4. Complexity analysis of risk versus uncertainty


In this section, we use the definitions of complexity to provide an example
that solving profit maximization problems with uncertainty is much more
difficult than solving the same problem with risk.
First consider the optimization problem under risk, OPT1, that is, (6) subject
to (4) and (1), where fl is a known parameter. Substituting (4) and (1) into (6) the
expression to be optimized can be analytically expressed as

T
177. = sup ~ rt-l(aflxt - dfl2x 2 - d). (14)
Xt t=l

Because (14) is a sum of quadratics in xt, the optimal xt can be exactly


determined as a function of the parameters of f ~ F without recourse to the
information operator as

x,* = a / 2 d B . (15)

We wish to establish the (6 = 0)-computational complexity of OPT1. Note


that in the case of risk the optimization problem is separable into T single-
period problems, the solution to each period is the same, and the number of
computations is the same for all f s F1, r. It is obvious that a small number of
efficient algorithms, which differ only in the order in which the indicated
arithmetic operations are performed, exist to solve (15). This leads to the
following result:

Theorem 1. The O-computational complexity of OPT1 & T .

Proof The solution (15) needs to be computed only once and requires two
multiplications and one division. Thus there exists a 0-approximate algorithm
whose Cost(T) = 3. Any 0-approximate algorithm will have a Cost(T) > 3. By
Definitions 1 and 2, the 0-computational complexity of OPT1 is T . |

As has been pointed out, the definitions of asymptotic complexity divide


problems into equivalence classes, with T O being the polynomial zero class.
While it would be a simple matter to characterize the complexity of OPT1
exactly at 3, such an effort was not considered necessary to illuminate the
difference between risk and uncertainty.
Now consider the optimization problem under uncertainty, OPT2-3, that is,
(6) subject to (1), (2), (3), and (4). To illustrate the computational difficulty with
OPT2-3 the simplest nontrivial example, a time horizon of only two periods,
T = 2, will be considered.
A.L. Norman and D.W. Shimer, Risk, uncertainty, and complexity 241

Because OP2-3 satisfies Thiemann's (1985) conditions for Bellman's optimal-


ity principle, the problem can be solved by dynamic programming. The optimal
solution in the second period is

am2
x * - 2d(m z + h 2 1 ) , (16)

and the value function in the terminal period J1 (ql) is

a2m 2
Jl(ql) = 4d(m2 + h2 l) d. (17)

Thus the value function in the first period is

J2(qo) = sup [ami x i - d(m 2 + h { l) xxZ _ d + ~E(J i (qi))]. (18)


Jl

Substituting (2) and (3) into Jl(ql) gives us the following:

a2((mihl + q x x i ) / h x ) 2
Ji(ql) -- 4 d ( [ ( m i h l + q i x l ) / ( h l + x2)] 2 + (hi + x2) - i ) - d, (19)

while the expectation of J ~ ( q l ) has the form

E[-QI(q')I - d, (20)
LQ2(ql)J

where Ql(ql) and Q2(ql) are quadratic forms in the normal variable ql. As
pointed out by Aoki (1967), this expectation cannot be carried out explicitly to
give an analytic closed expression. For a discussion of the integration issue see
Moses (1971).
Now consider the determination of x*. To simplify the discussion of the
determination of x* consider

//I(X1) = /'/I(X1,X~) = a m x x l -- b(m 2 + h ? l ) x 2 - d + 6 E ( J x ( q x ) ) .


(21)
The optimization problem to determine x* becomes

J2(qo) = sup//1 (xl). (22)


Xl
242 A.L. Norman and D.W. Shimer, Risk, uncertainty, and complexity

In this form the optimization problem is a search to find x* where the adaptive
information operator N ( f ) returns Fll(xi) for each xl selected in the search.
Since (20) cannot be analytically integrated, any finite integration scheme with
finite cost C will have a radius of approximate information, r(Np) > 0. This
implies:

Theorem 2. The O-complexity of OPT2-3 & transfinite.

Proof. Consider the two-period problem which is a subset of T > 2 problems.


Any finite 0-approximate algorithm can use no more than a finite number of
information operations, Hr(Xl), and each information operation must be com-
puted with a finite integration scheme for the cost of the information operation
to be finite. But as E(Jl(ql)) cannot be analytically integrated, any finite
integration scheme will involve an absolute error. Because the problem set is
continuous in the parameters a and d, the radius of approximate information
r(Np) > 0. No finite 0-approximate algorithm exists by Corollary 3.1 of Traub,
Wasilkowski, and Wo~niakowski (1983). II

From the perspective of complexity, as this example demonstrates, there is


a fundamental difference between risk and uncertainty. This difference exists
even for the simplest case of Bayesian statistics, a conjugate distribution. The
example shown is just one example of a large class of risk and uncertainty
problems. To illustrate another example consider a random coefficient produc-
tion model:

qt=~txt+et, t = 1,2 . . . . , T , (1 ')

where in this case [3t is an i.i.d, random coefficient. The case of risk for this
problem is where the first two moments of fit are known. In this case the optimal
decision is

x* = aE {jSt} /2dE {~2 } . (15')

It is obvious that the 0-computational complexity of this risk random coefficient


model is T . Note that for this problem the knowledge of the distribution of/~t
beyond the first two moments is extraneous. Now, to consider a simple case of
uncertainty, assume the mean of/~t is unknown. For this problem the value of
E{flt} must be estimated. Knowledge of the distribution of fit is now crucial in
how the estimation and control problem is formulated. If fit is assumed to be
normal, then an estimate of the mean of fl can be formulated as a conjugate
distribution. This problem is a slight variation of the problem discussed by Aoki
(1967, p. 111). In an argument similar to Theorem 2 the 0-computational
A.L. Norman and D. IV. Shimer, Risk, uncertainty, and complexity 243

complexity of this is transfinite. The point is one does not need to assume that
the distribution of fit is unknown to obtain transfinite 0-computational com-
plexity.
In the examples in this section we have specified uncertainty examples which
have a single unknown parameter for a specific purpose. We want to show that
even in the simplest case of conjugate Bayesian distributions uncertainty leads
to intractable profit maximization problems. But to what extent are these results
generalizable? The sample problems are a simple examples of linear quadratic
control problems. For variations of linear quadratic control problems see Chow
(1975). The 0-computational complexity of many linear quadratic control prob-
lems with known coefficients is known to be low-order polynomial [Norman
and Jung (1977)]. The 0-computational complexity of linear quadratic control
problems with random coefficients models with known first two moments is also
known to be low-order polynomial, whereas the computational complexity of
linear quadratic models with unknown coefficients is known to be transfinite
[Norman (1994)]. Thus the results are immediately generalizable to the case of
linear quadratic control models with the uncertain decision specified as one or
more unknown coefficients.

5. Nonlinear models
Let us consider a very simple nonlinear generalization of (1):

qt = flg(xt) + et, t = 1, 2 . . . . . T. (23)

For the purpose of discussion, we have assumed that the nonlinear production
function is linear in fl so that the updating relationships for fl in the case of
uncertainty become

h, = hi_ x + 9 ( x t - 1) 2 , (24)

mt= (mt-l ht-1 + qt- x O(xt-1))/ht (25)

The complexity of the risk case depends on the difficulty in solving this
nonlinear version of (14): As was the case with the linear model, because there
are no dynamics, solution of the T-time-period problem can be partitioned into
T single-period problems with the same solutions:

H t = sup Z t , (27)
Xt
244 A.L. Norman and D. W. Shimer, Risk, uncertainty, and complexity

where
Z, = r ' - l (aflg(x,) - d~2g(x,) 2 - d) . (28)

The difficulty in solving the single-period problem depends on the properties of


the function Zt. If (27) affords an analytical solution, then the (6 = 0)-computa-
tional complexity of the risk case is T o, as was the case with a linear model. For
the same reasons as in the linear case, the (6 = 0)-computational complexity of
the uncertainty case is transfinite.
However, the comparison of risk and uncertainty in the nonlinear case with
δ = 0 is, in general, not meaningful because the worst-case number of computations
is unbounded in both cases. To obtain a meaningful comparison between
risk and uncertainty, the comparison must be made with approximate solutions,
that is, with δ > 0. From the perspective of rational men, such approximate
solutions are called δ-rationality.
For example, consider the case where Newton's method converges globally to
a unique solution. Such would be the case if Z_t is concave and has
the property |Z''_t(x_t) − Z''_t(y_t)| ≤ γ|x_t − y_t| for any x_t, y_t [see Ortega and Rheinboldt
(1970)]. Given the assumptions on Z_t, the Newton algorithm will achieve
quadratic convergence; hence such an algorithm will achieve δ > 0 accuracy in
a finite number of steps. Because at least one computation is required, the
(δ > 0)-computational complexity of this nonlinear risk case is T^0. In contrast,
the computational complexity of the corresponding uncertainty case is bounded
from below by T^1 because the solution requires solving a dynamic programming
problem of T periods. The δ-computational complexity of dynamic programming
has been characterized only for special cases such as the discounted,
infinite time horizon [Chow and Tsitsiklis (1989)]. Because of the 'curse of
dimensionality', the computational complexity of the uncertainty case for this
simple nonlinear model may well be exponential in the time horizon.
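As a concrete illustration of why the risk case is cheap under these assumptions, here is a minimal Newton iteration driven to δ-accuracy; the concrete objective is our illustrative stand-in for (28), not the paper's specification:

```python
# Newton's method for the single-period problem sup_x Z(x): with Z
# concave and Z'' Lipschitz, convergence is quadratic, so a delta > 0
# accurate answer is reached in finitely many steps.  The concrete
# Z(x) = a*x - d*x**2 below is an illustrative stand-in for eq. (28).
def newton_max(Zp, Zpp, x0, delta=1e-10, max_iter=50):
    """Drive Z'(x) to zero using derivatives Zp, Zpp of a concave Z."""
    x = x0
    for _ in range(max_iter):
        step = Zp(x) / Zpp(x)   # Newton step on the first-order condition
        x -= step
        if abs(step) < delta:   # delta-rational stopping rule
            return x
    return x

a, d = 3.0, 1.0
x_star = newton_max(lambda x: a - 2 * d * x, lambda x: -2 * d, x0=0.0)
print(x_star)   # analytic maximizer is a/(2*d) = 1.5
```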
For nonlinear models with dynamics both the risk and the uncertainty cases
can require solving a two-point boundary value problem. Nevertheless, the
uncertainty case always has the added problem of updating the estimates which,
with the exception of conjugate distributions, adds to the problem solving
difficulty. This leads to a general Knightian risk-uncertainty decision conjecture:

The (δ > 0)-computational complexity of an uncertain decision is a member of an
equal or higher computational complexity equivalence class than the corresponding
risk decision. Also, the uncertain decision will always be absolutely more costly to
compute than the corresponding risk decision.

6. Concluding remarks
In Part Three of Risk, Uncertainty and Profit Knight details methods which
economic agents use to reduce uncertainty. It is important to note that Knight
uses the word uncertainty to cover the three probability cases previously
discussed. Case 1 is risk which is defined as measurable uncertainty, and case 3 is
uncertainty or, more precisely, unmeasurable or true uncertainty. Thus, the
word uncertainty has two meanings. When Knight discusses ways to reduce
uncertainty, he is using uncertainty in the broad sense to cover the three cases.
Knight defines six methods to reduce uncertainty: 'The two fundamental
methods of dealing with uncertainty, based respectively upon reduction by
grouping and upon selection of men to 'bear' it, are 'consolidation' and 'special-
ization', respectively'. The other methods for reducing uncertainty are 'control of
the future', 'increased power of prediction', 'diffusion', and 'the possibility of
directing industrial activity more or less along lines in which a minimal amount
of uncertainty is involved'. In this section we shall focus on presenting
interpretation of consolidation and specialization which reconciles Knight with
the Bayesians.
First, let us consider how to reconcile Knight with the Bayesians concerning
specialization. For Knight, problems for which only subjective probabilities are
available, case 3, are much more difficult to solve than those involving objective
probabilities, cases 1 and 2. Human ability to solve such problems varies and the
market selects those people with superior problem-solving ability to become
successful entrepreneurs. In the Bayesian optimization example presented in the
previous sections, the complexity is transfinite for both cases 2 and 3. Also,
rational behavior assumes economic agents exactly optimize, which precludes
differences in ability. This apparent conflict can be resolved by making the
comparison on the basis of bounded rationality.
From the perspective of modern terminology Knight was a proponent of
bounded rationality [Simon (1957)]; he was concerned with the problem of how
finite intelligence was able to deal with a world of infinite variety. Knight states:
'The ordinary decisions of life are made on the basis of 'estimates' of a crude and
superficial character. In general the future situation in relation to which we act
depends upon the behavior of an infinitely large number of objects, and is
influenced by so many factors that no real effort is made to take account of them
all, much less estimate and summate their separate significances.' In discussing
a manufacturer considering an investment decision, he states: 'He 'figures' more
or less on the proposition, taking account as well as possible of the various
factors more or less susceptible of measurement, but the final result is an
'estimate' of the probable outcome of any proposed course of action.'
Similarly, long before formal complexity analysis, Bayesians were aware
that Bayesian optimization problems are hard to solve. One bounded rational
approach which has now been studied for over two decades is how to incorpo-
rate simplifying assumptions to obtain a more computationally tractable prob-
lem. One approach is to separate the estimation problem from the optimization
problem. This approach is known as passive learning and in the case of the
example reduces the complexity from transfinite to linear; see Norman (1994).
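A minimal sketch of such a passive-learning strategy for the monopoly example, assuming a certainty-equivalence decision rule of the form x = a/(2dβ), which is our illustrative stand-in for the rule in (15):

```python
# Passive learning: estimation is separated from optimization.  Each
# period the firm updates its estimate of beta from past data only,
# then plugs the point estimate into a closed-form decision rule as if
# it were the true value.  Work per period is constant, so the
# T-period cost is linear in T.  The rule x = a/(2*d*m) and all
# numbers here are illustrative assumptions.
import random

def passive_learning(T, beta=2.0, a=3.0, d=1.0, seed=0):
    rng = random.Random(seed)
    m, h = 1.0, 1.0                        # prior mean and precision
    for _ in range(T):
        x = a / (2 * d * m)                # certainty-equivalence decision
        q = beta * x + rng.gauss(0, 0.1)   # observe noisy output
        m, h = (m * h + q * x) / (h + x**2), h + x**2   # Bayes update
    return m

m_hat = passive_learning(200)
print(m_hat)   # the estimate settles near the true beta = 2
```

Because the value of future learning is ignored, each period costs one rule evaluation plus one update, which is the transfinite-to-linear reduction described above.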
Another approach is to reduce the complexity of incorporating the value of
learning into the optimization problem. This approach is known as active
learning. Passive and active learning strategies which either are in or could
readily be adapted to a Bayesian formulation are discussed in MacRae (1972),
Chow (1975), and Norman (1976). The MacRae active learning strategy for the
example would result in a transfinite two-point boundary value problem for
which the complexity of each iteration is linear.
When little is known, case 3, the performance of alternative estimation and
control strategies based on simplifying assumptions is known to be problem-
specific [Norman (1976)]. Such problems present severe difficulties regarding
the development of an appropriate strategy for a new problem. When more is
known about the parameters, simple strategies based on separation of estimation
from control can be expected to have good performance. From the perspective
of bounded rationality, the division of Bayesian optimization problems is much
closer to the division proposed by Knight than is the case for exact optimization,
because the less that is known about the parameters the more difficult it is to
construct simple strategies with good performance. And as the ability of re-
searchers to propose estimation and control strategies and apply them varies,
specialization occurs both in research and economic activities.
We are now in a position to interpret in what sense Knight considers objective
uncertainty different from subjective uncertainty. The difference is not math-
ematical but lies in the consequences of making decisions under uncertainty.
Knight in Chapter VIII asserts that 'when an individual instance only is at issue,
there is no difference for conduct between a measurable risk and an unmeasur-
able uncertainty'. While Knight did not specify the Bayesian updating relation-
ship for conjugate distributions, he was intuitively aware that subjective and
objective probabilities can be combined. He states: 'We hold also that both the
objective and subjective types may be involved at the same time, though no
doubt most men do not carry their deliberations so far ...'. Thus Knight's
distinction is not a mathematical one. Rather, in business decisions when the
decision maker knows little (empirically the case when only subjective probabil-
ities are available) it is difficult to construct a simple estimation and control
strategy which has good performance. When more is known (empirically when
objective probabilities are also available), the task is easier.
Now let us use the bounded rational explanation of specialization to provide
an interpretation of consolidation, which is a method for reducing uncertainty
by grouping of similar instances. An important institutional example of consoli-
dation is insurance. In the paper we interpreted Knight's probability model as
a continuum from knowing very little to asymptotically converged estimates.
This model is consistent with Knight's discussion of insurance as an institutional
mechanism for reducing uncertainty. For Knight the ability of the insurance
industry to offer a specific type of policy depends on the 'measurement of
probability on the basis of a fairly accurate grouping into classes'. For Knight
life insurance has the most precise estimates, whereas sickness and accident
insurance has the least. Our Bayesian probability model is consistent with Knight if
we equate accurate classification with the formulation of a nonexperimental model
without specification error and also equate instances with observations.
Knight points out insurance is unavailable for many business decisions: 'The
typical uninsurable (because unmeasurable and this because unclassifiable)
business risk relates to the exercise of judgment in making of decisions by the
business man.' In such situations Knight states: 'The possibility of thus reducing
uncertainty by transforming it into a measurable risk through grouping consti-
tutes a strong incentive to extend the scale of operations of a business establish-
ment.' We shall argue that the incentive for such consolidation is improved
expected performance.
Consider the example monopoly model presented in the previous sections. To
discuss the benefits of Knightian consolidation assume there are ten such firms
each with the same unknown value of β and each with a monopoly in its local
market. Since each has a monopoly in its local market, the incentive to combine
is not to form a cartel. To understand the incentive from the perspective of
estimation and control theory assume a case 3 probability model, where each
firm starts with a locally uniform prior on the distribution of the unknown
parameter. Now, in the spirit of bounded rationality, assume each firm uses
a 'heuristic' certainty equivalence strategy. This means that each period each
firm uses the prior mean of β in (15) to compute an approximation of the optimal
decision. After each decision has been executed and observed each firm updates
its estimate of β using (2) and (3). From the perspective of ten firms operating
independently, each firm has only its own observations with which to update its
estimate of β, because the other firms' observations are proprietary. From the perspective of
a consolidated firm the corporate headquarters would collect the observations
each period from the ten plants and combine them using (2) and (3) to obtain
a better estimate. Now let us assume a weak sufficiency condition, that the
production of the ten plants is positive each period. As the precision of the
combined estimate is greater than the precision of the separate estimates,
expected performance would be better, because the combination would obtain
a better estimate of the true, but unknown control law for each plant.
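Under the assumptions above (ten plants, a common unknown β, conjugate normal updating), the precision gain from pooling can be seen directly; the data-generating numbers below are illustrative:

```python
# Knightian consolidation as information pooling: headquarters combines
# the ten plants' observations into one posterior, so the pooled
# precision exceeds any single plant's.  All numbers are illustrative.
import random

def posterior(obs, m0=0.0, h0=1.0):
    """Conjugate-normal posterior (m, h) from (x, q) pairs,
    in the spirit of the paper's updating equations (2) and (3)."""
    m, h = m0, h0
    for x, q in obs:
        m, h = (m * h + q * x) / (h + x**2), h + x**2
    return m, h

rng = random.Random(1)
beta = 2.0
plants = []
for _ in range(10):                        # ten local monopolies
    obs = []
    for _ in range(20):
        x = rng.uniform(1, 3)              # positive production each period
        obs.append((x, beta * x + rng.gauss(0, 0.2)))
    plants.append(obs)

separate = [posterior(p) for p in plants]            # each firm alone
pooled = posterior([o for p in plants for o in p])   # consolidated firm
print(pooled[1] > max(h for _, h in separate))       # pooled precision wins
```

Since every plant's observations contribute positive information x^2 to the shared precision, the pooled posterior dominates each separate one, which is the expected-performance incentive for consolidation.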
It should be noted that the benefits of this type of Knightian consolidation
could have been obtained by a research and development consortium without
the actual consolidation of the firms. This type of consolidation is explored more
fully in Norman (1993).

Appendix: Dynamic programming


There are numerous delicate issues involved in the establishment of Bellman's
optimality condition for dynamic programming with uncountable probability,
state and action spaces. For example, if f is a measurable function of two
variables, then sup_y f(x, y) is not required to be a measurable function of x. Also,
even if a supremum of this function is attained for each x, a measurable function
φ such that sup_y f(x, y) = f(x, φ(x)) is not required to exist.
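A standard counterexample behind the first difficulty (our illustration; the text does not spell one out) takes f to be the indicator of a Borel set whose projection is analytic but not Borel:

```latex
% Let B \subset [0,1]^2 be Borel with \operatorname{proj}_x B analytic
% but not Borel, and let f = \mathbf{1}_B.  Then
\sup_y f(x,y) \;=\; \mathbf{1}_{\operatorname{proj}_x B}(x),
% which fails to be Borel measurable in x.
```

so even an indicator function can yield a non-measurable value function, which motivates the analytic and universally measurable machinery discussed next.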
Blackwell (1965) was the first to tackle these measurability problems by
utilizing Borel state and action spaces and measurable utility functions and
policies. In a further generalization, Blackwell, Freedman, and Orkin (1974)
introduce analytic state and action spaces, a semianalytic utility, and analytically
measurable policies. Another approach is that of Shreve and Bertsekas
(1978), who utilize outer integration to dispense with measurability of functions
altogether. The measurability problems disappear in the last two cases, but at
the expense of the definitional problems associated with outer integration on the
one hand or of the imposition of the weak topology on the spaces involved on
the other.
In a recent monograph, Thiemann (1985) solves the above-mentioned measurability
problems without recourse to topological restrictions on the spaces
involved. Thiemann uses a generalized decision model framework with analytic
state and action spaces and with Souslin or universally measurable policies
and utility functions. In our model we use an augmented state space
S_t := {(p_t, q_t, m_t, h_t) ∈ R^4} with the coordinates of s_t ∈ S_t defined by (4), (1), (2),
and (3). Our action space is R, while our transition law is a conjugate normal
Bayes map defined through our augmented state space so that continuity is
obvious. Our spaces are analytic and our transition law and reward function are
universally measurable.

References
Aoki, Masanao, 1967, Optimization of stochastic systems (Academic Press, New York, NY).
Bewley, Truman F., 1986, Knightian decision theory: Part 1, Cowles Foundation discussion paper
no. 807 (Cowles Foundation, New Haven, CT).
Blackwell, David, 1965, Discounted dynamic programming, Annals of Mathematical Statistics 36,
226-235.
Blackwell, David, D. Freedman, and M. Orkin, 1974, The optimal reward operator in dynamic
programming, Annals of Probability 2, 926-941.
Chow, Chee-Seng and John N. Tsitsiklis, 1989, The complexity of dynamic programming, Journal of
Complexity 5, 466-488.
Chow, Gregory C., 1975, Analysis and control of dynamic economic systems (Wiley, New York,
NY).
Cyert, Richard M. and Morris H. DeGroot, 1987, Bayesian analysis and uncertainty in economic
theory (Rowman & Littlefield, Totowa).
Easley, David and Nicholas M. Kiefer, 1988, Controlling a stochastic process with unknown parameters,
Econometrica 56, 1045-1064.
Knight, Frank H., 1971 (original 1921), Risk, uncertainty and profit (University of Chicago Press,
Chicago, IL).
MacRae, E.C., 1972, Linear decision with experimentation, Annals of Economic and Social
Measurement 4, 437-447.
Moses, Joel, 1971, Symbolic integration: The stormy decade, Communications of the ACM 14,
548-560.
Norman, Alfred L., 1987, A theory of monetary exchange, Review of Economic Studies 54, 499-517.
Norman, Alfred L., 1993, Informational society: An economic theory of discovery, invention and
innovation (Kluwer Academic Press, Boston, MA).
Norman, Alfred L., 1994, On the complexity of linear quadratic control, European Journal of
Operations Research, forthcoming.
Norman, Alfred L. and Woo S. Jung, 1977, Linear quadratic control theory for models with long
lags, Econometrica 45, 905-917.
Ortega, J.M. and W.C. Rheinboldt, 1970, Iterative solution of nonlinear equations in several
variables (Academic Press, New York, NY).
Savage, Leonard, 1954, The foundations of statistics (Wiley, New York, NY).
Simon, H.A., 1957, Models of man (Wiley, New York, NY).
Shreve, Stephen E. and Dimitri P. Bertsekas, 1978, Alternative theoretical frameworks for finite
horizon discrete-time stochastic optimal control, SIAM Journal of Control and Optimization 16,
953-978.
Thiemann, J.G.F., 1985, Analytic spaces and dynamic programming: A measure-theoretic approach
(Centrum voor Wiskunde en Informatica, Amsterdam).
Traub, J.F., G.W. Wasilkowski, and H. Woźniakowski, 1983, Information, uncertainty, complexity
(Addison-Wesley, Reading, MA).
Traub, J.F., G.W. Wasilkowski, and H. Woźniakowski, 1988, Information-based complexity (Aca-
demic Press, Boston, MA).
Zellner, Arnold, 1971, An introduction to Bayesian inference in econometrics (Academic Press,
New York, NY).
